repeat polyglutamine sequences: Topics by Science.gov

Sample records for repeat polyglutamine sequences

In Vitro Expansion of CAG, CAA, and Mixed CAG/CAA Repeats.

PubMed

Figura, Grzegorz; Koscianska, Edyta; Krzyzosiak, Wlodzimierz J

2015-08-11

Polyglutamine diseases, including Huntington's disease and a number of spinocerebellar ataxias, are caused by expanded CAG repeats that are located in translated sequences of individual, functionally-unrelated genes. Only mutant proteins containing polyglutamine expansions have long been thought to be pathogenic, but recent evidence has implicated mutant transcripts containing long CAG repeats in pathogenic processes. The presence of two pathogenic factors prompted us to attempt to distinguish the effects triggered by mutant protein from those caused by mutant RNA in cellular models of polyglutamine diseases. We used the SLIP (Synthesis of Long Iterative Polynucleotide) method to generate plasmids expressing long CAG repeats (forming a hairpin structure), CAA-interrupted CAG repeats (forming multiple unstable hairpins) or pure CAA repeats (not forming any secondary structure). We successfully modified the original SLIP protocol to generate repeats of desired length starting from constructs containing short repeat tracts. We demonstrated that the SLIP method is a time- and cost-effective approach to manipulate the lengths of expanded repeat sequences.
The most prevalent genetic cause of ALS-FTD, C9orf72 synergizes the toxicity of ATXN2 intermediate polyglutamine repeats through the autophagy pathway

PubMed Central

Ciura, Sorana; Sellier, Chantal; Campanari, Maria-Letizia; Charlet-Berguerand, Nicolas; Kabashi, Edor

2016-01-01

The most common genetic cause for amyotrophic lateral sclerosis and frontotemporal dementia (ALS-FTD) is repeat expansion of a hexanucleotide sequence (GGGGCC) within the C9orf72 genomic sequence. To elucidate the functional role of C9orf72 in disease pathogenesis, we identified certain molecular interactors of this factor. We determined that C9orf72 exists in a complex with SMCR8 and WDR41 and that this complex acts as a GDP/GTP exchange factor for RAB8 and RAB39, 2 RAB GTPases involved in macroautophagy/autophagy. Consequently, C9orf72 depletion in neuronal cultures leads to accumulation of unresolved aggregates of SQSTM1/p62 and phosphorylated TARDBP/TDP-43. However, C9orf72 reduction does not lead to major neuronal toxicity, suggesting that a second stress may be required to induce neuronal cell death. An intermediate size of polyglutamine repeats within ATXN2 is an important genetic modifier of ALS-FTD. We found that coexpression of intermediate polyglutamine repeats (30Q) of ATXN2 combined with C9orf72 depletion increases the aggregation of ATXN2 and neuronal toxicity. These results were confirmed in zebrafish embryos where partial C9orf72 knockdown along with intermediate (but not normal) repeat expansions in ATXN2 causes locomotion deficits and abnormal axonal projections from spinal motor neurons. These results demonstrate that C9orf72 plays an important role in the autophagy pathway while genetically interacting with another major genetic risk factor, ATXN2, to contribute to ALS-FTD pathogenesis. PMID:27245636
Cathepsins L and Z Are Critical in Degrading Polyglutamine-containing Proteins within Lysosomes

PubMed Central

Bhutani, Nidhi; Piccirillo, Rosanna; Hourez, Raphael; Venkatraman, Prasanna; Goldberg, Alfred L.

2012-01-01

In neurodegenerative diseases caused by extended polyglutamine (polyQ) sequences in proteins, aggregation-prone polyQ proteins accumulate in intraneuronal inclusions. PolyQ proteins can be degraded by lysosomes or proteasomes. Proteasomes are unable to hydrolyze polyQ repeat sequences, and during breakdown of polyQ proteins, they release polyQ repeat fragments for degradation by other cellular enzymes. This study was undertaken to identify the responsible proteases. Lysosomal extracts (unlike cytosolic enzymes) were found to rapidly hydrolyze polyQ sequences in peptides, proteins, or insoluble aggregates. Using specific inhibitors against lysosomal proteases, enzyme-deficient extracts, and pure cathepsins, we identified cathepsins L and Z as the lysosomal cysteine proteases that digest polyQ proteins and peptides. RNAi for cathepsins L and Z in different cell lines and adult mouse muscles confirmed that they are critical in degrading polyQ proteins (expanded huntingtin exon 1) but not other types of aggregation-prone proteins (e.g. mutant SOD1). Therefore, the activities of these two lysosomal cysteine proteases are important in host defense against toxic accumulation of polyQ proteins. PMID:22451661
Evolution and function of CAG/polyglutamine repeats in protein–protein interaction networks

PubMed Central

Schaefer, Martin H.; Wanker, Erich E.; Andrade-Navarro, Miguel A.

2012-01-01

Expanded runs of consecutive trinucleotide CAG repeats encoding polyglutamine (polyQ) stretches are observed in the genes of a large number of patients with different genetic diseases such as Huntington's and several Ataxias. Protein aggregation, which is a key feature of most of these diseases, is thought to be triggered by these expanded polyQ sequences in disease-related proteins. However, polyQ tracts are a normal feature of many human proteins, suggesting that they have an important cellular function. To clarify the potential function of polyQ repeats in biological systems, we systematically analyzed available information stored in sequence and protein interaction databases. By integrating genomic, phylogenetic, protein interaction network and functional information, we obtained evidence that polyQ tracts in proteins stabilize protein interactions. This happens most likely through structural changes whereby the polyQ sequence extends a neighboring coiled-coil region to facilitate its interaction with a coiled-coil region in another protein. Alteration of this important biological function due to polyQ expansion results in gain of abnormal interactions, leading to pathological effects like protein aggregation. Our analyses suggest that research on polyQ proteins should shift focus from expanded polyQ proteins into the characterization of the influence of the wild-type polyQ on protein interactions. PMID:22287626
Glial response to polyglutamine-mediated stress

PubMed Central

Vig, Parminder J.S.; Shao, Qingmei; Lopez, Maripar E

2009-01-01

Neurodegenerative trinucleotide (CAG) repeat disorders are caused by the expansion of polyglutamine tracts within the disease proteins. Some of these proteins have an unknown function. How expanded polyglutamine causes target neurons to degenerate is not clear. Recent evidence suggests that intercellular miscommunication may contribute to polyglutamine pathogenesis in CAG repeat disorders. Polyglutamine-induced degeneration of the target neuron can be mediated via glia-neuron interactions. Here we hypothesize that, during the neurodegenerative process, the failure of cell-cell interactions has more severe consequences than alterations in intracellular neuron biology. We further believe that bidirectional communication between neurons and glia is a prerequisite for the normal development and function of either cell type. Understanding intercellular signaling mechanisms such as glial trophic factors and their receptors, cell adhesion or other well-defined signaling molecules provides opportunities for developing potential therapies. PMID:20046986
Evidence that a proposed repeated segment of glutamine residues is expressed in the Huntington disease protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jou, Y.S.; Myers, R.M.

1994-09-01

Huntington disease (HD) appears to be caused by a mutation that results in an expanded number of CAG repeats at the 5′ end of the gene. The nucleotide sequence of the gene and cDNA clones predicts a 347 kd protein that contains a stretch of polyglutamine, encoded by the CAG repeat, located 17 amino acids downstream from the proposed translation initiation site. Because understanding the mechanisms of the pathology of HD depends on whether the CAG repeat is expressed in the protein, we used antibodies directed against portions of the predicted HD gene product to probe the structure of the protein in tissue culture cells. Two peptides, one located amino-terminal to the proposed polyglutamine stretch (hd1 peptide FESLKSFQQ from amino acids 11-19) and one located in the carboxy-terminal half of the predicted protein (hd2 peptide QQPRNKPLK from amino acids 2531-2539), were used to elicit polyclonal antibodies in NZW rabbits. We affinity-purified the antibodies and used them to analyze the HD protein. Both antisera specifically recognize the peptides used to elicit them, as well as the appropriate portions of the HD protein expressed in E. coli. Western blot analysis showed that both antisera recognize a protein with an apparent molecular weight of approximately 350,000 in human, monkey, rat and mouse cell lines, including two neuronal cell lines. These results, in combination with immunoprecipitation experiments, suggest strongly that the proposed polyglutamine stretch is indeed translated in the HD protein and is evolutionarily conserved in various mammalian species.
Solution Model of the Intrinsically Disordered Polyglutamine Tract-Binding Protein-1

PubMed Central

Rees, Martin; Gorba, Christian; de Chiara, Cesira; Bui, Tam T.T.; Garcia-Maya, Mitla; Drake, Alex F.; Okazawa, Hitoshi; Pastore, Annalisa; Svergun, Dmitri; Chen, Yu Wai

2012-01-01

Polyglutamine tract-binding protein-1 (PQBP-1) is a 265-residue nuclear protein that is involved in transcriptional regulation. In addition to its role in the molecular pathology of the polyglutamine expansion diseases, mutations of the protein are associated with X-linked mental retardation. PQBP-1 binds specifically to glutamine repeat sequences and proline-rich regions, and interacts with RNA polymerase II and the spliceosomal protein U5-15kD. In this work, we obtained a biophysical characterization of this protein by employing complementary structural methods. PQBP-1 is shown to be a moderately compact but largely disordered molecule with an elongated shape, having a Stokes radius of 3.7 nm and a maximum molecular dimension of 13 nm. The protein is monomeric in solution, has residual β-structure, and is in a premolten globule state that is unaffected by natural osmolytes. Using small-angle x-ray scattering data, we were able to generate a low-resolution, three-dimensional model of PQBP-1. PMID:22500761
HIP1, a human homologue of S. cerevisiae Sla2p, interacts with membrane-associated huntingtin in the brain.

PubMed

Kalchman, M A; Koide, H B; McCutcheon, K; Graham, R K; Nichol, K; Nishiyama, K; Kazemi-Esfarjani, P; Lynn, F C; Wellington, C; Metzler, M; Goldberg, Y P; Kanazawa, I; Gietz, R D; Hayden, M R

1997-05-01

Huntington disease (HD) is associated with the expansion of a polyglutamine tract, greater than 35 repeats, in the HD gene product, huntingtin. Here we describe a novel huntingtin interacting protein, HIP1, which co-localizes with huntingtin and shares sequence homology and biochemical characteristics with Sla2p, a protein essential for function of the cytoskeleton in Saccharomyces cerevisiae. The huntingtin-HIP1 interaction is restricted to the brain and is inversely correlated to the polyglutamine length in huntingtin. This provides the first molecular link between huntingtin and the neuronal cytoskeleton and suggests that, in HD, loss of normal huntingtin-HIP1 interaction may contribute to a defect in membrane-cytoskeletal integrity in the brain.
Association of premenstrual/menstrual symptoms with perinatal depression and a polymorphic repeat in the polyglutamine tract of the retinoic acid induced 1 gene.

PubMed

Tan, Ene-Choo; Tan, Hui-San; Chua, Tze-Ern; Lee, Theresa; Ng, Jasmine; Ch'ng, Ying-Chia; Choo, Chih-Huei; Chen, Helen Y

2014-06-01

Depression during pregnancy or after childbirth is the most frequent perinatal illness affecting women. We investigated the length distribution of a trinucleotide repeat in RAI1, which has not been studied in perinatal depression or in the Chinese population. Cases (n=139) with a confirmed diagnosis of clinical (major) depression related to pregnancy/postpartum were recruited from the outpatient clinic. Controls were patients who came to the obstetrics clinics and scored <7 on the Edinburgh Postnatal Depression Scale (EPDS) (n=540). Saliva samples for DNA analysis, demographic information and self-reported frequency of occurrence of various premenstrual/menstrual symptoms were collected from all participants. Genomic DNA was extracted from saliva and the relevant region was sequenced to determine the number of CAG/CAA repeats that encode the polyglutamine tract at the N terminus of the protein. Differences between groups were assessed by chi-square analysis for categorical variables and analysis of variance for quantitative scores. Compared to control subjects, patients with perinatal depression reported more frequent mood changes, cramps, nausea, vomiting, diarrhoea, and headache during premenstrual/menstrual periods (p=0.000). For the RAI1 gene CAG/CAA repeat, there was a statistically significant difference in the genotypic distribution between cases and controls (p=0.031). There was also a statistically significant association between the 14-repeat allele and perinatal depression (p=0.016). Family history, previous mental illness, and physical and psychological symptoms during the premenstrual/menstrual periods were self-reported. EPDS screening was done only once for controls. The RAI1 gene polyglutamine repeat has a different distribution in our population. The 14-repeat allele is associated with perinatal depression and more frequent experience of physical and psychological symptoms during the menstrual period.
Analysis of polyglutamine-coding repeats in the TATA-binding protein in different human populations and in patients with schizophrenia and bipolar affective disorder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rubinsztein, D.C.; Leggo, J.; Crow, T.J.

A new class of disease (including Huntington disease, Kennedy disease, and spinocerebellar ataxias types 1 and 3) results from abnormal expansions of CAG trinucleotides in the coding regions of genes. In all of these diseases the CAG repeats are thought to be translated into polyglutamine tracts. There is accumulating evidence arguing for CAG trinucleotide expansions as one of the causative disease mutations in schizophrenia and bipolar affective disorder. We and others believe that the TATA-binding protein (TBP) is an important candidate to investigate in these diseases as it contains a highly polymorphic stretch of glutamine codons, which are close to the threshold length where the polyglutamine tracts start to be associated with disease. Thus, we examined the lengths of this polyglutamine repeat in normal unrelated East Anglians, South African Blacks, sub-Saharan Africans mainly from Nigeria, and Asian Indians. We also examined 43 bipolar affective disorder patients and 65 schizophrenic patients. The range of polyglutamine tract-lengths that we found in humans was from 26-42 codons. No patients with bipolar affective disorder and schizophrenia had abnormal expansions at this locus. 22 refs., 1 tab.
From Pathways to Targets: Understanding the Mechanisms behind Polyglutamine Disease

PubMed Central

Weber, Jonasz Jeremiasz; Sowa, Anna Sergeevna

2014-01-01

The history of polyglutamine diseases dates back approximately 20 years to the discovery of a polyglutamine repeat in the androgen receptor of SBMA followed by the identification of similar expansion mutations in Huntington's disease, SCA1, DRPLA, and the other spinocerebellar ataxias. This common molecular feature of polyglutamine diseases suggests shared mechanisms in disease pathology and neurodegeneration of disease specific brain regions. In this review, we discuss the main pathogenic pathways including proteolytic processing, nuclear shuttling and aggregation, mitochondrial dysfunction, and clearance of misfolded polyglutamine proteins and point out possible targets for treatment. PMID:25309920
The CAA repeat polymorphism in the ZFHX3 gene is associated with risk of coronary heart disease in a Chinese population.

PubMed

Sun, Shunchang; Zhang, Wenwu; Chen, Xi; Song, Huiwen

2015-04-01

Coronary heart disease (CHD) is a disease resulting from the interaction between genetic variations and environmental factors. Zinc finger homeobox 3 (ZFHX3) is a transcription factor and contains a poly-glutamine tract in a compositionally biased region that is encoded by exon 9, containing a cluster of CAG and CAA triplets followed by the polymorphic CAA repeats: (CAG)2(CAA)2(CAG)3CAACAG(CAA)nGCA. Thus, nine successive glutamine residues precede the poly-glutamine tract, encoded by the polymorphic CAA repeats. The aim of this study was to investigate the association of the CAA repeat polymorphism in exon 9 of the ZFHX3 gene with the risk of CHD in a Chinese population. The CAA repeat polymorphism was determined by polymerase chain reaction followed by DNA sequencing in 321 CHD patients. Genotype frequencies were compared using the non-parametric Mood's median test. Four alleles of CAG(CAA)10GCA, CAG(CAA)8GCA, CAG(CAA)9GCA, and CAG(CAA)11GCA were found in Chinese CHD patients in exon 9 of the ZFHX3 gene. The CAG(CAA)10GCA was a major allele (95.95%), and the CAG(CAA)8GCA was a minor allele (3.58%). The CAG(CAA)9GCA and CAG(CAA)11GCA were rare alleles (0.31% and 0.16%). The CAG(CAA)10GCA allele encodes a poly-glutamine tract of 19 residues. Importantly, the CHD patients homozygous for the CAG(CAA)10GCA allele had a higher risk of CHD, compared to the heterozygous patients carrying a CAG(CAA)8GCA allele. Moreover, the CAG(CAA)10GCA allele was significantly associated with hypertension, diabetes mellitus, or dyslipidemia (P < 0.05). Thus, the CAA repeat polymorphism in exon 9 of the ZFHX3 gene contributes to the CHD susceptibility in the Chinese population.
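
The residue counts in the abstract above (nine glutamines before the polymorphic tract, 19 in total for the (CAA)10 allele) can be checked with a short calculation. The following is only an illustrative Python sketch, not the authors' method: the helper names `expand_repeat_notation`, `zfhx3_allele` and `count_leading_glutamines` are hypothetical, and it assumes the exon 9 motif is read in frame from its first CAG.

```python
import re

# CAG and CAA both encode glutamine (Q) in the standard genetic code.
GLN_CODONS = {"CAG", "CAA"}

# Motif as written in the abstract, with the polymorphic CAA count left as n.
ZFHX3_TEMPLATE = "(CAG)2(CAA)2(CAG)3CAACAG(CAA){n}GCA"


def expand_repeat_notation(notation: str) -> str:
    """Expand compact notation such as '(CAG)2GCA' into a plain DNA string."""
    return re.sub(
        r"\(([ACGT]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        notation,
    )


def zfhx3_allele(n: int) -> str:
    """Hypothetical helper: DNA string of the motif with n polymorphic CAA repeats."""
    return expand_repeat_notation(ZFHX3_TEMPLATE.format(n=n))


def count_leading_glutamines(dna: str) -> int:
    """Count consecutive in-frame glutamine codons from the start of the sequence."""
    count = 0
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] not in GLN_CODONS:
            break
        count += 1
    return count


if __name__ == "__main__":
    for n in (8, 9, 10, 11):
        print(n, count_leading_glutamines(zfhx3_allele(n)))
    # n = 10 gives 19: 9 fixed glutamine codons plus 10 polymorphic CAA codons.
```

The trailing GCA codon (alanine) ends the run, so the count is 9 + n for every allele listed in the abstract.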
Perturbation of the Akt/Gsk3-β signalling pathway is common to Drosophila expressing expanded untranslated CAG, CUG and AUUCU repeat RNAs.

PubMed

van Eyk, Clare L; O'Keefe, Louise V; Lawlor, Kynan T; Samaraweera, Saumya E; McLeod, Catherine J; Price, Gareth R; Venter, Deon J; Richards, Robert I

2011-07-15

Recent evidence supports a role for RNA as a common pathogenic agent in both the 'polyglutamine' and 'untranslated' dominant expanded repeat disorders. One feature of all repeat sequences currently associated with disease is their predicted ability to form a hairpin secondary structure at the RNA level. In order to investigate mechanisms by which hairpin-forming repeat RNAs could induce neurodegeneration, we have looked for alterations in gene transcript levels as hallmarks of the cellular response to toxic hairpin repeat RNAs. Three disease-associated repeat sequences--CAG, CUG and AUUCU--were specifically expressed in the neurons of Drosophila and resultant common transcriptional changes assessed by microarray analyses. Transcripts that encode several components of the Akt/Gsk3-β signalling pathway were altered as a consequence of expression of these repeat RNAs, indicating that this pathway is a component of the neuronal response to these pathogenic RNAs and may represent an important common therapeutic target in this class of diseases.
Expanded ATXN3 frameshifting events are toxic in Drosophila and mammalian neuron models.

PubMed

Stochmanski, Shawn J; Therrien, Martine; Laganière, Janet; Rochefort, Daniel; Laurent, Sandra; Karemera, Liliane; Gaudet, Rebecca; Vyboh, Kishanda; Van Meyel, Don J; Di Cristo, Graziella; Dion, Patrick A; Gaspar, Claudia; Rouleau, Guy A

2012-05-15

Spinocerebellar ataxia type 3 is caused by the expansion of the coding CAG repeat in the ATXN3 gene. Interestingly, a -1 bp frameshift occurring within an (exp)CAG repeat would henceforth lead to translation from a GCA frame, generating polyalanine stretches instead of polyglutamine. Our results show that transgenic expression of (exp)CAG ATXN3 led to -1 frameshifting events, which have deleterious effects in Drosophila and mammalian neurons. Conversely, transgenic expression of polyglutamine-encoding (exp)CAA ATXN3 was not toxic. Furthermore, (exp)CAG ATXN3 mRNA does not contribute per se to the toxicity observed in our models. Our observations indicate that expanded polyglutamine tracts in Drosophila and mouse neurons are insufficient for the development of a phenotype. Hence, we propose that -1 ribosomal frameshifting contributes to the toxicity associated with (exp)CAG repeats.
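
The frame argument in the abstract above can be illustrated by translating a repeat tract in each reading frame. This is a minimal Python sketch under stated assumptions: the function `translate` and the six-repeat test tracts are hypothetical, the codon table holds only the six codons that occur in CAG and CAA repeats (standard genetic code), and a -1 frameshift is modelled simply as reading from offset 2 of the repeat. The polyalanine outcome for CAG comes from the abstract; the threonine outcome for CAA follows from the standard codon table and is not stated in the source.

```python
# Subset of the standard genetic code covering every codon that can occur
# in any reading frame of a pure CAG or pure CAA repeat.
CODON_TABLE = {
    "CAG": "Q",  # glutamine
    "CAA": "Q",  # glutamine
    "GCA": "A",  # alanine
    "AGC": "S",  # serine
    "AAC": "N",  # asparagine
    "ACA": "T",  # threonine
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons of dna starting at the given offset (0, 1 or 2)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


if __name__ == "__main__":
    cag_tract = "CAG" * 6  # hypothetical short tract, for illustration only
    caa_tract = "CAA" * 6

    print(translate(cag_tract, 0))  # QQQQQQ  in-frame polyglutamine
    print(translate(cag_tract, 2))  # AAAAA   -1 frame (GCA): polyalanine
    print(translate(caa_tract, 0))  # QQQQQQ  in-frame polyglutamine
    print(translate(caa_tract, 2))  # TTTTT   -1 frame (ACA): no polyalanine
```

Both tracts encode the same polyglutamine in frame, but only the CAG tract can yield polyalanine after a -1 shift. This is why the (exp)CAA construct can serve as a polyglutamine-only comparison.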
Instability of expanded CAG/CAA repeats in spinocerebellar ataxia type 17.

PubMed

Gao, Rui; Matsuura, Tohru; Coolbaugh, Mary; Zühlke, Christine; Nakamura, Koichiro; Rasmussen, Astrid; Siciliano, Michael J; Ashizawa, Tetsuo; Lin, Xi

2008-02-01

Trinucleotide repeat expansions are dynamic mutations causing many neurological disorders, and their instability is influenced by multiple factors. Repeat configuration seems particularly important, and pure repeats are thought to be more unstable than interrupted repeats. But direct evidence is still lacking. Here, we present strong support for this hypothesis from our studies on spinocerebellar ataxia type 17 (SCA17). SCA17 is a typical polyglutamine disease caused by CAG repeat expansion in TBP (TATA binding protein), and is unique in that the pure expanded polyglutamine tract is coded by either a simple configuration with long stretches of pure CAGs or a complex configuration containing CAA interruptions. By small pool PCR (SP-PCR) analysis of blood DNA from SCA17 patients of distinct racial backgrounds, we quantitatively assessed the instability of these two types of expanded alleles coding for polyglutamine expansions of similar length. Mutation frequency in patients harboring pure CAG repeats is 2- to 3-fold that in patients with CAA interruptions. Interestingly, the pure CAG repeats showed both expansion and deletion while the interrupted repeats exhibited mostly deletion at a significantly lower frequency. These data strongly suggest that repeat configuration is a critical determinant for instability, and CAA interruptions might serve as a limiting element for further expansion of CAG repeats at the SCA17 locus, suggesting a molecular basis for lack of anticipation in SCA17 families with interrupted CAG expansion.
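
The distinction drawn above between a pure CAG configuration and a CAA-interrupted one, both coding for the same polyglutamine length, can be made concrete with a small tract summary. This Python sketch is illustrative only: `describe_tract` is a hypothetical helper, the example tracts are invented and are not real SCA17 alleles, and the input is assumed to be an in-frame tract of CAG and CAA codons only.

```python
def describe_tract(tract: str) -> dict:
    """Summarise an in-frame glutamine-coding tract made of CAG/CAA codons.

    Returns the number of encoded glutamines, the number of CAA
    interruptions, and the longest uninterrupted run of CAG codons.
    """
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    if any(codon not in ("CAG", "CAA") for codon in codons):
        raise ValueError("tract may contain only CAG and CAA codons")

    longest = current = 0
    for codon in codons:
        # A CAA codon resets the running count of consecutive CAG codons.
        current = current + 1 if codon == "CAG" else 0
        longest = max(longest, current)

    return {
        "glutamines": len(codons),
        "caa_interruptions": codons.count("CAA"),
        "longest_pure_cag_run": longest,
    }


if __name__ == "__main__":
    pure = "CAG" * 9                               # hypothetical pure tract
    interrupted = "CAG" * 3 + "CAA" + "CAG" * 5    # hypothetical interrupted tract
    print(describe_tract(pure))
    print(describe_tract(interrupted))
```

Both example tracts encode nine glutamines, yet the longest pure CAG run drops from nine to five with a single CAA interruption. That run length is the DNA-level property the abstract links to instability.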
Puromycin-sensitive aminopeptidase is the major peptidase responsible for digesting polyglutamine sequences released by proteasomes during protein degradation

PubMed Central

Bhutani, N; Venkatraman, P; Goldberg, A L

2007-01-01

Long stretches of glutamine (Q) residues are found in many cellular proteins. Expansion of these polyglutamine (polyQ) sequences is the underlying cause of several neurodegenerative diseases (e.g. Huntington's disease). Eukaryotic proteasomes have been found to digest polyQ sequences in proteins very slowly, or not at all, and to release such potentially toxic sequences for degradation by other peptidases. To identify these key peptidases, we investigated the degradation in cell extracts of model Q-rich fluorescent substrates and peptides containing 10–30 Q's. Their degradation at neutral pH was due to a single aminopeptidase, the puromycin-sensitive aminopeptidase (PSA, cytosol alanyl aminopeptidase). No other known cytosolic aminopeptidase or endopeptidase was found to digest these polyQ peptides. Although tripeptidyl peptidase II (TPPII) exhibited limited activity, studies with specific inhibitors, pure enzymes and extracts of cells treated with siRNA for TPPII or PSA showed PSA to be the rate-limiting activity against polyQ peptides up to 30 residues long. (PSA digests such Q sequences, shorter ones and typical (non-repeating) peptides at similar rates.) Thus, PSA, which is induced in neurons expressing mutant huntingtin, appears critical in preventing the accumulation of polyQ peptides in normal cells, and its activity may influence susceptibility to polyQ diseases. PMID:17318184
Modeling protein homopolymeric repeats: possible polyglutamine structural motifs for Huntington's disease.

PubMed

Lathrop, R H; Casale, M; Tobias, D J; Marsh, J L; Thompson, L M

1998-01-01

We describe a prototype system (Poly-X) for assisting an expert user in modeling protein repeats. Poly-X reduces the large number of degrees of freedom required to specify a protein motif in complete atomic detail. The result is a small number of parameters that are easily understood by, and under the direct control of, a domain expert. The system was applied to the polyglutamine (poly-Q) repeat in the first exon of huntingtin, the gene implicated in Huntington's disease. We present four poly-Q structural motifs: two poly-Q beta-sheet motifs (parallel and antiparallel) that constitute plausible alternatives to a similar previously published poly-Q beta-sheet motif, and two novel poly-Q helix motifs (alpha-helix and pi-helix). To our knowledge, helical forms of polyglutamine have not been proposed before. The motifs suggest that there may be several plausible aggregation structures for the intranuclear inclusion bodies which have been found in diseased neurons, and may help in the effort to understand the structural basis for Huntington's disease.
β-hairpin-mediated nucleation of polyglutamine amyloid formation

PubMed Central

Kar, Karunakar; Hoop, Cody L.; Drombosky, Kenneth W.; Baker, Matthew A.; Kodali, Ravindra; Arduini, Irene; van der Wel, Patrick C. A.; Horne, W. Seth; Wetzel, Ronald

2013-01-01

The conformational preferences of polyglutamine (polyQ) sequences are of major interest because of their central importance in the expanded CAG repeat diseases that include Huntington’s disease (HD). Here we explore the response of various biophysical parameters to the introduction of β-hairpin motifs within polyQ sequences. These motifs (trpzip, disulfide, D-Pro-Gly, Coulombic attraction, L-Pro-Gly) enhance formation rates and stabilities of amyloid fibrils with degrees of effectiveness well-correlated with their known abilities to enhance β-hairpin formation in other peptides. These changes led to decreases in the critical nucleus for amyloid formation from a value of n* = 4 for a simple, unbroken Q23 sequence to approximate unitary n* values for similar length polyQs containing β-hairpin motifs. At the same time, the morphologies, secondary structures, and bioactivities of the resulting fibrils were essentially unchanged from simple polyQ aggregates. In particular, the signature pattern of SSNMR ¹³C Gln resonances that appears to be unique to polyQ amyloid is replicated exactly in fibrils from a β-hairpin polyQ. Importantly, while β-hairpin motifs do produce enhancements in the equilibrium constant for nucleation in aggregation reactions, these Kn* values remain quite low (~10⁻¹⁰) and there is no evidence for significant embellishment of β-structure within the monomer ensemble. The results indicate an important role for β-turns in the nucleation mechanism and structure of polyQ amyloid and have implications for the nature of the toxic species in expanded CAG repeat diseases. PMID:23353826
Explaining the length threshold of polyglutamine aggregation

NASA Astrophysics Data System (ADS)

De Los Rios, Paolo; Hafner, Marc; Pastore, Annalisa

2012-06-01

The existence of a length threshold, of about 35 residues, above which polyglutamine repeats can give rise to aggregation and to pathologies, is one of the hallmarks of polyglutamine neurodegenerative diseases such as Huntington’s disease. The reason why such a minimal length exists at all has remained one of the main open issues in research on the molecular origins of such classes of diseases. Following the seminal proposals of Perutz, most research has focused on the hunt for a special structure, attainable only above the minimal length, able to trigger aggregation. Such a structure has remained elusive and there is growing evidence that it might not exist at all. Here we review some basic polymer and statistical physics facts and show that the existence of a threshold is compatible with the modulation that the repeat length imposes on the association and dissociation rates of polyglutamine polypeptides to and from oligomers. In particular, their dramatically different functional dependence on the length rationalizes the very presence of a threshold and hints at the cellular processes that might be at play, in vivo, to prevent aggregation and the consequent onset of the disease.

A cell-based assay for aggregation inhibitors as therapeutics of polyglutamine-repeat disease and validation in Drosophila

NASA Astrophysics Data System (ADS)

Apostol, Barbara L.; Kazantsev, Alexsey; Raffioni, Simona; Illes, Katalin; Pallos, Judit; Bodai, Laszlo; Slepko, Natalia; Bear, James E.; Gertler, Frank B.; Hersch, Steven; Housman, David E.; Marsh, J. Lawrence; Michels Thompson, Leslie

2003-05-01

The formation of polyglutamine-containing aggregates and inclusions are hallmarks of pathogenesis in Huntington's disease that can be recapitulated in model systems. Although the contribution of inclusions to pathogenesis is unclear, cell-based assays can be used to screen for chemical compounds that affect aggregation and may provide therapeutic benefit. We have developed inducible PC12 cell-culture models to screen for loss of visible aggregates. To test the validity of this approach, compounds that inhibit aggregation in the PC12 cell-based screen were tested in a Drosophila model of polyglutamine-repeat disease. The disruption of aggregation in PC12 cells strongly correlates with suppression of neuronal degeneration in Drosophila. Thus, the engineered PC12 cells coupled with the Drosophila model provide a rapid and effective method to screen and validate compounds.
Repeat expansion and autosomal dominant neurodegenerative disorders: consensus and controversy.

PubMed

Rudnicki, Dobrila D; Margolis, Russell L

2003-08-22

Repeat-expansion mutations cause 13 autosomal dominant neurodegenerative disorders falling into three groups. Huntington's disease (HD), dentatorubral pallidoluysian atrophy (DRPLA), spinal and bulbar muscular atrophy (SBMA), and spinocerebellar ataxias (SCAs) types 1, 2, 3, 7 and 17 are each caused by a CAG repeat expansion that encodes polyglutamine. Convergent lines of evidence demonstrate that neurodegeneration in these diseases is a consequence of the neurotoxic effects of abnormally long stretches of glutamines. How polyglutamine induces neurodegeneration, and why neurodegeneration occurs in only select neuronal populations, remains a matter of intense investigation. SCA6 is caused by a CAG repeat expansion in CACNA1A, a gene that encodes a subunit of the P/Q-type calcium channel. The threshold length at which the repeat causes disease is much shorter than in the other polyglutamine diseases, and neurodegeneration may arise from expansion-induced change of function in the calcium channel. Huntington's disease-like 2 (HDL2) and SCAs 8, 10 and 12 are rare disorders in which the repeats (CAG, CTG or ATTCT) are not in protein-coding regions. Investigation into these diseases is still at an early stage, but it is now reasonable to hypothesise that the net effect of each expansion is to alter gene expression. The different pathogenic mechanisms in these three groups of diseases have important implications for the development of rational therapeutics.
Large Polyglutamine Repeats Cause Muscle Degeneration in SCA17 Mice

PubMed Central

Huang, Shanshan; Yang, Su; Guo, Jifeng; Yan, Sen; Gaertig, Marta A.; Li, Shihua; Li, Xiao-Jiang

2015-01-01

SUMMARY In polyglutamine (polyQ) diseases, large polyQ repeats cause juvenile cases with different symptoms than adult-onset patients, who carry smaller expanded polyQ repeats. The mechanisms behind the differential pathology mediated by different polyQ repeat lengths remain unknown. By studying knock-in mouse models of spinocerebellar ataxia 17 (SCA17), we found that a large polyQ (105 glutamines) in the TATA box-binding protein (TBP) preferentially causes muscle degeneration and reduces the expression of muscle-specific genes. Direct expression of TBP with different polyQ repeats in mouse muscle revealed that muscle degeneration is mediated only by the large polyQ repeats. Different polyQ repeats differentially alter TBP’s interaction with neuronal and muscle-specific transcription factors. As a result, the large polyQ repeat decreases the association of MyoD with TBP and DNA promoters. Our findings suggest that specific alterations in protein interactions by large polyQ repeats may account for the unique pathology in juvenile polyQ diseases. PMID:26387956
The interaction of polyglutamine peptides with lipid membranes is regulated by flanking sequences associated with huntingtin.

PubMed

Burke, Kathleen A; Kauffman, Karlina J; Umbaugh, C Samuel; Frey, Shelli L; Legleiter, Justin

2013-05-24

Huntington disease (HD) is caused by an expanded polyglutamine (poly(Q)) repeat near the N terminus of the huntingtin (htt) protein. Expanded poly(Q) facilitates formation of htt aggregates, eventually leading to deposition of cytoplasmic and intranuclear inclusion bodies containing htt. Flanking sequences directly adjacent to the poly(Q) domain, such as the first 17 amino acids on the N terminus (Nt17) and the polyproline (poly(P)) domain on the C-terminal side of the poly(Q) domain, heavily influence aggregation. Additionally, htt interacts with a variety of membranous structures within the cell, and Nt17 is implicated in lipid binding. To investigate the interaction between htt exon1 and lipid membranes, a combination of in situ atomic force microscopy, Langmuir trough techniques, and vesicle permeability assays was used to directly monitor the interaction of a variety of synthetic poly(Q) peptides with different combinations of flanking sequences (KK-Q35-KK, KK-Q35-P10-KK, Nt17-Q35-KK, and Nt17-Q35-P10-KK) on model membranes and surfaces. Each peptide aggregated on mica, predominately forming extended, fibrillar aggregates. In contrast, poly(Q) peptides that lacked the Nt17 domain did not appreciably aggregate on or insert into lipid membranes. Nt17 facilitated the interaction of peptides with lipid surfaces, whereas the poly(P) region enhanced this interaction. The aggregation of Nt17-Q35-P10-KK on the lipid bilayer closely resembled that of a htt exon1 construct containing 35 repeat glutamines. Collectively, these data suggest that the Nt17 domain plays a critical role in htt binding and aggregation on lipid membranes, and this lipid/htt interaction can be further modulated by the presence of the poly(P) domain.
Molecular Dynamics Study of the Solubility Curve of Polyglutamine for the PLUM Model.

NASA Astrophysics Data System (ADS)

Kutlu, Songul; Haaga, Jason; Gunton, James D.

A recent study by Crick et al. determined the saturation (solubility) curve for polyglutamine (PolyQ) for several different repeat lengths, n, of Qn, and for different flanking sequences, such as K2. The degree of supersaturation, S = ln(Co/Ce), where Co and Ce are the metastable and equilibrium saturation monomer concentrations, respectively, plays a crucial role in the kinetics of aggregation of misfolded proteins containing polyQ. Thus the degree of supersaturation is an important factor in diseases such as Huntington's disease for which polyQ is a major component. We present here preliminary results of a molecular dynamics study for the solubility curve for a PLUM model of Q10. (An extensive study of the kinetics of aggregation for this model is being carried out in a separate study.) Our results display a normal solubility curve behavior, with the saturation concentration increasing with increasing temperature. This is only in partial qualitative agreement with the experimental results, which show a retrograde behavior at low temperatures. We are extending this study to other repeat lengths, including Q40. This work is supported by the G. Harold and Leila Y. Mathers Foundation and used an allocation of time from XSEDE.
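The supersaturation defined in this abstract, S = ln(Co/Ce), is straightforward to evaluate. The short Python sketch below is only an illustration: the function name and the numeric concentrations are hypothetical and are not taken from the study, and both concentrations are assumed to be expressed in the same units.

```python
import math


def supersaturation(c0, ce):
    """Return S = ln(C0 / Ce) for two concentrations in the same units.

    c0: metastable monomer concentration (hypothetical input)
    ce: equilibrium saturation monomer concentration (hypothetical input)
    """
    if c0 <= 0 or ce <= 0:
        raise ValueError("concentrations must be positive")
    return math.log(c0 / ce)


# Illustrative values only; they do not come from the abstract.
print(supersaturation(30.0, 3.0))  # C0 above Ce gives a positive S
print(supersaturation(3.0, 3.0))   # C0 equal to Ce gives S = 0.0
```

Under this definition S is positive whenever the monomer concentration exceeds the saturation concentration, and zero exactly at saturation.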
Structure prediction of polyglutamine disease proteins: comparison of methods

PubMed Central

2014-01-01

Background The expansion of polyglutamine (poly-Q) repeats in several unrelated proteins is associated with at least ten neurodegenerative diseases. The length of the poly-Q regions plays an important role in the progression of the diseases. The number of glutamines (Q) is inversely related to the onset age of these polyglutamine diseases, and the expansion of poly-Q repeats has been associated with protein misfolding. However, very little is known about the structural changes induced by the expansion of the repeats. Computational methods can provide an alternative to determine the structure of these poly-Q proteins, but it is important to evaluate their performance before large scale prediction work is done. Results In this paper, two popular protein structure prediction programs, I-TASSER and Rosetta, have been used to predict the structure of the N-terminal fragment of a protein associated with Huntington's disease with 17 glutamines. Results show that both programs have the ability to find the native structures, but I-TASSER performs better for the overall task. Conclusions Both I-TASSER and Rosetta can be used for structure prediction of proteins with poly-Q repeats. Knowledge of poly-Q structure may significantly contribute to development of therapeutic strategies for poly-Q diseases. PMID:25080018
Folding of polyglutamine chains

NASA Astrophysics Data System (ADS)

Chopra, Manan; Reddy, Allam S.; Abbott, N. L.; de Pablo, J. J.

2008-10-01

Long polyglutamine chains have been associated with a number of neurodegenerative diseases. These include Huntington's disease, where expanded polyglutamine (PolyQ) sequences longer than 36 residues are correlated with the onset of symptoms. In this paper we study the folding pathway of a 54-residue PolyQ chain into a β-helical structure. Transition path sampling Monte Carlo simulations are used to generate unbiased reactive pathways between unfolded configurations and the folded β-helical structure of the polyglutamine chain. The folding process is examined in both explicit water and an implicit solvent. Both models reveal that the formation of a few critical contacts is necessary and sufficient for the molecule to fold. Once the primary contacts are formed, the fate of the protein is sealed and it is largely committed to fold. We find that, consistent with emerging hypotheses about PolyQ aggregation, a stable β-helical structure could serve as the nucleus for subsequent polymerization of amyloid fibrils. Our results indicate that PolyQ sequences shorter than 36 residues cannot form that nucleus, and it is also shown that specific mutations inferred from an analysis of the simulated folding pathway exacerbate its stability.
DNA repair pathways underlie a common genetic mechanism modulating onset in polyglutamine diseases

PubMed Central

Bettencourt, Conceição; Hensman‐Moss, Davina; Flower, Michael; Wiethoff, Sarah; Brice, Alexis; Goizet, Cyril; Stevanin, Giovanni; Koutsis, Georgios; Karadima, Georgia; Panas, Marios; Yescas‐Gómez, Petra; García‐Velázquez, Lizbeth Esmeralda; Alonso‐Vilatela, María Elisa; Lima, Manuela; Raposo, Mafalda; Traynor, Bryan; Sweeney, Mary; Wood, Nicholas; Giunti, Paola; Durr, Alexandra; Holmans, Peter; Houlden, Henry; Tabrizi, Sarah J.

2016-01-01

Objective The polyglutamine diseases, including Huntington's disease (HD) and multiple spinocerebellar ataxias (SCAs), are among the commonest hereditary neurodegenerative diseases. They are caused by expanded CAG tracts, encoding glutamine, in different genes. Longer CAG repeat tracts are associated with earlier ages at onset, but this does not account for all of the difference, and the existence of additional genetic modifying factors has been suggested in these diseases. A recent genome‐wide association study (GWAS) in HD found association between age at onset and genetic variants in DNA repair pathways, and we therefore tested whether the modifying effects of variants in DNA repair genes have wider effects in the polyglutamine diseases. Methods We assembled an independent cohort of 1,462 subjects with HD and polyglutamine SCAs, and genotyped single‐nucleotide polymorphisms (SNPs) selected from the most significant hits in the HD study. Results In the analysis of DNA repair genes as a group, we found the most significant association with age at onset when grouping all polyglutamine diseases (HD+SCAs; p = 1.43 × 10⁻⁵). In individual SNP analysis, we found significant associations for rs3512 in FAN1 with HD+SCAs (p = 1.52 × 10⁻⁵) and all SCAs (p = 2.22 × 10⁻⁴) and rs1805323 in PMS2 with HD+SCAs (p = 3.14 × 10⁻⁵), all in the same direction as in the HD GWAS. Interpretation We show that DNA repair genes significantly modify age at onset in HD and SCAs, suggesting a common pathogenic mechanism, which could operate through the observed somatic expansion of repeats that can be modulated by genetic manipulation of DNA repair in disease models. This offers novel therapeutic opportunities in multiple diseases. Ann Neurol 2016;79:983–990 PMID:27044000
A new Caenorhabditis elegans model of human huntingtin 513 aggregation and toxicity in body wall muscles.

PubMed

Lee, Amy L; Ung, Hailey M; Sands, L Paul; Kikis, Elise A

2017-01-01

Expanded polyglutamine repeats in different proteins are the known determinants of at least nine progressive neurodegenerative disorders whose symptoms include cognitive and motor impairment that worsen as patients age. One such disorder is Huntington's Disease (HD) that is caused by a polyglutamine expansion in the human huntingtin protein (htt). The polyglutamine expansion destabilizes htt leading to protein misfolding, which in turn triggers neurodegeneration and the disruption of energy metabolism in muscle cells. However, the molecular mechanisms that underlie htt proteotoxicity have been somewhat elusive, and the muscle phenotypes have not been well studied. To generate tools to elucidate the basis for muscle dysfunction, we engineered Caenorhabditis elegans to express a disease-associated 513 amino acid fragment of human htt in body wall muscle cells. We show that this htt fragment aggregates in C. elegans in a polyglutamine length-dependent manner and is toxic. Toxicity manifests as motor impairment and a shortened lifespan. Compared to previous models, the data suggest that the protein context in which a polyglutamine tract is embedded alters aggregation propensity and toxicity, likely by affecting interactions with the muscle cell environment.
PolyQ repeat expansions in ATXN2 associated with ALS are CAA interrupted repeats.

PubMed

Yu, Zhenming; Zhu, Yongqing; Chen-Plotkin, Alice S; Clay-Falcone, Dana; McCluskey, Leo; Elman, Lauren; Kalb, Robert G; Trojanowski, John Q; Lee, Virginia M-Y; Van Deerlin, Vivianna M; Gitler, Aaron D; Bonini, Nancy M

2011-03-29

Amyotrophic lateral sclerosis (ALS) is a devastating, rapidly progressive disease leading to paralysis and death. Recently, intermediate length polyglutamine (polyQ) repeats of 27-33 in ATAXIN-2 (ATXN2), encoding the ATXN2 protein, were found to increase risk for ALS. In ATXN2, polyQ expansions of ≥ 34, which are pure CAG repeat expansions, cause spinocerebellar ataxia type 2. However, similar length expansions that are interrupted with other codons, can present atypically with parkinsonism, suggesting that configuration of the repeat sequence plays an important role in disease manifestation in ATXN2 polyQ expansion diseases. Here we determined whether the expansions in ATXN2 associated with ALS were pure or interrupted CAG repeats, and defined single nucleotide polymorphisms (SNPs) rs695871 and rs695872 in exon 1 of the gene, to assess haplotype association. We found that the expanded repeat alleles of 40 ALS patients and 9 long-repeat length controls were all interrupted, bearing 1-3 CAA codons within the CAG repeat. 21/21 expanded ALS chromosomes with 3CAA interruptions arose from one haplotype (GT), while 18/19 expanded ALS chromosomes with <3CAA interruptions arose from a different haplotype (CC). Moreover, age of disease onset was significantly earlier in patients bearing 3 interruptions vs fewer, and was distinct between haplotypes. These results indicate that CAG repeat expansions in ATXN2 associated with ALS are uniformly interrupted repeats and that the nature of the repeat sequence and haplotype, as well as length of polyQ repeat, may play a role in the neurological effect conferred by expansions in ATXN2.
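The distinction this abstract draws between pure and CAA-interrupted CAG tracts can be illustrated with a small Python sketch. It counts glutamine codons and CAA interruptions in a DNA repeat tract and applies the length ranges quoted in the abstract (27-33 and ≥34). The function name, the returned labels, and the example sequence are hypothetical; this is not the genotyping procedure used by the authors.

```python
# CAG and CAA both encode glutamine, so every codon adds one Q to the tract.
GLUTAMINE_CODONS = {"CAG", "CAA"}


def describe_polyq_tract(tract):
    """Summarize a DNA repeat tract made only of CAG/CAA codons."""
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    unexpected = sorted(set(codons) - GLUTAMINE_CODONS)
    if unexpected:
        raise ValueError(f"non-glutamine codons in tract: {unexpected}")

    n = len(codons)
    caa = codons.count("CAA")
    # Length ranges as quoted in the abstract; shorter tracts get a neutral label.
    if n >= 34:
        length_class = "expanded (>=34)"
    elif n >= 27:
        length_class = "intermediate (27-33)"
    else:
        length_class = "below 27"
    return {
        "glutamines": n,
        "caa_interruptions": caa,
        "pure_cag": caa == 0,
        "length_class": length_class,
    }


# Hypothetical 30-codon tract carrying two CAA interruptions.
example = "CAG" * 13 + "CAA" + "CAG" * 8 + "CAA" + "CAG" * 7
print(describe_polyq_tract(example))
```

Because CAA and CAG are synonymous, an interrupted tract and a pure tract of the same codon count encode the same number of glutamines; only the DNA and RNA sequence differs.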
Chaperones in Polyglutamine Aggregation: Beyond the Q-Stretch

PubMed Central

Kuiper, E. F. E.; de Mattos, Eduardo P.; Jardim, Laura B.; Kampinga, Harm H.; Bergink, Steven

2017-01-01

Expanded polyglutamine (polyQ) stretches in at least nine unrelated proteins lead to inherited neuronal dysfunction and degeneration. The expansion size in all diseases correlates with age at onset (AO) of disease and with polyQ protein aggregation, indicating that the expanded polyQ stretch is the main driving force for the disease onset. Interestingly, there is marked interpatient variability in expansion thresholds for a given disease. Between different polyQ diseases the repeat length vs. AO also indicates the existence of modulatory effects on aggregation of the upstream and downstream amino acid sequences flanking the Q expansion. This can be either due to intrinsic modulation of aggregation by the flanking regions, or due to differential interaction with other proteins, such as the components of the cellular protein quality control network. Indeed, several lines of evidence suggest that molecular chaperones have impact on the handling of different polyQ proteins. Here, we review factors differentially influencing polyQ aggregation: the Q-stretch itself, modulatory flanking sequences, interaction partners, cleavage of polyQ-containing proteins, and post-translational modifications, with a special focus on the role of molecular chaperones. By discussing typical examples of how these factors influence aggregation, we provide more insight on the variability of AO between different diseases as well as within the same polyQ disorder, on the molecular level. PMID:28386214
New RNAi strategy for selective suppression of a mutant allele in polyglutamine disease.

PubMed

Kubodera, Takayuki; Yokota, Takanori; Ishikawa, Kinya; Mizusawa, Hidehiro

2005-12-01

In gene therapy of dominantly inherited diseases with small interfering RNA (siRNA), mutant allele specific suppression may be necessary for diseases in which the defective gene normally has an important role. It is difficult, however, to design a mutant allele-specific siRNA for trinucleotide repeat diseases in which the difference of sequences is only repeat length. To overcome this problem, we use a new RNA interference (RNAi) strategy for selective suppression of mutant alleles. Both mutant and wild-type alleles are inhibited by the most effective siRNA, and wild-type protein is restored using the wild-type mRNA modified to be resistant to the siRNA. Here, we applied this method to spinocerebellar ataxia type 6 (SCA6). We discuss its feasibility and problems for future gene therapy.
Endoplasmic reticulum stress in spinal and bulbar muscular atrophy: a potential target for therapy

PubMed Central

Montague, Karli; Malik, Bilal; Gray, Anna L.; La Spada, Albert R.; Hanna, Michael G.; Szabadkai, Gyorgy

2014-01-01

Spinal and bulbar muscular atrophy is an X-linked degenerative motor neuron disease caused by an abnormal expansion in the polyglutamine encoding CAG repeat of the androgen receptor gene. There is evidence implicating endoplasmic reticulum stress in the development and progression of neurodegenerative disease, including polyglutamine disorders such as Huntington’s disease and in motor neuron disease, where cellular stress disrupts functioning of the endoplasmic reticulum, leading to induction of the unfolded protein response. We examined whether endoplasmic reticulum stress is also involved in the pathogenesis of spinal and bulbar muscular atrophy. Spinal and bulbar muscular atrophy mice that carry 100 pathogenic polyglutamine repeats in the androgen receptor, and develop a late-onset neuromuscular phenotype with motor neuron degeneration, were studied. We observed a disturbance in endoplasmic reticulum-associated calcium homeostasis in cultured embryonic motor neurons from spinal and bulbar muscular atrophy mice, which was accompanied by increased endoplasmic reticulum stress. Furthermore, pharmacological inhibition of endoplasmic reticulum stress reduced the endoplasmic reticulum-associated cell death pathway. Examination of spinal cord motor neurons of pathogenic mice at different disease stages revealed elevated expression of markers for endoplasmic reticulum stress, confirming an increase in this stress response in vivo. Importantly, the most significant increase was detected presymptomatically, suggesting that endoplasmic reticulum stress may play an early and possibly causal role in disease pathogenesis. Our results therefore indicate that the endoplasmic reticulum stress pathway could potentially be a therapeutic target for spinal and bulbar muscular atrophy and related polyglutamine diseases. PMID:24898351
Length and sequence dependence in the association of Huntingtin protein with lipid membranes

NASA Astrophysics Data System (ADS)

Jawahery, Sudi; Nagarajan, Anu; Matysiak, Silvina

2013-03-01

There is a fundamental gap in our understanding of how aggregates of mutant Huntingtin protein (htt) with overextended polyglutamine (polyQ) sequences gain the toxic properties that cause Huntington's disease (HD). Experimental studies have shown that the most important step associated with toxicity is the binding of mutant htt aggregates to lipid membranes. Studies have also shown that flanking amino acid sequences around the polyQ sequence directly affect interactions with the lipid bilayer, and that polyQ sequences of greater than 35 glutamine repeats in htt are a characteristic of HD. The key steps that determine how flanking sequences and polyQ length affect the structure of lipid bilayers remain unknown. In this study, we use atomistic molecular dynamics simulations to study the interactions between lipid membranes of varying compositions and polyQ peptides of varying lengths and flanking sequences. We find that overextended polyQ interactions do cause deformation in model membranes, and that the flanking sequences do play a role in intensifying this deformation by altering the shape of the affected regions.
Triplet repeat RNA structure and its role as pathogenic agent and therapeutic target

PubMed Central

Krzyzosiak, Wlodzimierz J.; Sobczak, Krzysztof; Wojciechowska, Marzena; Fiszer, Agnieszka; Mykowska, Agnieszka; Kozlowski, Piotr

2012-01-01

This review presents detailed information about the structure of triplet repeat RNA and addresses the simple sequence repeats of normal and expanded lengths in the context of the physiological and pathogenic roles played in human cells. First, we discuss the occurrence and frequency of various trinucleotide repeats in transcripts and classify them according to the propensity to form RNA structures of different architectures and stabilities. We show that repeats capable of forming hairpin structures are overrepresented in exons, which implies that they may have important functions. We further describe long triplet repeat RNA as a pathogenic agent by presenting human neurological diseases caused by triplet repeat expansions in which mutant RNA gains a toxic function. Prominent examples of these diseases include myotonic dystrophy type 1 and fragile X-associated tremor ataxia syndrome, which are triggered by mutant CUG and CGG repeats, respectively. In addition, we discuss RNA-mediated pathogenesis in polyglutamine disorders such as Huntington's disease and spinocerebellar ataxia type 3, in which expanded CAG repeats may act as an auxiliary toxic agent. Finally, triplet repeat RNA is presented as a therapeutic target. We describe various concepts and approaches aimed at the selective inhibition of mutant transcript activity in experimental therapies developed for repeat-associated diseases. PMID:21908410
DNA repair pathways underlie a common genetic mechanism modulating onset in polyglutamine diseases.

PubMed

Bettencourt, Conceição; Hensman-Moss, Davina; Flower, Michael; Wiethoff, Sarah; Brice, Alexis; Goizet, Cyril; Stevanin, Giovanni; Koutsis, Georgios; Karadima, Georgia; Panas, Marios; Yescas-Gómez, Petra; García-Velázquez, Lizbeth Esmeralda; Alonso-Vilatela, María Elisa; Lima, Manuela; Raposo, Mafalda; Traynor, Bryan; Sweeney, Mary; Wood, Nicholas; Giunti, Paola; Durr, Alexandra; Holmans, Peter; Houlden, Henry; Tabrizi, Sarah J; Jones, Lesley

2016-06-01

The polyglutamine diseases, including Huntington's disease (HD) and multiple spinocerebellar ataxias (SCAs), are among the commonest hereditary neurodegenerative diseases. They are caused by expanded CAG tracts, encoding glutamine, in different genes. Longer CAG repeat tracts are associated with earlier ages at onset, but this does not account for all of the difference, and the existence of additional genetic modifying factors has been suggested in these diseases. A recent genome-wide association study (GWAS) in HD found association between age at onset and genetic variants in DNA repair pathways, and we therefore tested whether the modifying effects of variants in DNA repair genes have wider effects in the polyglutamine diseases. We assembled an independent cohort of 1,462 subjects with HD and polyglutamine SCAs, and genotyped single-nucleotide polymorphisms (SNPs) selected from the most significant hits in the HD study. In the analysis of DNA repair genes as a group, we found the most significant association with age at onset when grouping all polyglutamine diseases (HD+SCAs; p = 1.43 × 10⁻⁵). In individual SNP analysis, we found significant associations for rs3512 in FAN1 with HD+SCAs (p = 1.52 × 10⁻⁵) and all SCAs (p = 2.22 × 10⁻⁴) and rs1805323 in PMS2 with HD+SCAs (p = 3.14 × 10⁻⁵), all in the same direction as in the HD GWAS. We show that DNA repair genes significantly modify age at onset in HD and SCAs, suggesting a common pathogenic mechanism, which could operate through the observed somatic expansion of repeats that can be modulated by genetic manipulation of DNA repair in disease models. This offers novel therapeutic opportunities in multiple diseases. Ann Neurol 2016;79:983-990. © 2016 The Authors. Annals of Neurology published by Wiley Periodicals, Inc. on behalf of American Neurological Association.
The Role of the Immune System in Triplet Repeat Expansion Diseases

PubMed Central

Urbanek, Martyna O.; Krzyzosiak, Wlodzimierz J.

2015-01-01

Trinucleotide repeat expansion disorders (TREDs) are a group of dominantly inherited neurological diseases caused by the expansion of unstable repeats in specific regions of the associated genes. Expansion of CAG repeat tracts in translated regions of the respective genes results in polyglutamine- (polyQ-) rich proteins that form intracellular aggregates that affect numerous cellular activities. Recent evidence suggests the involvement of an RNA toxicity component in polyQ expansion disorders, thus increasing the complexity of the pathogenic processes. Neurodegeneration, accompanied by reactive gliosis and astrocytosis is the common feature of most TREDs, which may suggest involvement of inflammation in pathogenesis. Indeed, a number of immune response markers have been observed in the blood and CNS of patients and mouse models, and the activation of these markers was even observed in the premanifest stage of the disease. Although inflammation is not an initiating factor of TREDs, growing evidence indicates that inflammatory responses involving astrocytes, microglia, and the peripheral immune system may contribute to disease progression. Herein, we review the involvement of the immune system in the pathogenesis of triplet repeat expansion diseases, with particular emphasis on polyglutamine disorders. We also present various therapeutic approaches targeting the dysregulated inflammation pathways in these diseases. PMID:25873774
Differential Occurrence of Interactions and Interaction Domains in Proteins Containing Homopolymeric Amino Acid Repeats

PubMed Central

Pelassa, Ilaria; Fiumara, Ferdinando

2015-01-01

Homopolymeric amino acid repeats (AARs), which are widespread in proteomes, have often been viewed simply as spacers between protein domains, or even as “junk” sequences with no obvious function but with a potential to cause harm upon expansion as in genetic diseases associated with polyglutamine or polyalanine expansions, including Huntington disease and cleidocranial dysplasia. A growing body of evidence indicates, however, that at least some AARs can form organized, functional protein structures, and can regulate protein function. In particular, certain AARs can mediate protein-protein interactions, either through homotypic AAR-AAR contacts or through heterotypic contacts with other protein domains. It is still unclear, however, whether AARs may have a generalized, proteome-wide role in shaping protein-protein interaction networks. Therefore, we have undertaken here a bioinformatics screening of the human proteome and interactome in search of quantitative evidence of such a role. We first identified the sets of proteins that contain repeats of any one of the 20 amino acids, as well as control sets of proteins chosen at random in the proteome. We then analyzed the connectivity between the proteins of the AAR-containing protein sets and we compared it with that observed in the corresponding control networks. We find evidence for different degrees of connectivity in the different AAR-containing protein networks. Indeed, networks of proteins containing polyglutamine, polyglutamate, polyproline, and other AARs show significantly increased levels of connectivity, whereas networks containing polyleucine and other hydrophobic repeats show lower degrees of connectivity. Furthermore, we observed that numerous protein-protein, -nucleic acid, and -lipid interaction domains are significantly enriched in specific AAR protein groups. These findings support the notion of a generalized, combinatorial role of AARs, together with conventional protein interaction domains, in shaping the interaction networks of the human proteome, and define proteome-wide knowledge that may guide the informed biological exploration of the role of AARs in protein interactions. PMID:26734058
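The first step described in this abstract, identifying proteins that contain a homopolymeric run of a given amino acid, can be sketched with a regular expression. The Python example below is a minimal illustration under stated assumptions: the function name, the example sequence, and the minimum run length of 5 are hypothetical, and the abstract does not state which threshold or tooling the authors used.

```python
import re


def find_homopolymeric_repeats(sequence, min_length=5):
    """Return (residue, start_index, run_length) for each run of a single
    amino acid that is at least min_length residues long.

    min_length is an assumed, adjustable threshold, not a value from the study.
    """
    if min_length < 2:
        raise ValueError("min_length must be at least 2")
    # One captured letter followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start(), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical sequence with a Q8 run and an A5 run.
sequence = "MKT" + "Q" * 8 + "PLG" + "A" * 5 + "GS"
print(find_homopolymeric_repeats(sequence))
```

Grouping proteins by the residue reported in each tuple would give per-amino-acid sets comparable in spirit to the AAR-containing protein sets the abstract describes; the network analysis itself is outside the scope of this sketch.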
Quantification Assays for Total and Polyglutamine-Expanded Huntingtin Proteins

PubMed Central

Boogaard, Ivette; Smith, Melanie; Pulli, Kristiina; Szynol, Agnieszka; Albertus, Faywell; Lamers, Marieke B. A. C.; Dijkstra, Sipke; Kordt, Daniel; Reindl, Wolfgang; Herrmann, Frank; McAllister, George; Fischer, David F.; Munoz-Sanjuan, Ignacio

2014-01-01

The expansion of a CAG trinucleotide repeat in the huntingtin gene, which produces huntingtin protein with an expanded polyglutamine tract, is the cause of Huntington's disease (HD). Recent studies have reported that RNAi suppression of polyglutamine-expanded huntingtin (mutant HTT) in HD animal models can ameliorate disease phenotypes. A key requirement for such preclinical studies, as well as eventual clinical trials, aimed to reduce mutant HTT exposure is a robust method to measure HTT protein levels in select tissues. We have developed several sensitive and selective assays that measure either total human HTT or polyglutamine-expanded human HTT proteins on the electrochemiluminescence Meso Scale Discovery detection platform with an increased dynamic range over other methods. In addition, we have developed an assay to detect endogenous mouse and rat HTT proteins in pre-clinical models of HD to monitor effects on the wild type protein of both allele selective and non-selective interventions. We demonstrate the application of these assays to measure HTT protein in several HD in vitro cellular and in vivo animal model systems as well as in HD patient biosamples. Furthermore, we used purified recombinant HTT proteins as standards to quantitate the absolute amount of HTT protein in such biosamples. PMID:24816435
Nanoscale studies link amyloid maturity with polyglutamine diseases onset

NASA Astrophysics Data System (ADS)

Ruggeri, F. S.; Vieweg, S.; Cendrowska, U.; Longo, G.; Chiki, A.; Lashuel, H. A.; Dietler, G.

2016-08-01

The presence of expanded poly-glutamine (polyQ) repeats in proteins is directly linked to the pathogenesis of several neurodegenerative diseases, including Huntington’s disease. However, the molecular and structural basis underlying the increased toxicity of aggregates formed by proteins containing expanded polyQ repeats remains poorly understood, in part due to the size and morphological heterogeneity of the aggregates they form in vitro. To address this knowledge gap and technical limitations, we investigated the structural, mechanical and morphological properties of fibrillar aggregates at the single molecule and nanometer scale using the first exon of the Huntingtin protein as a model system (Exon1). Our findings demonstrate a direct correlation of the morphological and mechanical properties of Exon1 aggregates with their structural organization at the single aggregate and nanometric scale and provide novel insights into the molecular and structural basis of Huntingtin Exon1 aggregation and toxicity.

Insights into the Aggregation Mechanism of PolyQ Proteins with Different Glutamine Repeat Lengths.

PubMed

Yushchenko, Tetyana; Deuerling, Elke; Hauser, Karin

2018-04-24

Polyglutamine (polyQ) diseases, including Huntington's disease, result from the aggregation of an abnormally expanded polyQ repeat in the affected protein. The length of the polyQ repeat is essential for the disease's onset; however, the molecular mechanism of polyQ aggregation is still poorly understood. Controlled conditions and initiation of the aggregation process are prerequisites for the detection of transient intermediate states. We present an attenuated total reflection Fourier-transform infrared spectroscopic approach combined with protein immobilization to study polyQ aggregation dependent on the polyQ length. PolyQ proteins were engineered mimicking the mammalian N-terminus fragment of the Huntingtin protein and containing a polyQ sequence with the number of glutamines below (Q11), close to (Q38), and above (Q56) the disease threshold. A monolayer of the polyQ construct was chemically immobilized on the internal reflection element of the attenuated total reflection cell, and the aggregation was initiated via enzymatic cleavage. Structural changes of the polyQ sequence were monitored by time-resolved infrared difference spectroscopy. We observed faster aggregation kinetics for the longer sequences, and furthermore, we could distinguish β-structured intermediates for the different constructs, allowing us to propose aggregation mechanisms dependent on the repeat length. Q11 forms a β-structured aggregate by intermolecular interaction of stretched monomers, whereas Q38 and Q56 undergo conformational changes to various β-structured intermediates, including intramolecular β-sheets. Copyright © 2018 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Proteins containing expanded polyglutamine tracts and neurodegenerative disease

PubMed Central

Adegbuyiro, Adewale; Sedighi, Faezeh; Pilkington, Albert W.; Groover, Sharon; Legleiter, Justin

2017-01-01

Several hereditary neurological and neuromuscular diseases are caused by an abnormal expansion of trinucleotide repeats. To date, there have been ten of these trinucleotide repeat disorders associated with an expansion of the codon CAG encoding glutamine (Q). For these polyglutamine (polyQ) diseases, there is a critical threshold length of the CAG repeat required for disease, and further expansion beyond this threshold is correlated with age of onset and symptom severity. PolyQ expansion in the translated proteins promotes their self-assembly into a variety of oligomeric and fibrillar aggregate species that accumulate into the hallmark proteinaceous inclusion bodies associated with each disease. Here, we review aggregation mechanisms of proteins with expanded polyQ tracts, structural consequences of expanded polyQ ranging from monomers to fibrillar aggregates, the impact of protein context and post-translational modifications on aggregation, and a potential role for lipid membranes in aggregation. As the pathogenic mechanisms that underlie these disorders are often classified as either a gain of toxic function or loss of normal protein function, some toxic mechanisms associated with mutant polyQ tracts will also be discussed. PMID:28170216
Spinocerebellar ataxia type 6 knockin mice develop a progressive neuronal dysfunction with age-dependent accumulation of mutant CaV2.1 channels

PubMed Central

Watase, Kei; Barrett, Curtis F.; Miyazaki, Taisuke; Ishiguro, Taro; Ishikawa, Kinya; Hu, Yuanxin; Unno, Toshinori; Sun, Yaling; Kasai, Sayumi; Watanabe, Masahiko; Gomez, Christopher M.; Mizusawa, Hidehiro; Tsien, Richard W.; Zoghbi, Huda Y.

2008-01-01

Spinocerebellar ataxia type 6 (SCA6) is a neurodegenerative disorder caused by CAG repeat expansions within the voltage-gated calcium (CaV) 2.1 channel gene. It remains controversial whether the mutation exerts neurotoxicity by changing the function of CaV2.1 channel or through a gain-of-function mechanism associated with accumulation of the expanded polyglutamine protein. We generated three strains of knockin (KI) mice carrying normal, expanded, or hyperexpanded CAG repeat tracts in the Cacna1a locus. The mice expressing hyperexpanded polyglutamine (Sca6 84Q) developed progressive motor impairment and aggregation of mutant CaV2.1 channels. Electrophysiological analysis of cerebellar Purkinje cells revealed similar Ca2+ channel current density among the three KI models. Neither voltage sensitivity of activation nor inactivation was altered in the Sca6 84Q neurons, suggesting that expanded CAG repeat per se does not affect the intrinsic electrophysiological properties of the channels. The pathogenesis of SCA6 is apparently linked to an age-dependent process accompanied by accumulation of mutant CaV2.1 channels. PMID:18687887
Simple Sequence Repeats Provide a Substrate for Phenotypic Variation in the Neurospora crassa Circadian Clock

PubMed Central

Michael, Todd P.; Park, Sohyun; Kim, Tae-Sung; Booth, Jim; Byer, Amanda; Sun, Qi; Chory, Joanne; Lee, Kwangwon

2007-01-01

Background: WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. Methodology/Principal Findings: Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. Conclusions/Significance: Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments. PMID:17726525
Computational study of the fibril organization of polyglutamine repeats reveals a common motif identified in beta-helices.

PubMed

Zanuy, David; Gunasekaran, Kannan; Lesk, Arthur M; Nussinov, Ruth

2006-04-21

The formation of fibril aggregates by long polyglutamine sequences is assumed to play a major role in neurodegenerative diseases such as Huntington's disease. Here, we model peptides rich in glutamine through a series of molecular dynamics simulations. Starting from a rigid nanotube-like conformation, we have obtained a new conformational template that shares structural features of a tubular helix and of a beta-helix conformational organization. Our new model can be described as a super-helical arrangement of flat beta-sheet segments linked by planar turns or bends. Interestingly, our comprehensive analysis of the Protein Data Bank reveals that this is a common motif in beta-helices (termed beta-bend), although it has not been identified so far. The motif is based on the alternation of beta-sheet and helical conformation as the protein sequence is followed from the N to the C termini (beta-alpha(R)-beta-polyPro-beta). We further identify this motif in the ssNMR structure of the protofibril of the amyloidogenic peptide Abeta(1-40). The recurrence of the beta-bend suggests a general mode of connecting long parallel beta-sheet segments that would allow the growth of partially ordered fibril structures. The design allows the peptide backbone to change direction with a minimal loss of main chain hydrogen bonds. The identification of a coherent organization beyond that of the beta-sheet segments in different folds rich in parallel beta-sheets suggests a higher degree of ordered structure in protein fibrils, in agreement with their low solubility and dense molecular packing.
Huntingtin protein: A new option for fixing the Huntington's disease countdown clock.

PubMed

Caterino, Marco; Squillaro, Tiziana; Montesarchio, Daniela; Giordano, Antonio; Giancola, Concetta; Melone, Mariarosa A B

2018-06-01

Huntington's disease is a dreadful, incurable disorder. It springs from the autosomal dominant mutation in the first exon of the HTT gene, which encodes for the huntingtin protein (HTT) and results in progressive neurodegeneration. Thus far, all the attempted approaches to tackle the mutant HTT-induced toxicity causing this disease have failed. The mutant protein comes with the aberrantly expanded poly-glutamine tract. It is primarily to blame for the build-up of β-amyloid-like HTT aggregates, deleterious once broadened beyond the critical ∼35-37 repeats threshold. Recent experimental findings have provided valuable information on the molecular basis underlying this HTT-driven neurodegeneration. These findings indicate that the poly-glutamine siding regions and many post-translation modifications either abet or counter the poly-glutamine tract. This review provides an overall, up-to-date insight into HTT biophysics and structural biology, particularly discussing novel pharmacological options to specifically target the mutated protein and thus inhibit its functions and toxicity. Copyright © 2018 Elsevier Ltd. All rights reserved.
Neuronal intranuclear inclusions are ultrastructurally and immunologically distinct from cytoplasmic inclusions of neuronal intermediate filament inclusion disease

PubMed Central

Mosaheb, Sabrina; Thorpe, Julian R.; Hashemzadeh-Bonehi, Lida; Bigio, Eileen H.; Gearing, Marla; Cairns, Nigel J.

2006-01-01

Abnormal neuronal cytoplasmic inclusions (NCIs) containing aggregates of α-internexin and the neurofilament (NF) subunits, NF-H, NF-M, and NF-L, are the signature lesions of neuronal intermediate filament (IF) inclusion disease (NIFID). The disease has a clinically heterogeneous phenotype, including fronto-temporal dementia, pyramidal and extrapyramidal signs presenting at a young age. NCIs are variably ubiquitinated and about half of cases also have neuronal intranuclear inclusions (NIIs), which are also ubiquitinated. NIIs have been described in polyglutamine-repeat expansion diseases, where they are strongly ubiquitin immunoreactive. The fine structure of NIIs of NIFID has not previously been described. Therefore, to determine the ultrastructure of NIIs, immunoelectron microscopy was undertaken on NIFID cases and normal aged control brains. Our results indicate that the NIIs of NIFID are strongly ubiquitin immunoreactive. However, unlike NCIs which contain ubiquitin, α-internexin and NF epitopes, NIIs contain neither epitopes of α-internexin nor NF subunits. Neither NIIs nor NCIs were recognised by antibodies to expanded polyglutamine repeats. The NII of NIFID lacks a limiting membrane and contains straight filaments of 20 nm mean width (range 11–35 nm), while NCIs contain filaments with a mean width of 10 nm (range 5–18 nm; t-test, P<0.001). Biochemistry revealed no differences in neuronal IF protein mobilities between NIFID and normal brain tissue. Therefore, NIIs of NIFID contain filaments morphologically and immunologically distinct from those of NCIs, and both types of inclusion lack expanded polyglutamine tracts of the triplet-repeat expansion diseases. These observations indicate that abnormal protein aggregation follows separate pathways in different neuronal compartments of NIFID. PMID:16025283
Cell biology of spinocerebellar ataxia.

PubMed

Orr, Harry T

2012-04-16

Ataxia is a neurological disorder characterized by loss of control of body movements. Spinocerebellar ataxia (SCA), previously known as autosomal dominant cerebellar ataxia, is a biologically robust group of close to 30 progressive neurodegenerative diseases. Six SCAs, including the more prevalent SCA1, SCA2, SCA3, and SCA6 along with SCA7 and SCA17 are caused by expansion of a CAG repeat that encodes a polyglutamine tract in the affected protein. How the mutated proteins in these polyglutamine SCAs cause disease is highly debated. Recent work suggests that the mutated protein contributes to pathogenesis within the context of its "normal" cellular function. Thus, understanding the cellular function of these proteins could aid in the development of therapeutics.
Developmental alterations in Huntington’s disease neural cells and pharmacological rescue in cells and mice

PubMed Central

2017-01-01

Neural cultures derived from Huntington’s disease (HD) patient-derived induced pluripotent stem cells were used for ‘omics’ analyses to identify mechanisms underlying neurodegeneration. RNA-seq analysis identified genes in glutamate and GABA signaling, axonal guidance and calcium influx whose expression was decreased in HD cultures. One-third of gene changes were in pathways regulating neuronal development and maturation. When mapped to stages of mouse striatal development, the profiles aligned with earlier embryonic stages of neuronal differentiation. We observed a strong correlation between HD-related histone marks, gene expression and unique peak profiles associated with dysregulated genes, suggesting a coordinated epigenetic program. Treatment with isoxazole-9, which targets key dysregulated pathways, led to amelioration of expanded polyglutamine repeat-associated phenotypes in neural cells and of cognitive impairment and synaptic pathology in HD model R6/2 mice. These data suggest that mutant huntingtin impairs neurodevelopmental pathways that could disrupt synaptic homeostasis and increase vulnerability to the pathologic consequence of expanded polyglutamine repeats over time. PMID:28319609
Aggregation landscapes of Huntingtin exon 1 protein fragments and the critical repeat length for the onset of Huntington’s disease

PubMed Central

Chen, Mingchen; Wolynes, Peter G.

2017-01-01

Huntington’s disease (HD) is a neurodegenerative disease caused by an abnormal expansion in the polyglutamine (polyQ) track of the Huntingtin (HTT) protein. The severity of the disease depends on the polyQ repeat length, arising only in patients with proteins having 36 repeats or more. Previous studies have shown that the aggregation of N-terminal fragments (encoded by HTT exon 1) underlies the disease pathology in mouse models and that the HTT exon 1 gene product can self-assemble into amyloid structures. Here, we provide detailed structural mechanisms for aggregation of several protein fragments encoded by HTT exon 1 by using the associative memory, water-mediated, structure and energy model (AWSEM) to construct their free energy landscapes. We find that the addition of the N-terminal 17-residue sequence (NT17) facilitates polyQ aggregation by encouraging the formation of prefibrillar oligomers, whereas adding the C-terminal polyproline sequence (P10) inhibits aggregation. The combination of both terminal additions in HTT exon 1 fragment leads to a complex aggregation mechanism with a basic core that resembles that found for the aggregation of pure polyQ repeats using AWSEM. At the extrapolated physiological concentration, although the grand canonical free energy profiles are uphill for HTT exon 1 fragments having 20 or 30 glutamines, the aggregation landscape for fragments with 40 repeats has become downhill. This computational prediction agrees with the critical length found for the onset of HD and suggests potential therapies based on blocking early binding events involving the terminal additions to the polyQ repeats. PMID:28400517
Early stage aggregation of a coarse-grained model of polyglutamine

NASA Astrophysics Data System (ADS)

Haaga, Jason; Gunton, J. D.; Buckles, C. Nadia; Rickman, J. M.

2018-01-01

In this paper, we study the early stages of aggregation of a model of polyglutamine (polyQ) for different repeat lengths (number of glutamine amino acid groups in the chain). In particular, we use the Large-scale Atomic/Molecular Massively Parallel Simulator to study a generic coarse-grained model proposed by Bereau and Deserno. We focus on the primary nucleation mechanism involved and find that our results for the initial self-assembly process are consistent with the two-dimensional classical nucleation theory of Kashchiev and Auer. More specifically, we find that with decreasing supersaturation, the oligomer fibril (protofibril) transforms from a one-dimensional β sheet to two-, three-, and higher layer β sheets as the critical nucleus size increases. We also show that the results are consistent with several predictions of their theory, including the dependence of the critical nucleus size on the supersaturation. Our results for the time dependence of the mass aggregation are in reasonable agreement with an approximate analytical solution of the filament theory by Knowles and collaborators that corresponds to an additional secondary nucleation arising from filament fragmentation. Finally, we study the dependence of the critical nucleus size on the repeat length of polyQ. We find that for the larger length polyglutamine chain that we study, the critical nucleus is a monomer, in agreement with experiment and in contrast to the case for the smaller chain, for which the smallest critical nucleus size is four.
Androgen receptor CAG repeat polymorphisms in canine prostate cancer.

PubMed

Lai, C-L; L'Eplattenier, H; van den Ham, R; Verseijden, F; Jagtenberg, A; Mol, J A; Teske, E

2008-01-01

Relatively shorter lengths of the polymorphic polyglutamine repeat-1 of the androgen receptor (AR) have been associated with an increased risk of prostate cancer (PC) in humans. In the dog, there are 2 polymorphic CAG repeat (CAGr) regions. The aim of this study was to investigate the relationship between CAGr length of the canine AR gene and the development of PC. Thirty-two dogs with PC and 172 control dogs were used. DNA was extracted from blood. Both CAG repeats were amplified by polymerase chain reaction (PCR) and PCR products were sequenced. In dogs with PC, CAG-1 repeat length was shorter than in the control dogs (P = .001), with an increased proportion of 10 repeats (P = .011) and no 12 repeats (P = .0017). No significant changes were found in CAG-3 length distribution. CAG-1 and CAG-3 polymorphisms proved not to be in linkage disequilibrium. Breed difference in allelic distribution was found in the control group. Of the prostate-disease sensitive breeds, a high percentage (64.5%) of the shortest haplotype 10/11 was found in the Doberman, whereas Beagles and German Pointers had higher haplotype 12/11 (47.1 and 50%). Bernese Mountain dogs and Bouvier dogs both shared a high percentage of 11 CAG-1 repeats and 13 CAG-3 repeats. Differences in (combined) allelic distributions among breeds were not significant. In this preliminary study, short CAG-1 repeats in the AR gene were associated with an increased risk of developing canine PC. Although breed-specific differences in allelic distribution of CAG-1 and CAG-3 repeats were found, these could not be related to PC risk.
Cell biology of spinocerebellar ataxia

PubMed Central

2012-01-01

Ataxia is a neurological disorder characterized by loss of control of body movements. Spinocerebellar ataxia (SCA), previously known as autosomal dominant cerebellar ataxia, is a biologically robust group of close to 30 progressive neurodegenerative diseases. Six SCAs, including the more prevalent SCA1, SCA2, SCA3, and SCA6 along with SCA7 and SCA17 are caused by expansion of a CAG repeat that encodes a polyglutamine tract in the affected protein. How the mutated proteins in these polyglutamine SCAs cause disease is highly debated. Recent work suggests that the mutated protein contributes to pathogenesis within the context of its “normal” cellular function. Thus, understanding the cellular function of these proteins could aid in the development of therapeutics. PMID:22508507
Geography of the circadian gene clock and photoperiodic response in western North American populations of the three-spined stickleback Gasterosteus aculeatus.

PubMed

O'Brien, C; Unruh, L; Zimmerman, C; Bradshaw, W E; Holzapfel, C M; Cresko, W A

2013-03-01

Controlled laboratory experiments were used to show that Oregon and Alaskan three-spined stickleback Gasterosteus aculeatus, collected from locations differing by 18° of latitude, exhibited no significant variation in length of the polyglutamine domain of the clock protein or in photoperiodic response within or between latitudes despite the fact that male and female G. aculeatus are photoperiodic at both latitudes. Hence, caution is urged when interpreting variation in the polyglutamine repeat (PolyQ) domain of the gene clock in the context of seasonal activities or in relationship to photoperiodism along geographical gradients. © 2013 The Authors. Journal of Fish Biology © 2013 The Fisheries Society of the British Isles.
Partners in crime: bidirectional transcription in unstable microsatellite disease.

PubMed

Batra, Ranjan; Charizanis, Konstantinos; Swanson, Maurice S

2010-04-15

Nearly two decades have passed since the discovery that the expansion of microsatellite trinucleotide repeats is responsible for a prominent class of neurological disorders, including Huntington disease and fragile X syndrome. These hereditary diseases are characterized by genetic anticipation or the intergenerational increase in disease severity accompanied by a decrease in age-of-onset. The revelation that the variable expansion of simple sequence repeats accounted for anticipation spawned a number of pathogenesis models and a flurry of studies designed to reveal the molecular events affected by these expansions. This work led to our current understanding that expansions in protein-coding regions result in extended homopolymeric amino acid tracts, often polyglutamine or polyQ, and deleterious protein gain-of-function effects. In contrast, expansions in noncoding regions cause RNA-mediated toxicity. However, the realization that the transcriptome is considerably more complex than previously imagined, as well as the emerging regulatory importance of antisense RNAs, has blurred this distinction. In this review, we summarize evidence for bidirectional transcription of microsatellite disease genes and discuss recent suggestions that some repeat expansions produce variable levels of both toxic RNAs and proteins that influence cell viability, disease penetrance and pathological severity.
The CAG repeat polymorphism in the Androgen receptor gene modifies the risk for hypospadias in Caucasians

PubMed Central

2012-01-01

Background: Hypospadias is a birth defect of the urethra in males, and a milder form of 46,XY disorder of sexual development (DSD). The disease is characterized by a ventrally placed urinary opening due to a premature fetal arrest of the urethra development. Moreover, the Androgen receptor (AR) gene has an essential role in the hormone-dependent stage of sexual development. In addition, longer AR polyglutamine repeat lengths encoded by CAG repeats are associated with lower transcriptional activity in vitro. In the present study, we aimed at investigating the role of the CAG repeat length in the AR gene in hypospadias cases as compared to the controls. Our study included 211 hypospadias cases and 208 controls of Caucasian origin. Methods: We amplified the CAG repeat region with PCR, and calculated the difference in the mean CAG repeat length between the hypospadias and control group using the T-test for independent groups. Results: We detected a significant increase of the CAG repeat length in the hypospadias cases when compared to the controls (contrast estimate: 2.29, 95% Confidence Interval (1.73-2.84); p-value: 0.001). In addition, the odds ratios between the hypospadias and controls revealed that the hypospadias cases are two to three times as likely to have longer CAG repeats than a shorter length for each repeat length investigated. Conclusions: We have investigated the largest number of hypospadias cases with regards to the CAG repeat length, and we provide evidence that a higher number of CAG repeats in the AR gene has a clear effect on the risk of hypospadias in Caucasians. PMID:23167717
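The comparison of mean CAG repeat length between two independent groups can be sketched as follows. This is an illustration only: it assumes the pooled-variance (Student's) form of the independent-groups t-test, which the abstract does not specify, and the repeat lengths below are hypothetical placeholders, not data from the study.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test.

    Assumes independent groups with equal variances (pooled estimate).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df

# Hypothetical CAG repeat lengths, for illustration only.
hypothetical_cases = [24, 23, 25, 26, 22]
hypothetical_controls = [21, 22, 20, 23, 19]

t_stat, dof = pooled_t_statistic(hypothetical_cases, hypothetical_controls)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The t statistic would then be compared against the t distribution with the reported degrees of freedom to obtain a p-value; that step is omitted here to keep the sketch free of external dependencies.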
Repeat expansion disease: Progress and puzzles in disease pathogenesis

PubMed Central

La Spada, Albert R.; Taylor, J. Paul

2015-01-01

Repeat expansion mutations cause at least 22 inherited neurological diseases. The complexity of repeat disease genetics and pathobiology has revealed unexpected shared themes and mechanistic pathways among the diseases, for example, RNA toxicity. Also, investigation of the polyglutamine diseases has identified post-translational modification as a key step in the pathogenic cascade, and has shown that the autophagy pathway plays an important role in the degradation of misfolded proteins – two themes likely to be relevant to the entire neurodegeneration field. Insights from repeat disease research are catalyzing new lines of study that should not only elucidate molecular mechanisms of disease, but also highlight opportunities for therapeutic intervention for these currently untreatable disorders. PMID:20177426
Expanded polyglutamine embedded in the endoplasmic reticulum causes membrane distortion and coincides with Bax insertion

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ueda, Masashi; Li, Shimo; Itoh, Masanori

The endoplasmic reticulum (ER) is important in various cellular functions, such as secretory and membrane protein biosynthesis, lipid synthesis, and calcium storage. ER stress, including membrane distortion, is associated with many diseases such as Huntington's disease. In particular, nuclear envelope distortion is related to neuronal cell death associated with polyglutamine. However, the mechanism by which polyglutamine causes ER membrane distortion remains unclear. We used electron microscopy, fluorescence protease protection assay, and alkaline treatment to analyze the localization of polyglutamine in cells. We characterized polyglutamine embedded in the ER membrane and noted an effect on morphology, including the dilation of ER luminal space and elongation of ER-mitochondria contact sites, in addition to the distortion of the nuclear envelope. The polyglutamine embedded in the ER membrane was observed at the same time as Bax insertion. These results demonstrated that the ER membrane may be a target of polyglutamine, which triggers cell death through Bax. -- Highlights: • We characterized polyglutamine embedded in the ER membrane. • The polyglutamine embedded in the ER membrane was observed at the same time as Bax insertion. • The ER membrane may be a target of polyglutamine, which triggers cell death.
Effects of the enlargement of poly-glutamine segments on the structure and folding of ataxin-2 and ataxin-3 proteins

PubMed Central

Wen, Jingran; Scoles, Daniel R.; Facelli, Julio C.

2017-01-01

Spinocerebellar ataxia type 2 (SCA2) and type 3 (SCA3) are two common autosomal-dominant inherited ataxia syndromes, both of which are related to the unstable expansion of tri-nucleotide CAG repeats in the coding region of the related ATXN2 and ATXN3 genes, respectively. The poly-glutamine (poly-Q) tract encoded by the CAG repeats has long been recognized as an important factor in disease pathogenesis and progress. In this study, using the I-TASSER method for 3D structure prediction, we investigated the effect of poly-Q tract enlargement on the structure and folding of ataxin-2 and ataxin-3 proteins. Our results show good agreement with the known experimental structures of the Josephin and UIM domains, lending credence to the simulation results presented here, which show that the enlargement of the poly-Q region affects not only the local structure of these regions but also the structures of functional domains and of the whole protein. The changes observed in the predicted models of the UIM domains in ataxin-3 when the poly-Q tract is enlarged provide new insights on possible pathogenic mechanisms. PMID:26861241
Early-Aggregation Studies of Polyglutamine in Solution

NASA Astrophysics Data System (ADS)

Fluitt, Aaron; de Pablo, Juan

2012-02-01

Several neurodegenerative diseases, notably Huntington's disease, are associated with certain proteins containing extended polyglutamine tracts. In all polyglutamine diseases, the age of onset is inversely correlated with the length of the polyglutamine domain beyond some pathological threshold. Diseased cells are characterized by intranuclear inclusions rich in aggregated polyglutamine. Experimental evidence suggests that oligomeric aggregate species, not mature amyloid fibrils, are the species most toxic to the cell. Little is known about the structures and aggregation dynamics of polyglutamine oligomers due to their short lifetimes. A better understanding of the pathway through which polyglutamine peptides form oligomeric aggregates will aid the design of therapies to inhibit their toxic activity. In this work, we report structural characterization of polyglutamine monomers and dimers from atomistic molecular dynamics simulations in explicit water. Umbrella sampling simulations reveal that the stability of the dimer species with respect to the disassociated monomers is an increasing function of the chain length.

Full Length Human Mutant Huntingtin with a Stable Polyglutamine Repeat Can Elicit Progressive and Selective Neuropathogenesis in BACHD Mice

PubMed Central

Gray, Michelle; Shirasaki, Dyna I.; Cepeda, Carlos; Andre, Veronique M.; Wilburn, Brian; Lu, Xiao-Hong; Tao, Jifang; Yamazaki, Irene; Li, Shi-Hua; Sun, Yi E.; Li, Xiao-Jiang; Levine, Michael S.; Yang, X. William

2008-01-01

To elucidate the pathogenic mechanisms in Huntington’s disease (HD) elicited by expression of full-length human mutant huntingtin (fl-mhtt), a Bacterial Artificial Chromosome (BAC)-mediated transgenic mouse model (BACHD) was developed expressing fl-mhtt with 97 glutamine repeats under the control of endogenous htt regulatory machinery on the BAC. BACHD mice exhibit progressive motor deficits, neuronal synaptic dysfunction, and late-onset selective neuropathology, which includes significant cortical and striatal atrophy and striatal dark neuron degeneration. Power analyses reveal the robustness of the behavioral and neuropathological phenotypes, suggesting BACHD as a suitable fl-mhtt mouse model for preclinical studies. Further analyses of BACHD mice provide additional insights into how mhtt may elicit neuropathogenesis. First, unlike prior fl-mhtt mouse models, BACHD mice reveal that the slowly progressive and selective pathogenic process in HD mouse brains can occur without early and diffuse nuclear accumulation of aggregated mhtt (i.e. as detected by immunostaining with the EM48 antibody). Instead, a relatively steady-state level of predominantly full-length mhtt and a small amount of mhtt N-terminal fragments are sufficient to elicit the disease process. Second, the polyglutamine repeat within fl-mhtt in BACHD mice is encoded by a mixed CAA-CAG repeat, which is stable in both the germline and somatic tissues including the cortex and striatum at the onset of neuropathology. Therefore, our results suggest that somatic repeat instability does not play a necessary role in selective neuropathogenesis in BACHD mice. In summary, the BACHD model constitutes a novel and robust in vivo paradigm for the investigation of HD pathogenesis and treatment. PMID:18550760
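Because CAA and CAG both encode glutamine, a mixed CAA-CAG repeat yields an uninterrupted polyglutamine tract even though the pure CAG run at the DNA level is shorter. The minimal sketch below illustrates that distinction. The function name and the example sequence are hypothetical (it is not the BACHD transgene sequence), and the reading frame is assumed to start at the first nucleotide.

```python
GLUTAMINE_CODONS = {"CAA", "CAG"}  # both codons encode glutamine (Q)

def longest_codon_run(dna, codons):
    """Length, in codons, of the longest in-frame run of the given codons.

    The reading frame is assumed to start at index 0; any trailing
    nucleotides that do not form a full codon are ignored.
    """
    best = run = 0
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3].upper() in codons:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best

# Hypothetical sequence: ATG, then CAG CAG CAA CAG CAG CAG, then CCG.
hypothetical_seq = "ATG" + "CAG" * 2 + "CAA" + "CAG" * 3 + "CCG"

polyq_length = longest_codon_run(hypothetical_seq, GLUTAMINE_CODONS)
pure_cag_length = longest_codon_run(hypothetical_seq, {"CAG"})
print(f"polyQ tract: {polyq_length} residues; longest pure CAG run: {pure_cag_length} codons")
```

In this toy sequence the encoded glutamine tract spans six residues while the longest pure CAG run is three codons, since the single CAA interrupts the CAG repeat without interrupting the polyglutamine tract.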
Comparative characterization of short monomeric polyglutamine peptides by replica exchange molecular dynamics simulation.

PubMed

Nakano, Miki; Watanabe, Hirofumi; Rothstein, Stuart M; Tanaka, Shigenori

2010-05-27

Polyglutamine (polyQ) diseases are caused by an abnormal expansion of CAG repeats. While their detailed structure remains unclear, polyQ peptides assume beta-sheet structures when they aggregate. To investigate the conformational ensemble of short, monomeric polyQ peptides, which consist of 15 glutamine residues (Q(15)), we performed replica exchange molecular dynamics (REMD) simulations. We found that Q(15) can assume multiple configurations due to all of the residues affecting the formation of side-chain hydrogen bonds. Analysis of the free energy landscape reveals that Q(15) has a basin for random-coil structures and another for alpha-helix or beta-turn structures. To investigate properties of aggregated polyQ peptides, we performed multiple molecular dynamics (MMD) simulations for monomeric and oligomeric Q(15). MMD revealed that the formation of oligomers stabilizes the beta-turn structure by increasing the number of hydrogen bonds between the main chains.
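For readers unfamiliar with REMD, its core step is a Metropolis-style exchange between replicas run at neighbouring temperatures. The sketch below shows the standard acceptance rule, min(1, exp[(β_i − β_j)(E_i − E_j)]). It is a generic illustration, not the authors' simulation code, and the energies and temperatures are made-up values.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(e_i: float, e_j: float, t_i: float, t_j: float) -> float:
    """Metropolis acceptance probability for exchanging two replicas.

    e_i, e_j: potential energies (kcal/mol) of the replicas currently
              held at temperatures t_i and t_j (kelvin).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    if delta >= 0:
        return 1.0  # exchange always accepted; also avoids overflow in exp
    return math.exp(delta)


# Illustrative values only: the colder replica already has the lower energy,
# so the exchange is accepted only some of the time.
p = swap_probability(e_i=-120.0, e_j=-100.0, t_i=300.0, t_j=310.0)
print(f"acceptance probability: {p:.3f}")
```

When the colder replica holds the higher-energy configuration, the exchange is always accepted, which is how low-energy conformations migrate toward low temperatures.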
Comparative analysis of anti-polyglutamine Fab crystals grown on Earth and in microgravity.

PubMed

Owens, Gwen E; New, Danielle M; Olvera, Alejandra I; Manzella, Julia Ashlyn; Macon, Brittney L; Dunn, Joshua C; Cooper, David A; Rouleau, Robyn L; Connor, Daniel S; Bjorkman, Pamela J

2016-10-01

Huntington's disease is one of nine neurodegenerative diseases caused by a polyglutamine (polyQ)-repeat expansion. An anti-polyQ antigen-binding fragment, MW1 Fab, was crystallized both on Earth and on the International Space Station, a microgravity environment where convection is limited. Once the crystals returned to Earth, the number, size and morphology of all crystals were recorded, and X-ray data were collected from representative crystals. The results generally agreed with previous microgravity crystallization studies. On average, microgravity-grown crystals were 20% larger than control crystals grown on Earth, and microgravity-grown crystals had a slightly improved mosaicity (decreased by 0.03°) and diffraction resolution (decreased by 0.2 Å) compared with control crystals grown on Earth. However, the highest resolution and lowest mosaicity crystals were formed on Earth, and the highest-quality crystal overall was formed on Earth after return from microgravity.
Comparative analysis of anti-polyglutamine Fab crystals grown on Earth and in microgravity

PubMed Central

Owens, Gwen E.; New, Danielle M.; Olvera, Alejandra I.; Manzella, Julia Ashlyn; Macon, Brittney L.; Dunn, Joshua C.; Cooper, David A.; Rouleau, Robyn L.; Connor, Daniel S.; Bjorkman, Pamela J.

2016-01-01

Huntington’s disease is one of nine neurodegenerative diseases caused by a polyglutamine (polyQ)-repeat expansion. An anti-polyQ antigen-binding fragment, MW1 Fab, was crystallized both on Earth and on the International Space Station, a microgravity environment where convection is limited. Once the crystals returned to Earth, the number, size and morphology of all crystals were recorded, and X-ray data were collected from representative crystals. The results generally agreed with previous microgravity crystallization studies. On average, microgravity-grown crystals were 20% larger than control crystals grown on Earth, and microgravity-grown crystals had a slightly improved mosaicity (decreased by 0.03°) and diffraction resolution (decreased by 0.2 Å) compared with control crystals grown on Earth. However, the highest resolution and lowest mosaicity crystals were formed on Earth, and the highest-quality crystal overall was formed on Earth after return from microgravity. PMID:27710941
Phosphorodiamidate morpholino oligomers suppress mutant huntingtin expression and attenuate neurotoxicity

PubMed Central

Sun, Xin; Marque, Leonard O.; Cordner, Zachary; Pruitt, Jennifer L.; Bhat, Manik; Li, Pan P.; Kannan, Geetha; Ladenheim, Ellen E.; Moran, Timothy H.; Margolis, Russell L.; Rudnicki, Dobrila D.

2014-01-01

Huntington's disease (HD) is a neurodegenerative disorder caused by a CAG trinucleotide repeat expansion in the huntingtin (HTT) gene. Disease pathogenesis derives, at least in part, from the long polyglutamine tract encoded by mutant HTT. Therefore, considerable effort has been dedicated to the development of therapeutic strategies that significantly reduce the expression of the mutant HTT protein. Antisense oligonucleotides (ASOs) targeted to the CAG repeat region of HTT transcripts have been of particular interest due to their potential capacity to discriminate between normal and mutant HTT transcripts. Here, we focus on phosphorodiamidate morpholino oligomers (PMOs), ASOs that are especially stable, highly soluble and non-toxic. We designed three PMOs to selectively target expanded CAG repeat tracts (CTG22, CTG25 and CTG28), and two PMOs to selectively target sequences flanking the HTT CAG repeat (HTTex1a and HTTex1b). In HD patient–derived fibroblasts with expanded alleles containing 44, 77 or 109 CAG repeats, HTTex1a and HTTex1b were effective in suppressing the expression of mutant and non-mutant transcripts. CTGn PMOs also suppressed HTT expression, with the extent of suppression and the specificity for mutant transcripts dependent on the length of the targeted CAG repeat and on the CTG repeat length and concentration of the PMO. PMO CTG25 reduced HTT-induced cytotoxicity in vitro and suppressed mutant HTT expression in vivo in the N171-82Q transgenic mouse model. Finally, CTG28 reduced mutant HTT expression and improved the phenotype of HdhQ7/Q150 knock-in HD mice. These data demonstrate the potential of PMOs as an approach to suppressing the expression of mutant HTT. PMID:25035419
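The CTGn naming reflects base pairing: an oligomer that hybridizes to a CAG repeat must itself carry the reverse-complementary CTG repeat. A minimal sketch, using the DNA alphabet for simplicity and a hypothetical seven-repeat target rather than the actual PMO sequences:

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Hypothetical target: seven CAG repeats (length chosen for illustration).
target = "CAG" * 7
antisense = reverse_complement(target)
print(antisense)
```

Because every (CAG)n tract has (CTG)n as its reverse complement, the same oligomer can in principle pair at many registers along a long repeat, which fits the length dependence reported in the abstract.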
A Variable Polyglutamine Repeat Affects Subcellular Localization and Regulatory Activity of a Populus ANGUSTIFOLIA Protein.

PubMed

Bryan, Anthony C; Zhang, Jin; Guo, Jianjun; Ranjan, Priya; Singan, Vasanth; Barry, Kerrie; Schmutz, Jeremy; Weighill, Deborah; Jacobson, Daniel; Jawdy, Sara; Tuskan, Gerald A; Chen, Jin-Gui; Muchero, Wellington

2018-06-08

Polyglutamine (polyQ) stretches have been reported to occur in proteins across many organisms including animals, fungi and plants. Expansion of these repeats has attracted much attention due to their associations with numerous human diseases including Huntington's and other neurological maladies. This suggests that the relative length of polyQ stretches is an important modulator of their function. Here, we report the identification of a Populus C-terminus binding protein (CtBP) ANGUSTIFOLIA (PtAN1) which contains a polyQ stretch whose functional relevance had not been established. Analysis of 917 resequenced Populus trichocarpa genotypes revealed three allelic variants at this locus encoding 11, 13 and 15 glutamine residues. Transient expression assays using Populus leaf mesophyll protoplasts revealed that the 11Q variant exhibited strong nuclear localization whereas the 15Q variant was only found in the cytosol, with the 13Q variant exhibiting localization in both subcellular compartments. We assessed functional implications by evaluating expression changes of putative PtAN1 targets in response to overexpression of the three allelic variants and observed allele-specific differences in expression levels of putative targets. Our results provide evidence that variation in polyQ length modulates PtAN1 function by altering subcellular localization. Copyright © 2018, G3: Genes, Genomes, Genetics.
Allelic variations of α-gliadin genes from species of Aegilops section Sitopsis and insights into evolution of α-gliadin multigene family among Triticum and Aegilops.

PubMed

Huang, Zhuo; Long, Hai; Wei, Yu-Ming; Yan, Ze-Hong; Zheng, You-Liang

2016-04-01

The α-gliadins account for 15-30% of the total storage protein in wheat endosperm and play important roles in dough extensibility and nutritional quality. On the other hand, they are a main source of the toxic peptides that trigger celiac disease. In this study, 37 α-gliadins were isolated from three species of Aegilops section Sitopsis. Sequence similarity and phylogenetic analyses revealed novel allelic variation at Gli-2 loci of species of Sitopsis and regular organization of motifs in their repetitive domain. Based on comprehensive analyses of a large number of known sequences of bread wheat and its diploid genome progenitors, the distributions of four T cell epitopes and length variations of two polyglutamine domains were analyzed. Additionally, according to the organization of repeat motifs, we classified the α-gliadins of Triticum and Aegilops into eight types. Their most recent common ancestor and putative divergence patterns were further considered. This study provides new insights into the allelic variations of α-gliadins in Aegilops section Sitopsis, as well as the evolution of the α-gliadin multigene family among Triticum and Aegilops species.
Novel polyglutamine model uncouples proteotoxicity from aging.

PubMed

Christie, Nakeirah T M; Lee, Amy L; Fay, Hannah G; Gray, Amelia A; Kikis, Elise A

2014-01-01

Polyglutamine expansions in certain proteins are the genetic determinants for nine distinct progressive neurodegenerative disorders and resultant age-related dementia. In these cases, neurodegeneration is due to the aggregation propensity and resultant toxic properties of the polyglutamine-containing proteins. We are interested in elucidating the underlying mechanisms of toxicity of the protein ataxin-3, in which a polyglutamine expansion is the genetic determinant for Machado-Joseph Disease (MJD), also referred to as spinocerebellar ataxia 3 (SCA3). To this end, we have developed a novel model for ataxin-3 protein aggregation, by expressing a disease-related polyglutamine-containing fragment of ataxin-3 in the genetically tractable body wall muscle cells of the model system C. elegans. Here, we demonstrate that this ataxin-3 fragment aggregates in a polyQ length-dependent manner in C. elegans muscle cells and that this aggregation is associated with cellular dysfunction. Surprisingly, however, this aggregation and the resultant toxicity were not influenced by aging. This is in contrast to polyglutamine peptides alone, whose aggregation and toxicity are highly dependent on age. Thus, the data presented here not only describe a new polyglutamine model, but also suggest that protein context likely influences the cellular interactions of the polyglutamine-containing protein and thereby modulates its toxic properties.
Role of tissue transglutaminase type 2 in calbindin-D28k interaction with ataxin-1

PubMed Central

Vig, P.J.S.; Wei, J.; Shao, Q.; Hebert, M.D.; Subramony, S.H.; Sutton, L.T.

2007-01-01

Spinocerebellar ataxia-1 (SCA1) is caused by the expansion of a polyglutamine repeat within the disease protein, ataxin-1. The mutant ataxin-1 precipitates as large intranuclear aggregates in the affected neurons. These aggregates may protect neurons from mutant protein and/or trigger neuronal degeneration by encouraging recruitment of other essential proteins. Our previous studies have shown that the calcium-binding protein calbindin-D28k (CaB), which is associated with SCA1 pathogenesis, is recruited to ataxin-1 aggregates in Purkinje cells of SCA1 mice. Since our recent findings suggest that tissue transglutaminase 2 (TG2) may be involved in cross-linking and aggregation of ataxin-1, the present study was initiated to determine if TG2 has any role in CaB-ataxin-1 interaction. Guinea pig TG2 covalently cross-linked purified rat brain CaB. A time-dependent progressive increase in aggregation produced large multimers, which stayed on top of the gel. CaB interaction with ataxin-1 was studied using HeLa cell lysates expressing GFP and GFP-tagged ataxin-1 with normal and expanded polyglutamine repeats (Q2, Q30 and Q82). The reaction products were analyzed by Western blots using anti-polyglutamine, CaB or GFP antibodies. CaB interacted with ataxin-1 independent of TG2, as the protein-protein cross-linker DSS stabilized the CaB-ataxin-1 complex. TG2 cross-linked CaB preferentially with Q82 ataxin-1. The cross-linking was inhibited with EGTA or the TG2 inhibitor cystamine. The present data indicate that CaB may be a TG2 substrate. In addition, aggregates of mutant ataxin-1 may recruit CaB via TG2-mediated covalent cross-linking, further supporting the argument that ataxin-1 aggregates may be toxic to neurons. PMID:17442486
Differential contributions of Caenorhabditis elegans histone deacetylases to huntingtin polyglutamine toxicity.

PubMed

Bates, Emily A; Victor, Martin; Jones, Adriana K; Shi, Yang; Hart, Anne C

2006-03-08

Expansion of a polyglutamine tract in the huntingtin protein causes neuronal degeneration and death in Huntington's disease patients, but the molecular mechanisms underlying polyglutamine-mediated cell death remain unclear. Previous studies suggest that expanded polyglutamine tracts alter transcription by sequestering glutamine rich transcriptional regulatory proteins, thereby perturbing their function. We tested this hypothesis in Caenorhabditis elegans neurons expressing a human huntingtin fragment with an expanded polyglutamine tract (Htn-Q150). Loss of function alleles and RNA interference (RNAi) were used to examine contributions of C. elegans cAMP response element-binding protein (CREB), CREB binding protein (CBP), and histone deacetylases (HDACs) to polyglutamine-induced neurodegeneration. Deletion of CREB (crh-1) or loss of one copy of CBP (cbp-1) enhanced polyglutamine toxicity in C. elegans neurons. Loss of function alleles and RNAi were then used to systematically reduce function of each C. elegans HDAC. Generally, knockdown of individual C. elegans HDACs enhanced Htn-Q150 toxicity, but knockdown of C. elegans hda-3 suppressed toxicity. Neuronal expression of hda-3 restored Htn-Q150 toxicity and suggested that C. elegans HDAC3 (HDA-3) acts within neurons to promote degeneration in response to Htn-Q150. Genetic epistasis experiments suggested that HDA-3 and CRH-1 (C. elegans CREB homolog) directly oppose each other in regulating transcription of genes involved in polyglutamine toxicity. hda-3 loss of function failed to suppress increased neurodegeneration in hda-1/+;Htn-Q150 animals, indicating that HDA-1 and HDA-3 have different targets with opposing effects on polyglutamine toxicity. Our results suggest that polyglutamine expansions perturb transcription of CREB/CBP targets and that specific targeting of HDACs will be useful in reducing associated neurodegeneration.
Control of Huntington's Disease-Associated Phenotypes by the Striatum-Enriched Transcription Factor Foxp2.

PubMed

Hachigian, Lea J; Carmona, Vitor; Fenster, Robert J; Kulicke, Ruth; Heilbut, Adrian; Sittler, Annie; Pereira de Almeida, Luís; Mesirov, Jill P; Gao, Fan; Kolaczyk, Eric D; Heiman, Myriam

2017-12-05

Alteration of corticostriatal glutamatergic function is an early pathophysiological change associated with Huntington's disease (HD). The factors that regulate the maintenance of corticostriatal glutamatergic synapses post-developmentally are not well understood. Recently, the striatum-enriched transcription factor Foxp2 was implicated in the development of these synapses. Here, we show that, in mice, overexpression of Foxp2 in the adult striatum of two models of HD leads to rescue of HD-associated behaviors, while knockdown of Foxp2 in wild-type mice leads to development of HD-associated behaviors. We note that Foxp2 encodes the longest polyglutamine repeat protein in the human reference genome, and we show that it can be sequestered into aggregates with polyglutamine-expanded mutant Huntingtin protein (mHTT). Foxp2 overexpression in HD model mice leads to altered expression of several genes associated with synaptic function, genes that present additional targets for normalization of corticostriatal dysfunction in HD. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Androgen receptor polyglutamine repeat length affects receptor activity and C2C12 cell development.

PubMed

Sheppard, Ryan L; Spangenburg, Espen E; Chin, Eva R; Roth, Stephen M

2011-10-20

Testosterone (T) has an anabolic effect on skeletal muscle and is believed to exert its local effects via the androgen receptor (AR). The AR harbors a polymorphic stretch of glutamine repeats demonstrated to inversely affect receptor transcriptional activity in prostate and kidney cells. The effects of AR glutamine repeat length on skeletal muscle are unknown. In this study we examined the effect of AR CAG repeat length on AR function in C2C12 cells. AR expression vectors harboring 14, 24, and 33 CAG repeats were used to assess AR transcriptional activity. C2C12 cell proliferation, differentiation, gene expression, myotube formation, and myonuclear fusion index were assessed. Transcriptional activity increased with increasing repeat length and in response to testosterone (AR14 = 3.91 ± 0.26, AR24 = 25.21 ± 1.72, AR33 = 36.08 ± 3.22 relative light units; P < 0.001). Ligand activation was increased for AR33 (2.10 ± 0.04) compared with AR14 (1.54 ± 0.09) and AR24 (1.57 ± 0.05, P < 0.001). AR mRNA expression was elevated in each stably transfected line. AR33 cell proliferation (20,512.3 ± 1,024.0) was decreased vs. AR14 (27,604.17 ± 1,425.3; P < 0.001) after 72 h. Decreased CK activity in AR14 cells (54.9 ± 2.9 units/μg protein) in comparison to AR33 (70.8 ± 8.1) (P < 0.05) was noted. The myonuclear fusion index was lower for AR14 (15.21 ± 3.24%) and AR33 (9.97 ± 3.14%) in comparison to WT (35.07 ± 5.60%, P < 0.001). AR14 and AR33 cells also displayed atypical myotube morphology. RT-PCR revealed genotype differences in myostatin and myogenin expression. We conclude that AR polyglutamine repeat length is directly associated with transcriptional activity and alters the growth and development of C2C12 cells. This polymorphism may contribute to the heritability of muscle mass in humans.
Polyglutamine length-dependent toxicity from α1ACT in Drosophila models of spinocerebellar ataxia type 6

PubMed Central

Tsou, Wei-Ling; Qiblawi, Sultan H.; Hosking, Ryan R.; Gomez, Christopher M.

2016-01-01

ABSTRACT Spinocerebellar ataxia type 6 (SCA6) is a neurodegenerative disease that results from abnormal expansion of a polyglutamine (polyQ) repeat. SCA6 is caused by CAG triplet repeat expansion in the gene CACNA1A, resulting in a polyQ tract of 19-33 in patients. CACNA1A, a bicistronic gene, encodes the α1A calcium channel subunit and the transcription factor, α1ACT. PolyQ expansion in α1ACT causes degeneration in mice. We recently described the first Drosophila models of SCA6 that express α1ACT with a normal (11Q) or hyper-expanded (70Q) polyQ. Here, we report additional α1ACT transgenic flies, which express full-length α1ACT with a 33Q repeat. We show that α1ACT33Q is toxic in Drosophila, but less so than the 70Q version. When expressed everywhere, α1ACT33Q-expressing adults die earlier than flies expressing the normal allele. α1ACT33Q causes retinal degeneration and leads to aggregated species in an age-dependent manner, but at a slower pace than the 70Q counterpart. According to western blots, α1ACT33Q localizes less readily in the nucleus than α1ACT70Q, providing clues into the importance of polyQ tract length on α1ACT localization and its site of toxicity. We expect that these new lines will be highly valuable for future work on SCA6. PMID:27979829
A structural model of polyglutamine determined from a host-guest method combining experiments and landscape theory.

PubMed

Finke, John M; Cheung, Margaret S; Onuchic, José N

2004-09-01

Modeling the structure of natively disordered peptides has proved difficult due to the lack of structural information on these peptides. In this work, we use a novel application of the host-guest method, combining folding theory with experiments, to model the structure of natively disordered polyglutamine peptides. Initially, a minimalist molecular model (C(alpha)C(beta)) of CI2 is developed with a structurally based potential and captures many of the folding properties of CI2 determined from experiments. Next, polyglutamine "guest" inserts of increasing length are introduced into the CI2 "host" model and the polyglutamine is modeled to match the resultant change in CI2 thermodynamic stability between simulations and experiments. The polyglutamine model that best mimics the experimental changes in CI2 thermodynamic stability has (1) a beta-strand dihedral preference and (2) an attractive energy between polyglutamine atoms 0.75 times the attractive energy between the CI2 host Go-contacts. When free-energy differences in the CI2 host-guest system are correctly modeled at varying lengths of polyglutamine guest inserts, the kinetic folding rates and structural perturbation of these CI2 insert mutants are also correctly captured in simulations without any additional parameter adjustment. In agreement with experiments, the residues showing structural perturbation are located in the immediate vicinity of the loop insert. The simulated polyglutamine loop insert predominantly adopts extended random coil conformations, a structural model consistent with low-resolution experimental methods. The agreement between simulation and experiment for CI2 folding rates, CI2 structural perturbation, and polyglutamine insert structure shows that this host-guest method can select a physically realistic model for inserted polyglutamine. If other amyloid peptides can be inserted into stable protein hosts and the stabilities of these host-guest mutants determined, this novel host-guest method may prove useful to determine structural preferences of these intractable but biologically relevant protein fragments.
ATXN2 with intermediate-length CAG/CAA repeats does not seem to be a risk factor in hereditary spastic paraplegia.

PubMed

Nielsen, Troels Tolstrup; Svenstrup, Kirsten; Budtz-Jørgensen, Esben; Eiberg, Hans; Hasholt, Lis; Nielsen, Jørgen E

2012-10-15

Hereditary spastic paraplegia (HSP) comprises a group of heterogeneous neurodegenerative disorders characterized by progressive spasticity and lower limb weakness. Age of onset is highly variable even in familial cases with known mutations, suggesting that the disease is modulated by other, as yet unknown, parameters. Although progressive gait disturbances, lower limb spasticity and extensor plantar responses are hallmarks of HSP, these characteristics are also found in other neurodegenerative disorders, e.g. amyotrophic lateral sclerosis (ALS). HSP has been linked to ALS and frontotemporal degeneration with motor neuron disease (FTD-MND), since TDP-43 positive inclusions have recently been found in an HSP subtype, and TDP-43 is found in abundance in pathological inclusions of both ALS and FTD-MND. Furthermore, ataxin-2 (encoded by the gene ATXN2), a polyglutamine-containing protein elongated in spinocerebellar ataxia type 2, has been shown to be a modulator of TDP-43 induced toxicity in ALS animal and cell models. Finally, it has been shown that ATXN2 with non-pathogenic intermediate-length CAG/CAA repeat elongations (encoding the polyglutamine tract) is a genetic risk factor of ALS. Considering the similarities in the disease phenotype and the neuropathological link between ALS and HSP, we hypothesized that intermediate-length CAG/CAA repeats in ATXN2 could be a modulator of HSP. We show that in a cohort of 181 HSP patients, 4.9% of the patients had intermediate-length CAG/CAA repeats in ATXN2, which was not significantly different from the frequencies in a Danish control cohort or in American and European control populations. However, the mean age of onset was significantly lower in HSP patients with intermediate-length CAG/CAA repeats in ATXN2 compared to patients with normal-length repeats. Based on these results we conclude that ATXN2 is most likely not a risk factor of HSP, whereas it might serve as a modulator of age of onset. Copyright © 2012 Elsevier B.V. All rights reserved.
A panel study on patients with dominant cerebellar ataxia highlights the frequency of channelopathies.

PubMed

Coutelier, Marie; Coarelli, Giulia; Monin, Marie-Lorraine; Konop, Juliette; Davoine, Claire-Sophie; Tesson, Christelle; Valter, Rémi; Anheim, Mathieu; Behin, Anthony; Castelnovo, Giovanni; Charles, Perrine; David, Albert; Ewenczyk, Claire; Fradin, Mélanie; Goizet, Cyril; Hannequin, Didier; Labauge, Pierre; Riant, Florence; Sarda, Pierre; Sznajer, Yves; Tison, François; Ullmann, Urielle; Van Maldergem, Lionel; Mochel, Fanny; Brice, Alexis; Stevanin, Giovanni; Durr, Alexandra

2017-06-01

Autosomal dominant cerebellar ataxias have a markedly heterogeneous genetic background, with mutations in 34 genes identified so far. This large number of implicated genes accounts for heterogeneous clinical presentations, making genotype-phenotype correlations a major challenge in the field. While polyglutamine ataxias, linked to CAG repeat expansions in genes such as ATXN1, ATXN2, ATXN3, ATXN7, CACNA1A and TBP, have been extensively characterized in large cohorts, there is a need for comprehensive assessment of frequency and phenotype of more 'conventional' ataxias. After exclusion of CAG/polyglutamine expansions in spinocerebellar ataxia genes in 412 index cases with dominantly inherited cerebellar ataxias, we aimed to establish the relative frequencies of mutations in other genes, with an approach combining panel sequencing and TaqMan® polymerase chain reaction assay. We found relevant genetic variants in 59 patients (14.3%). The most frequently mutated were channel genes [CACNA1A (n = 16), KCND3 (n = 4), KCNC3 (n = 2) and KCNA1 (n = 2)]. Deletions in ITPR1 (n = 11) were followed by biallelic variants in SPG7 (n = 9). Variants in AFG3L2 (n = 7) came next in frequency, and variants were rarely found in SPTBN2 (n = 2), ELOVL5, FGF14, STUB1 and TTBK2 (n = 1 each). Interestingly, possible risk factor variants were detected in SPG7 and POLG. Clinical comparisons showed that ataxias due to channelopathies had a significantly earlier age at onset with an average of 24.6 years, versus 40.9 years for polyglutamine expansion spinocerebellar ataxias and 37.8 years for SPG7-related forms (P = 0.001). In contrast, disease duration was significantly longer in the former (20.5 years versus 9.3 and 13.7, P = 0.001), though for similar functional stages, indicating slower progression of the disease. Of interest, intellectual deficiency was more frequent in channel spinocerebellar ataxias, while cognitive impairment in adulthood was similar among the three groups. Similar differences were found within a single gene group, comparing 23 patients with CACNA1A expansions (spinocerebellar ataxia 6) to 22 patients with CACNA1A point mutations, which had lower average age at onset (25.2 versus 47.3 years) with longer disease duration (18.7 versus 10.9), but lower severity indexes (0.39 versus 0.44), indicating slower progression of the disease. In conclusion, we identified relevant genetic variations in up to 15% of cases after exclusion of polyglutamine expansion spinocerebellar ataxias, and confirmed CACNA1A and SPG7 as major ataxia genes. We could delineate firm genotype-phenotype correlations that are important for genetic counselling and of possible prognostic value. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Prefoldin Protects Neuronal Cells from Polyglutamine Toxicity by Preventing Aggregation Formation

PubMed Central

Tashiro, Erika; Zako, Tamotsu; Muto, Hideki; Itoo, Yoshinori; Sörgjerd, Karin; Terada, Naofumi; Abe, Akira; Miyazawa, Makoto; Kitamura, Akira; Kitaura, Hirotake; Kubota, Hiroshi; Maeda, Mizuo; Momoi, Takashi; Iguchi-Ariga, Sanae M. M.; Kinjo, Masataka; Ariga, Hiroyoshi

2013-01-01

Huntington disease is caused by cell death after the expansion of polyglutamine (polyQ) tracts longer than ∼40 repeats encoded by exon 1 of the huntingtin (HTT) gene. Prefoldin is a molecular chaperone composed of six subunits, PFD1–6, and prevents misfolding of newly synthesized nascent polypeptides. In this study, we found that knockdown of PFD2 and PFD5 disrupted prefoldin formation in HTT-expressing cells, resulting in accumulation of aggregates of a pathogenic form of HTT and in induction of cell death. Dead cells, however, did not contain inclusions of HTT, and analysis by fluorescence correlation spectroscopy indicated that knockdown of PFD2 and PFD5 also increased the size of soluble oligomers of pathogenic HTT in cells. In vitro single-molecule observation demonstrated that prefoldin suppressed HTT aggregation at the small oligomer (dimer to tetramer) stage. These results indicate that prefoldin inhibits elongation of large oligomers of pathogenic HTT, thereby inhibiting subsequent inclusion formation, and suggest that soluble oligomers of polyQ-expanded HTT are more toxic to cells than are inclusions. PMID:23720755
The S/T-Rich Motif in the DNAJB6 Chaperone Delays Polyglutamine Aggregation and the Onset of Disease in a Mouse Model.

PubMed

Kakkar, Vaishali; Månsson, Cecilia; de Mattos, Eduardo P; Bergink, Steven; van der Zwaag, Marianne; van Waarde, Maria A W H; Kloosterhuis, Niels J; Melki, Ronald; van Cruchten, Remco T P; Al-Karadaghi, Salam; Arosio, Paolo; Dobson, Christopher M; Knowles, Tuomas P J; Bates, Gillian P; van Deursen, Jan M; Linse, Sara; van de Sluis, Bart; Emanuelsson, Cecilia; Kampinga, Harm H

2016-04-21

Expanded CAG repeats lead to debilitating neurodegenerative disorders characterized by aggregation of proteins with expanded polyglutamine (polyQ) tracts. The mechanism of aggregation involves primary and secondary nucleation steps. We show how a noncanonical member of the DNAJ-chaperone family, DNAJB6, inhibits the conversion of soluble polyQ peptides into amyloid fibrils, in particular by suppressing primary nucleation. This inhibition is mediated by a serine/threonine-rich region that provides an array of surface-exposed hydroxyl groups that bind to polyQ peptides and may disrupt the formation of the H bonds essential for the stability of amyloid fibrils. Early prevention of polyQ aggregation by DNAJB6 occurs also in cells and leads to delayed neurite retraction even before aggregates are visible. In a mouse model, brain-specific coexpression of DNAJB6 delays polyQ aggregation, relieves symptoms, and prolongs lifespan, pointing to DNAJB6 as a potential target for disease therapy and tool for unraveling early events in the onset of polyQ diseases. Copyright © 2016 Elsevier Inc. All rights reserved.
Identification of benzothiazoles as potential polyglutamine aggregation inhibitors of Huntington's disease by using an automated filter retardation assay

PubMed Central

Heiser, Volker; Engemann, Sabine; Bröcker, Wolfgang; Dunkel, Ilona; Boeddrich, Annett; Waelter, Stephanie; Nordhoff, Eddi; Lurz, Rudi; Schugardt, Nancy; Rautenberg, Susanne; Herhaus, Christian; Barnickel, Gerhard; Böttcher, Henning; Lehrach, Hans; Wanker, Erich E.

2002-01-01

Preventing the formation of insoluble polyglutamine containing protein aggregates in neurons may represent an attractive therapeutic strategy to ameliorate Huntington's disease (HD). Therefore, the ability to screen for small molecules that suppress the self-assembly of huntingtin would have potential clinical and significant research applications. We have developed an automated filter retardation assay for the rapid identification of chemical compounds that prevent HD exon 1 protein aggregation in vitro. Using this method, a total of 25 benzothiazole derivatives that inhibit huntingtin fibrillogenesis in a dose-dependent manner were discovered from a library of ≈184,000 small molecules. The results obtained by the filter assay were confirmed by immunoblotting, electron microscopy, and mass spectrometry. Furthermore, cell culture studies revealed that 2-amino-4,7-dimethyl-benzothiazol-6-ol, a chemical compound similar to riluzole, significantly inhibits HD exon 1 aggregation in vivo. These findings may provide the basis for a new therapeutic approach to prevent the accumulation of insoluble protein aggregates in Huntington's disease and related glutamine repeat disorders. PMID:12200548
Size analysis of polyglutamine protein aggregates using fluorescence detection in an analytical ultracentrifuge.

PubMed

Polling, Saskia; Hatters, Danny M; Mok, Yee-Foong

2013-01-01

Defining the aggregation process of proteins formed by poly-amino acid repeats in cells remains a challenging task due to a lack of robust techniques for their isolation and quantitation. Sedimentation velocity methodology using fluorescence detected analytical ultracentrifugation is one approach that can offer significant insight into aggregation formation and kinetics. While this technique has traditionally been used with purified proteins, it is now possible for substantial information to be collected with studies using cell lysates expressing a GFP-tagged protein of interest. In this chapter, we describe protocols for sample preparation and setting up the fluorescence detection system in an analytical ultracentrifuge to perform sedimentation velocity experiments on cell lysates containing aggregates formed by poly-amino acid repeat proteins.

Linkage disequilibrium at the SCA2 locus

PubMed Central

Didierjean, O.; Cancel, G.; Stevanin, G.; Durr, A.; Burk, K.; Benomar, A.; Lezin, A.; Belal, S.; Abada-Bendid, M.; Klockgether, T.; Brice, A.

1999-01-01

Spinocerebellar ataxia type 2 (SCA2) is caused by the expansion of an unstable CAG repeat encoding a polyglutamine tract. Repeats with 32 to 200 CAGs are associated with the disease, whereas normal chromosomes contain 13 to 33 repeats. We tested 220 families of different geographical origins for the SCA2 mutation. Thirty-three were positive (15%). Twenty-three families with at least two affected subjects were tested for linkage disequilibrium (LD) between the SCA2 mutation and three microsatellite markers, two of which (D12S1332-D12S1333) closely flanked the mutation; the other (D12S1672) was intragenic. Many different haplotypes were observed, indicating the occurrence of several ancestral mutations. However, the same haplotype, not observed in controls, was detected in the German, the Serbian, and some of the French families, suggesting a founder effect or recurrent mutations on an at-risk haplotype. Keywords: linkage disequilibrium; SCA2; trinucleotide repeat expansion; founder effect PMID:10353790
Evidence for sequestration of polyglutamine inclusions by Drosophila myeloid leukemia factor.

PubMed

Kim, Woo-Yang; Fayazi, Zahra; Bao, Xiankun; Higgins, Dennis; Kazemi-Esfarjani, Parsa

2005-08-01

Intracellular inclusions of abnormally long polyglutamine tracts and neurotoxicity are the hallmarks of several hereditary neurodegenerative disorders, including Huntington's disease (HD). In Drosophila melanogaster, dMLF, an ortholog of human myeloid leukemia factors, hMLF1 and hMLF2, suppressed polyglutamine toxicity and colocalized with the inclusions. In transfected primary rat neuronal cultures, dMLF and its orthologs reduced the morphological phenotypes and inclusions. Furthermore, dMLF reduced the recruitment of CBP and Hsp70 into the inclusions, both of which are among many essential proteins apparently trapped in the inclusions. These data suggest that a possible mechanism of suppression by dMLF is via the sequestration of polyglutamine oligomers or inclusions.
Fibril polymorphism affects immobilized non-amyloid flanking domains of huntingtin exon1 rather than its polyglutamine core

PubMed Central

Lin, Hsiang-Kai; Boatz, Jennifer C.; Krabbendam, Inge E.; Kodali, Ravindra; Hou, Zhipeng; Wetzel, Ronald; Dolga, Amalia M.; Poirier, Michelle A.; van der Wel, Patrick C. A.

2017-01-01

Polyglutamine expansion in the huntingtin protein is the primary genetic cause of Huntington's disease (HD). Fragments coinciding with mutant huntingtin exon1 aggregate in vivo and induce HD-like pathology in mouse models. The resulting aggregates can have different structures that affect their biochemical behaviour and cytotoxic activity. Here we report our studies of the structure and functional characteristics of multiple mutant htt exon1 fibrils by complementary techniques, including infrared and solid-state NMR spectroscopies. Magic-angle-spinning NMR reveals that fibrillar exon1 has a partly mobile α-helix in its aggregation-accelerating N terminus, and semi-rigid polyproline II helices in the proline-rich flanking domain (PRD). The polyglutamine-proximal portions of these domains are immobilized and clustered, limiting access to aggregation-modulating antibodies. The polymorphic fibrils differ in their flanking domains rather than the polyglutamine amyloid structure. They are effective at seeding polyglutamine aggregation and exhibit cytotoxic effects when applied to neuronal cells. PMID:28537272
Spontaneous formation of polyglutamine nanotubes with molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Laghaei, Rozita; Mousseau, Normand

2010-04-01

Expansion of polyglutamine (polyQ) beyond the pathogenic threshold (35-40 Gln) is associated with several neurodegenerative diseases including Huntington's disease, several forms of spinocerebellar ataxias and spinobulbar muscular atrophy. To determine the structure of polyglutamine aggregates we perform replica-exchange molecular dynamics simulations coupled with the optimized potential for effective peptide forcefield. Using a range of temperatures from 250 to 700 K, we study the aggregation kinetics of the polyglutamine monomer and dimer with chain lengths from 30 to 50 residues. All monomers show a similar structural change at the same temperature from α-helical structure to random coil, without indication of any significant β-strand. For dimers, by contrast, starting from random structures, we observe spontaneous formation of antiparallel β-sheets and triangular and circular β-helical structures for polyglutamine with 40 residues in a 400 ns 50 temperature replica-exchange molecular dynamics simulation (total integrated time 20 μs). This ~32 Å diameter structure reorganizes further into a tight antiparallel double-stranded ~22 Å nanotube with 22 residues per turn close to Perutz' model for amyloid fibers as water-filled nanotubes. This diversity of structures suggests the existence of polymorphism for polyglutamine with possibly different pathways leading to the formation of toxic oligomers and to fibrils.
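The total integrated time quoted in the abstract above follows from simple arithmetic: 50 temperature replicas, each run for 400 ns, give 50 × 400 ns = 20,000 ns = 20 μs. The short Python sketch below reproduces that figure and, purely for illustration, builds a temperature ladder spanning the stated 250 to 700 K range. The geometric spacing and the function names are assumptions made here; the abstract does not say how the 50 temperatures were distributed.

```python
def total_sampling_time_us(n_replicas: int, ns_per_replica: float) -> float:
    """Aggregate simulated time over all replicas, in microseconds."""
    return n_replicas * ns_per_replica / 1000.0


def geometric_ladder(t_min: float, t_max: float, n: int) -> list[float]:
    """Hypothetical temperature ladder with a constant ratio between neighbours.

    Geometric spacing is a common choice for replica exchange, but it is an
    assumption here, not something stated in the abstract.
    """
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio**i for i in range(n)]


# Values taken from the abstract: 50 replicas, 400 ns each, 250-700 K.
total_us = total_sampling_time_us(50, 400.0)
ladder = geometric_ladder(250.0, 700.0, 50)
print(f"total integrated time: {total_us} us")
print(f"lowest / highest temperature: {ladder[0]:.1f} K / {ladder[-1]:.1f} K")
```

Under the geometric-spacing assumption, neighbouring temperatures differ by a constant factor of about 1.02, which is why the ladder is denser at the cold end of the range.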
Oligonucleotide-based strategies to combat polyglutamine diseases

PubMed Central

Fiszer, Agnieszka; Krzyzosiak, Wlodzimierz J.

2014-01-01

Considerable advances have been recently made in understanding the molecular aspects of pathogenesis and in developing therapeutic approaches for polyglutamine (polyQ) diseases. Studies on pathogenic mechanisms have extended our knowledge of mutant protein toxicity, confirmed the toxicity of mutant transcript and identified other toxic RNA and protein entities. One very promising therapeutic strategy is targeting the causative gene expression with oligonucleotide (ON) based tools. This straightforward approach aimed at halting the early steps in the cascade of pathogenic events has been widely tested for Huntington's disease and spinocerebellar ataxia type 3. In this review, we gather information on the use of antisense oligonucleotides and RNA interference triggers for the experimental treatment of polyQ diseases in cellular and animal models. We present studies testing non-allele-selective and allele-selective gene silencing strategies. The latter include targeting SNP variants associated with mutations or targeting the pathologically expanded CAG repeat directly. We compare gene silencing effectors of various types in a number of aspects, including their design, efficiency in cell culture experiments and pre-clinical testing. We discuss advantages, current limitations and perspectives of various ON-based strategies used to treat polyQ diseases. PMID:24848018
The spinocerebellar ataxias: order emerges from chaos.

PubMed

Margolis, Russell L

2002-09-01

In the past decade, the genetic etiologies accounting for most cases of adult-onset dominant cerebellar ataxia have been discovered. This group of disorders, generally referred to as the spinocerebellar ataxias (SCAs), can now be classified by a simple genetic nosology, essentially a sequential list in which each new SCA is given a number. However, recent advances in the elucidation of SCA pathogenesis provide the opportunity to subclassify the disorders into three discrete groups based on pathogenesis: 1) the polyglutamine disorders, SCAs 1, 2, 3, 7, and 17, which result from proteins with toxic stretches of polyglutamine; 2) the channelopathies, SCA6 and episodic ataxia types 1 and 2 (EA1 and EA2), which result from disruption of calcium or potassium channel function; and 3) the gene expression disorders, SCAs 8, 10, and 12, which result from repeat expansions outside of coding regions that may quantitatively alter gene expression. SCAs 4, 5, 9, 11, 13-16, 19, 21, and 22 are of unknown etiology, and may or may not fit into one of these three groups. At present, most diagnostic and therapeutic strategies apply equally to all of the SCAs. Therapy specific for individual diseases or types of diseases is a realistic goal in the foreseeable future.
Modifiers and mechanisms of multi-system polyglutamine neurodegenerative disorders: lessons from fly models.

PubMed

Mallik, Moushami; Lakhotia, Subhash C

2010-12-01

Polyglutamine (polyQ) diseases, resulting from a dynamic expansion of glutamine repeats in a polypeptide, are a class of genetically inherited late onset neurodegenerative disorders which, despite expression of the mutated gene widely in brain and other tissues, affect defined subpopulations of neurons in a disease-specific manner. We briefly review the different polyQ-expansion-induced neurodegenerative disorders and the advantages of modelling them in Drosophila. Studies using the fly models have successfully identified a variety of genetic modifiers and have helped in understanding some of the molecular events that follow expression of the abnormal polyQ proteins. Expression of the mutant polyQ proteins causes, as a consequence of intra-cellular and inter-cellular networking, mis-regulation at multiple steps like transcriptional and posttranscriptional regulations, cell signalling, protein quality control systems (protein folding and degradation networks), axonal transport machinery etc., in the sensitive neurons, resulting ultimately in their death. The diversity of genetic modifiers of polyQ toxicity identified through extensive genetic screens in fly and other models clearly reflects a complex network effect of the presence of the mutated protein. Such network effects pose a major challenge for therapeutic applications.
Genome-wide RNA interference screen identifies previously undescribed regulators of polyglutamine aggregation

PubMed Central

Nollen, Ellen A. A.; Garcia, Susana M.; van Haaften, Gijs; Kim, Soojin; Chavez, Alejandro; Morimoto, Richard I.; Plasterk, Ronald H. A.

2004-01-01

Protein misfolding and the formation of aggregates are increasingly recognized components of the pathology of human genetic disease and hallmarks of many neurodegenerative disorders. As exemplified by polyglutamine diseases, the propensity for protein misfolding is associated with the length of polyglutamine expansions and age-dependent changes in protein-folding homeostasis, suggesting a critical role for a protein homeostatic buffer. To identify the complement of protein factors that protects cells against the formation of protein aggregates, we tested transgenic Caenorhabditis elegans strains expressing polyglutamine expansion yellow fluorescent protein fusion proteins at the threshold length associated with the age-dependent appearance of protein aggregation. We used genome-wide RNA interference to identify genes that, when suppressed, resulted in the premature appearance of protein aggregates. Our screen identified 186 genes corresponding to five principal classes of polyglutamine regulators: genes involved in RNA metabolism, protein synthesis, protein folding, and protein degradation; and those involved in protein trafficking. We propose that each of these classes represents a molecular machine collectively comprising the protein homeostatic buffer that responds to the expression of damaged proteins to prevent their misfolding and aggregation. PMID:15084750
The therapeutic potential of G-protein coupled receptors in Huntington's disease.

PubMed

Dowie, Megan J; Scotter, Emma L; Molinari, Emanuela; Glass, Michelle

2010-11-01

Huntington's disease is a late-onset autosomal dominant inherited neurodegenerative disease characterised by increased symptom severity over time and ultimately premature death. An expanded CAG repeat sequence in the huntingtin gene leads to a polyglutamine expansion in the expressed protein, resulting in complex dysfunctions including cellular excitotoxicity and transcriptional dysregulation. Symptoms include cognitive deficits, psychiatric changes and a movement disorder often referred to as Huntington's chorea, which involves characteristic involuntary dance-like writhing movements. Neuropathologically Huntington's disease is characterised by neuronal dysfunction and death in the striatum and cortex with an overall decrease in cerebral volume (Ho et al., 2001). Neuronal dysfunction begins prior to symptom presentation, and cells of particular vulnerability include the striatal medium spiny neurons. Huntington's is a devastating disease for patients and their families and there is currently no cure, or even an effective therapy for disease symptoms. G-protein coupled receptors are the most abundant receptor type in the central nervous system and are linked to complex downstream pathways, manipulation of which may have therapeutic application in many neurological diseases. This review will highlight the potential of G-protein coupled receptor drug targets as emerging therapies for Huntington's disease. Copyright © 2010 Elsevier Inc. All rights reserved.
ROCK and PRK-2 Mediate the Inhibitory Effect of Y-27632 on Polyglutamine Aggregation

PubMed Central

Shao, Jieya; Welch, William J.; Diamond, Marc I.

2009-01-01

Polyglutamine expansion in huntingtin (Htt) and the androgen receptor (AR) causes untreatable neurodegenerative diseases. Y-27632, a therapeutic lead, reduces Htt and AR aggregation in cultured cells, and Htt-induced neurodegeneration in Drosophila. Y-27632 inhibits both Rho-associated kinases ROCK and PRK-2, making its precise intracellular target uncertain. Over-expression of either kinase increases Htt and AR aggregation. Three ROCK inhibitors (Y-27632, H-1077, HA-1152), and a specific ROCK inhibitory peptide reduce polyglutamine protein aggregation, as does knockdown of ROCK or PRK-2 by RNAi. RNAi also indicates that each kinase is required for the inhibitory effects of Y-27632 to manifest fully. These two actin regulatory kinases are thus involved in polyglutamine aggregation, and their simultaneous inhibition may be an important therapeutic goal. PMID:18423405
Dysregulation of synaptic proteins, dendritic spine abnormalities and pathological plasticity of synapses as experience-dependent mediators of cognitive and psychiatric symptoms in Huntington's disease.

PubMed

Nithianantharajah, J; Hannan, A J

2013-10-22

Huntington's disease (HD) is an autosomal dominant tandem repeat expansion disorder involving cognitive, psychiatric and motor symptoms. The expanded trinucleotide (CAG) repeat leads to an extended polyglutamine tract in the huntingtin protein and a subsequent cascade of molecular and cellular pathogenesis. One of the key features of neuropathology, which has been shown to precede the eventual loss of neurons in the cerebral cortex, striatum and other areas, are changes to synapses, including the dendritic protrusions known as spines. In this review we will focus on synapse and spine pathology in HD, including molecular and experience-dependent aspects of pathogenesis. Dendritic spine pathology has been found in both the human HD brain at post mortem as well as various transgenic and knock-in animal models. These changes may help explain the symptoms in HD, and synaptopathy within the cerebral cortex may be particularly important in mediating the psychiatric and cognitive manifestations of this disease. The earliest stages of synaptic dysfunction in HD, as assayed in various mouse models, appear to involve changes in synaptic proteins and associated physiological abnormalities such as synaptic plasticity deficits. In mouse models, synaptic and cortical plasticity deficits have been directly correlated with the onset of cognitive deficits, implying a causal link. Furthermore, following the discovery that environmental enrichment can delay onset of affective, cognitive and motor deficits in HD transgenic mice, specific synaptic molecules shown to be dysregulated by the polyglutamine-induced toxicity were also found to be beneficially modulated by environmental stimulation. This identifies potential molecular targets for future therapeutic developments to treat this devastating disease. Copyright © 2012 IBRO. Published by Elsevier Ltd. All rights reserved.
Architecture of polyglutamine-containing fibrils from time-resolved fluorescence decay.

PubMed

Röthlein, Christoph; Miettinen, Markus S; Borwankar, Tejas; Bürger, Jörg; Mielke, Thorsten; Kumke, Michael U; Ignatova, Zoya

2014-09-26

The disease risk and age of onset of Huntington disease (HD) and nine other repeat disorders strongly depend on the expansion of CAG repeats encoding consecutive polyglutamines (polyQ) in the corresponding disease protein. PolyQ length-dependent misfolding and aggregation are the hallmarks of CAG pathologies. Despite intense effort, the overall structure of these aggregates remains poorly understood. Here, we used sensitive time-dependent fluorescent decay measurements to assess the architecture of mature fibrils of huntingtin (Htt) exon 1 implicated in HD pathology. Varying the position of the fluorescent labels in the Htt monomer with expanded 51Q (Htt51Q) and using structural models of putative fibril structures, we generated distance distributions between donors and acceptors covering all possible distances between the monomers or monomer dimensions within the polyQ amyloid fibril. Using Monte Carlo simulations, we systematically scanned all possible monomer conformations that fit the experimentally measured decay times. Monomers with four-stranded 51Q stretches organized into five-layered β-sheets with alternating N termini of the monomers perpendicular to the fibril axis gave the best fit to our data. Alternatively, the core structure of the polyQ fibrils might also be a zipper layer with antiparallel four-stranded stretches as this structure showed the next best fit. All other remaining arrangements are clearly excluded by the data. Furthermore, the assessed dimensions of the polyQ stretch of each monomer provide structural evidence for the observed polyQ length threshold in HD pathology. Our approach can be used to validate the effect of pharmacological substances that inhibit or alter amyloid growth and structure. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
Toxicity and aggregation of the polyglutamine disease protein, ataxin-3 is regulated by its binding to VCP/p97 in Drosophila melanogaster.

PubMed

Ristic, Gorica; Sutton, Joanna R; Libohova, Kozeta; Todi, Sokol V

2018-04-26

Among the nine dominantly inherited, age-dependent neurodegenerative diseases caused by abnormal expansion in the polyglutamine (polyQ) repeat of otherwise unrelated proteins is Spinocerebellar Ataxia Type 3 (SCA3). SCA3 is caused by polyQ expansion in the deubiquitinase (DUB), ataxin-3. Molecular sequelae related to SCA3 remain unclear. Here, we sought to understand the role of protein context in SCA3 by focusing on the interaction between this DUB and Valosin-Containing Protein (VCP). VCP is bound directly by ataxin-3 through an arginine-rich area preceding the polyQ repeat. We examined the importance of this interaction in ataxin-3-dependent degeneration in Drosophila melanogaster. Our assays with new isogenic fly lines expressing pathogenic ataxin-3 with an intact or mutated VCP-binding site show that disrupting the ataxin-3-VCP interaction delays the aggregation of the toxic protein in vivo. Importantly, early on flies that express pathogenic ataxin-3 with a mutated VCP-binding site are indistinguishable from flies that do not express any SCA3 protein. Also, reducing levels of VCP through RNA-interference has a similar, protective effect to mutating the VCP-binding site of pathogenic ataxin-3. Based on in vivo pulse-chases, aggregated species of ataxin-3 are highly stable, in a manner independent of VCP-binding. Collectively, our results highlight an important role for the ataxin-3-VCP interaction in SCA3, based on a model that posits a seeding effect from VCP on pathogenic ataxin-3 aggregation and subsequent toxicity. Copyright © 2018 Elsevier Inc. All rights reserved.
MR Imaging in Spinocerebellar Ataxias: A Systematic Review.

PubMed

Klaes, A; Reckziegel, E; Franca, M C; Rezende, T J R; Vedolin, L M; Jardim, L B; Saute, J A

2016-08-01

Polyglutamine expansion spinocerebellar ataxias are autosomal dominant slowly progressive neurodegenerative diseases with no current treatment. MR imaging is the best-studied surrogate biomarker candidate for polyglutamine expansion spinocerebellar ataxias, though with conflicting results. We aimed to review quantitative central nervous system MR imaging technique findings in patients with polyglutamine expansion spinocerebellar ataxias and correlations with well-established clinical and molecular disease markers. We searched MEDLINE, LILACS, and Cochrane databases of clinical trials between January 1995 and January 2016, for quantitative MR imaging volumetric approaches, MR spectroscopy, diffusion tensor imaging, or other quantitative techniques, comparing patients with polyglutamine expansion spinocerebellar ataxias (SCAs) with controls. Pertinent details for each study regarding participants, imaging methods, and results were extracted. After reviewing the 706 results, 18 studies were suitable for inclusion: 2 studies in SCA1, 1 in SCA2, 15 in SCA3, 1 in SCA7, 1 in SCA1 and SCA6 presymptomatic carriers, and none in SCA17 and dentatorubropallidoluysian atrophy. Cerebellar hemispheres and vermis, whole brain stem, midbrain, pons, medulla oblongata, cervical spine, striatum, and thalamus presented significant atrophy in SCA3. The caudate, putamen and whole brain stem presented similar sensitivity to change compared with ataxia scales after 2 years of follow-up in a single prospective study in SCA3. MR spectroscopy and DTI showed abnormalities only in cross-sectional studies in SCA3. Results from single studies in other polyglutamine expansion spinocerebellar ataxias should be replicated in different cohorts. Additional cross-sectional and prospective volumetric analysis, MR spectroscopy, and DTI studies are necessary in polyglutamine expansion spinocerebellar ataxias. The properties of preclinical disease biomarkers (presymptomatic) of MR imaging should be targeted in future studies. © 2016 by American Journal of Neuroradiology.
Molecular dynamics analysis of the aggregation propensity of polyglutamine segments

PubMed Central

Wen, Jingran; Scoles, Daniel R.

2017-01-01

Protein misfolding and aggregation is a pathogenic feature shared among at least ten polyglutamine (polyQ) neurodegenerative diseases. While solvent-solution interaction is a key factor driving protein folding and aggregation, the solvation properties of expanded polyQ tracts are not well understood. By using GPU-enabled all-atom molecular dynamics simulations of polyQ monomers in an explicit solvent environment, this study shows that solvent-polyQ interaction propensity decreases as the length of the polyQ tract increases. This study finds a predominance of long-distance interactions between residues far apart in polyQ sequences with longer polyQ segments, which leads to significant conformational differences. This study also indicates that large loops, comprised of parallel β-structures, appear in long polyQ tracts and present new aggregation building blocks with aggregation driven by long-distance intra-polyQ interactions. Finally, consistent with previous observations using coarse-grain simulations, this study demonstrates that there is a gain in the aggregation propensity with increased polyQ length, and that this gain is correlated with decreasing ability of solvent-polyQ interaction. These results suggest the modulation of solvent-polyQ interactions as a possible therapeutic strategy for treating polyQ diseases. PMID:28542401
Spinocerebellar ataxia type 7.

PubMed

Martin, Jean-Jacques

2012-01-01

Spinocerebellar ataxia type 7 (SCA7) is associated with progressive blindness, dominant transmission, and marked anticipation. SCA7 represents one of the polyglutamine expansion diseases with increase of CAG repeats. The gene maps to chromosome 3p12-p21.1. Normal values of CAG repeats range from 4 to 18. The SCA7 gene encodes a protein of largely unknown function, called ataxin-7. SCA7 is reported in many countries and ethnic groups. Its phenotypic expression depends on the number of expanded repeats. The infantile phenotype is very severe, with more than 100 repeats. The classic type has 50 to 55 repeats and is characterized by a combination of visual and ataxic disturbances lasting for 20-40 years. When the number of CAG repeats is between 36 and 43, the evolution is much slower, with few or no retinal abnormalities. A CAG repeat number from 18 to 35 is asymptomatic but predisposes to the development of the disorder when expanding to the pathological range through transmission. The diagnosis is made by molecular genetics. The neuropathology of the disorder includes atrophy of the spinocerebellar pathways, pyramidal tracts, and motor nuclei in the brainstem and spinal cord, a cone-rod dystrophy of the retina, and ataxin-7 immunoreactive neuronal intranuclear inclusions. The neuropathological features vary as a function of the number of CAG repeats. Present research deals mainly with the study of ataxin-7 in transfected neural cells and transgenic mouse models. © 2012 Elsevier B.V. All rights reserved.
Huntington's disease (HD): the neuropathology of a multisystem neurodegenerative disorder of the human brain.

PubMed

Rüb, U; Seidel, K; Heinsen, H; Vonsattel, J P; den Dunnen, W F; Korf, H W

2016-11-01

Huntington's disease (HD) is an autosomal dominantly inherited, and currently untreatable, neuropsychiatric disorder. This progressive and ultimately fatal disease is named after the American physician George Huntington and according to the underlying molecular biological mechanisms is assigned to the human polyglutamine or CAG-repeat diseases. In the present article we give an overview of the currently known neurodegenerative hallmarks of the brains of HD patients. Subsequent to recent pathoanatomical studies the prevailing reductionistic concept of HD as a human neurodegenerative disease, which is primarily and more or less exclusively confined to the striatum (i.e., caudate nucleus and putamen) has been abandoned. Many recent studies have improved our neuropathological knowledge of HD; many of the early groundbreaking findings of neuropathological HD research have been rediscovered and confirmed. The results of this investigation have led to the stepwise revision of the simplified pathoanatomical and pathophysiological HD concept and culminated in the implementation of the current concept of HD as a multisystem degenerative disease of the human brain. The multisystem character of the neuropathology of HD is emphasized by a brain distribution pattern of neurodegeneration (i) which apart from the striatum includes the cerebral neo- and allocortex, thalamus, pallidum, brainstem and cerebellum, and which (ii) therefore, shares more similarities with polyglutamine spinocerebellar ataxias than previously thought. © 2016 International Society of Neuropathology.
GENETICS AND NEUROPATHOLOGY OF HUNTINGTON’S DISEASE

PubMed Central

Reiner, Anton; Dragatsis, Ioannis; Dietrich, Paula

2015-01-01

Huntington’s disease (HD) is an autosomal dominant progressive neurodegenerative disorder that prominently affects the basal ganglia, leading to affective, cognitive, behavioral and motor decline. The basis of HD is a CAG repeat expansion to >35 CAG in a gene that codes for a ubiquitous protein known as huntingtin, resulting in an expanded N-terminal polyglutamine tract. The size of the expansion is correlated with disease severity, with increasing CAG accelerating the age of onset. A variety of possibilities have been proposed as to the mechanism by which the mutation causes preferential injury to the basal ganglia. The present chapter provides a basic overview of the genetics and pathology of HD. PMID:21907094

Misfolded Polyglutamine, Polyalanine, and Superoxide Dismutase 1 Aggregate via Distinct Pathways in the Cell

PubMed Central

Polling, Saskia; Mok, Yee-Foong; Ramdzan, Yasmin M.; Turner, Bradley J.; Yerbury, Justin J.; Hill, Andrew F.; Hatters, Danny M.

2014-01-01

Protein aggregation into intracellular inclusions is a key feature of many neurodegenerative disorders. A common theme has emerged that inappropriate self-aggregation of misfolded or mutant polypeptide sequences is detrimental to cell health. Yet protein quality control mechanisms may also deliberately cluster them together into distinct inclusion subtypes, including the insoluble protein deposit (IPOD) and the juxtanuclear quality control (JUNQ). Here we investigated how the intrinsic oligomeric state of three model systems of disease-relevant mutant protein and peptide sequences relates to the IPOD and JUNQ patterns of aggregation using sedimentation velocity analysis. Two of the models (polyalanine (37A) and superoxide dismutase 1 (SOD1) mutants A4V and G85R) accumulated into the same JUNQ-like inclusion whereas the other, polyglutamine (72Q), formed spatially distinct IPOD-like inclusions. Using flow cytometry pulse shape analysis (PulSA) to separate cells with inclusions from those without revealed the SOD1 mutants and 37A to have abruptly altered oligomeric states with respect to the nonaggregating forms, regardless of whether cells had inclusions or not, whereas 72Q was almost exclusively monomeric until inclusions formed. We propose that mutations leading to JUNQ inclusions induce a constitutively “misfolded” state exposing hydrophobic side chains that attract and ultimately overextend protein quality capacity, which leads to aggregation into JUNQ inclusions. Poly(Q) is not misfolded in this same sense due to universal polar side chains, but is highly prone to forming amyloid fibrils that we propose invoke a different engagement mechanism with quality control. PMID:24425868
Misfolded polyglutamine, polyalanine, and superoxide dismutase 1 aggregate via distinct pathways in the cell.

PubMed

Polling, Saskia; Mok, Yee-Foong; Ramdzan, Yasmin M; Turner, Bradley J; Yerbury, Justin J; Hill, Andrew F; Hatters, Danny M

2014-03-07

Protein aggregation into intracellular inclusions is a key feature of many neurodegenerative disorders. A common theme has emerged that inappropriate self-aggregation of misfolded or mutant polypeptide sequences is detrimental to cell health. Yet protein quality control mechanisms may also deliberately cluster them together into distinct inclusion subtypes, including the insoluble protein deposit (IPOD) and the juxtanuclear quality control (JUNQ). Here we investigated how the intrinsic oligomeric state of three model systems of disease-relevant mutant protein and peptide sequences relates to the IPOD and JUNQ patterns of aggregation using sedimentation velocity analysis. Two of the models (polyalanine (37A) and superoxide dismutase 1 (SOD1) mutants A4V and G85R) accumulated into the same JUNQ-like inclusion whereas the other, polyglutamine (72Q), formed spatially distinct IPOD-like inclusions. Using flow cytometry pulse shape analysis (PulSA) to separate cells with inclusions from those without revealed the SOD1 mutants and 37A to have abruptly altered oligomeric states with respect to the nonaggregating forms, regardless of whether cells had inclusions or not, whereas 72Q was almost exclusively monomeric until inclusions formed. We propose that mutations leading to JUNQ inclusions induce a constitutively "misfolded" state exposing hydrophobic side chains that attract and ultimately overextend protein quality capacity, which leads to aggregation into JUNQ inclusions. Poly(Q) is not misfolded in this same sense due to universal polar side chains, but is highly prone to forming amyloid fibrils that we propose invoke a different engagement mechanism with quality control.
Drosophila melanogaster As a Model Organism to Study RNA Toxicity of Repeat Expansion-Associated Neurodegenerative and Neuromuscular Diseases

PubMed Central

Koon, Alex C.; Chan, Ho Yin Edwin

2017-01-01

For nearly a century, the fruit fly, Drosophila melanogaster, has proven to be a valuable tool in our understanding of fundamental biological processes, and has empowered our discoveries, particularly in the field of neuroscience. In recent years, Drosophila has emerged as a model organism for human neurodegenerative and neuromuscular disorders. In this review, we highlight a number of recent studies that utilized the Drosophila model to study repeat-expansion associated diseases (READs), such as polyglutamine diseases, fragile X-associated tremor/ataxia syndrome (FXTAS), myotonic dystrophy type 1 (DM1) and type 2 (DM2), and C9ORF72-associated amyotrophic lateral sclerosis/frontotemporal dementia (C9-ALS/FTD). Discoveries regarding the possible mechanisms of RNA toxicity will be focused here. These studies demonstrate Drosophila as an excellent in vivo model system that can reveal novel mechanistic insights into human disorders, providing the foundation for translational research and therapeutic development. PMID:28377694
Spinocerebellar Ataxia Type 6: Molecular Mechanisms and Calcium Channel Genetics.

PubMed

Du, Xiaofei; Gomez, Christopher Manuel

2018-01-01

Spinocerebellar ataxia (SCA) type 6 is an autosomal dominant disease characterized by cerebellar degeneration. Clinically, it presents as pure cerebellar dysfunction with late onset: slowly progressive unsteadiness of gait and stance, slurred speech, and abnormal eye movements. Pathological findings of SCA6 include a diffuse loss of Purkinje cells, predominantly in the cerebellar vermis. Genetically, SCA6 is caused by expansion of a trinucleotide CAG repeat in the last exon of the longest isoform of the CACNA1A gene on chromosome 19p13.1-p13.2. Normal alleles have 4-18 repeats, while disease-causing alleles contain 19-33 repeats. Owing to the presence of a novel internal ribosomal entry site (IRES) within the mRNA, CACNA1A encodes two structurally unrelated proteins with distinct functions within an overlapping open reading frame (ORF) of the same mRNA: (1) the α1A subunit of the P/Q-type voltage-gated calcium channel; (2) α1ACT, a newly recognized transcription factor with a polyglutamine repeat at its C-terminal end. Understanding the function of α1ACT in physiological and pathological conditions may elucidate the pathogenesis of SCA6. More importantly, the IRES, as the translational control element of α1ACT, provides a potential therapeutic target for the treatment of SCA6.
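The allele ranges quoted in the abstract above (4-18 repeats normal, 19-33 repeats disease-causing) can be written as a small lookup. This is a sketch only: the function name `classify_sca6_allele` and the label strings are hypothetical, and counts outside the two quoted ranges are reported as such rather than interpreted.

```python
# Ranges as quoted in the abstract; Python ranges exclude the upper bound.
SCA6_NORMAL = range(4, 19)    # 4..18 repeats
SCA6_DISEASE = range(19, 34)  # 19..33 repeats


def classify_sca6_allele(cag_repeats: int) -> str:
    """Label a CACNA1A CAG repeat count using the abstract's two ranges."""
    if cag_repeats in SCA6_NORMAL:
        return "normal"
    if cag_repeats in SCA6_DISEASE:
        return "disease-causing"
    return "outside quoted ranges"


for count in (11, 18, 19, 22, 40):
    print(count, classify_sca6_allele(count))
```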
Large-scale assessment of polyglutamine repeat expansions in Parkinson disease

PubMed Central

Wang, Lisa; Aasly, Jan O.; Annesi, Grazia; Bardien, Soraya; Bozi, Maria; Brice, Alexis; Carr, Jonathan; Chung, Sun J.; Clarke, Carl; Crosiers, David; Deutschländer, Angela; Eckstein, Gertrud; Farrer, Matthew J.; Goldwurm, Stefano; Garraux, Gaetan; Hadjigeorgiou, Georgios M.; Hicks, Andrew A.; Hattori, Nobutaka; Klein, Christine; Jeon, Beom; Kim, Yun J.; Lesage, Suzanne; Lin, Juei-Jueng; Lynch, Timothy; Lichtner, Peter; Lang, Anthony E.; Mok, Vincent; Jasinska-Myga, Barbara; Mellick, George D.; Morrison, Karen E.; Opala, Grzegorz; Pihlstrøm, Lasse; Pramstaller, Peter P.; Park, Sung S.; Quattrone, Aldo; Rogaeva, Ekaterina; Ross, Owen A.; Stefanis, Leonidas; Stockton, Joanne D.; Silburn, Peter A.; Theuns, Jessie; Tan, Eng K.; Tomiyama, Hiroyuki; Toft, Mathias; Van Broeckhoven, Christine; Uitti, Ryan J.; Wirdefeldt, Karin; Wszolek, Zbigniew; Xiromerisiou, Georgia; Yueh, Kuo-Chu; Zhao, Yi; Gasser, Thomas; Maraganore, Demetrius M.; Krüger, Rejko

2015-01-01

Objectives: We aim to clarify the pathogenic role of intermediate size repeat expansions of SCA2, SCA3, SCA6, and SCA17 as risk factors for idiopathic Parkinson disease (PD). Methods: We invited researchers from the Genetic Epidemiology of Parkinson's Disease Consortium to participate in the study. There were 12,346 cases and 8,164 controls genotyped, for a total of 4 repeats within the SCA2, SCA3, SCA6, and SCA17 genes. Fixed- and random-effects models were used to estimate the summary risk estimates for the genes. We investigated between-study heterogeneity and heterogeneity between different ethnic populations. Results: We did not observe any definite pathogenic repeat expansions for SCA2, SCA3, SCA6, and SCA17 genes in patients with idiopathic PD from Caucasian and Asian populations. Furthermore, overall analysis did not reveal any significant association between intermediate repeats and PD. The effect estimates (odds ratio) ranged from 0.93 to 1.01 in the overall cohort for the SCA2, SCA3, SCA6, and SCA17 loci. Conclusions: Our study did not support a major role for definite pathogenic repeat expansions in SCA2, SCA3, SCA6, and SCA17 genes for idiopathic PD. Thus, results of this large study do not support diagnostic screening of SCA2, SCA3, SCA6, and SCA17 gene repeats in the common idiopathic form of PD. Likewise, this largest multicentered study performed to date excludes the role of intermediate repeats of these genes as a risk factor for PD. PMID:26354989
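The abstract above mentions fixed- and random-effects models for summary risk estimates. As a rough illustration of the fixed-effect case only, the sketch below pools odds ratios by inverse-variance weighting on the log scale. The function name `pool_fixed_effect` and the per-study numbers are hypothetical, not values from the study, and the abstract does not specify the authors' exact implementation.

```python
import math


def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooling of odds ratios.

    `studies` is a list of (odds_ratio, ci_lower, ci_upper) tuples with
    95% confidence intervals. Returns (pooled_or, ci_lower, ci_upper).
    """
    z = 1.96  # normal quantile for a 95% interval
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in studies:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Hypothetical per-site estimates, for illustration only.
example = [(0.95, 0.70, 1.29), (1.02, 0.80, 1.30), (0.90, 0.55, 1.47)]
print(pool_fixed_effect(example))
```

A random-effects model would add a between-study variance term to each weight; that step is omitted here.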
Fibrillar Structure and Charge Determine the Interaction of Polyglutamine Protein Aggregates with the Cell Surface

PubMed Central

Trevino, R. Sean; Lauckner, Jane E.; Sourigues, Yannick; Pearce, Margaret M.; Bousset, Luc; Melki, Ronald; Kopito, Ron R.

2012-01-01

The pathogenesis of most neurodegenerative diseases, including transmissible diseases like prion encephalopathy, inherited disorders like Huntington disease, and sporadic diseases like Alzheimer and Parkinson diseases, is intimately linked to the formation of fibrillar protein aggregates. It is becoming increasingly appreciated that prion-like intercellular transmission of protein aggregates can contribute to the stereotypical spread of disease pathology within the brain, but the mechanisms underlying the binding and uptake of protein aggregates by mammalian cells are largely uninvestigated. We have investigated the properties of polyglutamine (polyQ) aggregates that endow them with the ability to bind to mammalian cells in culture and the properties of the cell surface that facilitate such uptake. Binding and internalization of polyQ aggregates are common features of mammalian cells and depend upon both trypsin-sensitive and trypsin-resistant saturable sites on the cell surface, suggesting the involvement of cell surface proteins in this process. polyQ aggregate binding depends upon the presence of a fibrillar amyloid-like structure and does not depend upon electrostatic interaction of fibrils with the cell surface. Sequences in the huntingtin protein that flank the amyloid-forming polyQ tract also influence the extent to which aggregates are able to bind to cell surfaces. PMID:22753412
Ubiquilin overexpression reduces GFP-polyalanine-induced protein aggregates and toxicity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Hongmin; Monteiro, Mervyn J.

2007-08-01

Several human disorders are associated with an increase in a continuous stretch of alanine amino acids in proteins. These so-called polyalanine expansion diseases share many similarities with polyglutamine-related disorders, including a length-dependent reiteration of amino acid induction of protein aggregation and cytotoxicity. We previously reported that overexpression of ubiquilin reduces protein aggregates and toxicity of expanded polyglutamine proteins. Here, we demonstrate a similar role for ubiquilin toward expanded polyalanine proteins. Overexpression of ubiquilin-1 in HeLa cells reduced protein aggregates and the cytotoxicity associated with expression of a transfected nuclear-targeted GFP-fusion protein containing 37-alanine repeats (GFP-A37), in a dose-dependent manner. Ubiquilin coimmunoprecipitated more with GFP proteins containing a 37-polyalanine tract compared to either 7 (GFP-A7), or no alanine tract (GFP). Moreover, overexpression of ubiquilin suppressed the increased vulnerability of HeLa cell lines stably expressing the GFP-A37 fusion protein to oxidative stress-induced cell death compared to cell lines expressing GFP or GFP-A7 proteins. By contrast, siRNA knockdown of ubiquilin expression in the GFP-A37 cell line was associated with decreased cellular proliferation, and increases in GFP protein aggregates, nuclear fragmentation, and cell death. Our results suggest that boosting ubiquilin levels in cells might provide a universal and attractive strategy to prevent toxicity of proteins containing reiterative expansions of amino acids involved in many human diseases.
Suppressing aberrant GluN3A expression rescues NMDA receptor dysfunction, synapse loss and motor and cognitive decline in Huntington's disease models

PubMed Central

Marco, Sonia; Giralt, Albert; Petrovic, Milos M.; Pouladi, Mahmoud A.; Martínez-Turrillas, Rebeca; Martínez-Hernández, José; Kaltenbach, Linda S.; Torres-Peraza, Jesús; Graham, Rona K.; Watanabe, Masahiko; Luján, Rafael; Nakanishi, Nobuki; Lipton, Stuart A.; Lo, Donald C.; Hayden, Michael R.; Alberch, Jordi; Wesseling, John F.

2013-01-01

Huntington's disease is caused by an expanded polyglutamine repeat in huntingtin (Htt), but the pathophysiological sequence of events that trigger synaptic failure and neuronal loss are not fully understood. Alterations in NMDA-type glutamate receptors (NMDARs) have been implicated, yet it remains unclear how the Htt mutation impacts NMDAR function and direct evidence for a causative role is missing. Here we show that mutant Htt re-directs an intracellular store of juvenile NMDARs to the surface of striatal neurons by sequestering and disrupting the subcellular localization of the GluN3A subunit-specific endocytic adaptor PACSIN1. Overexpressing GluN3A in wild-type striatum mimicked the synapse loss observed in Huntington's disease mouse models, whereas genetic deletion of GluN3A prevented synapse degeneration, ameliorated motor and cognitive decline, and reduced striatal atrophy and neuronal loss in the YAC128 model. Furthermore, GluN3A deletion corrected the abnormally enhanced NMDAR currents, which have been linked to cell death in Huntington's disease and other neurodegenerative conditions. Our findings reveal an early pathogenic role of GluN3A dysregulation in Huntington's disease, and suggest that therapies targeting GluN3A or pathogenic Htt-PACSIN1 interactions might prevent or delay disease progression. PMID:23852340
Are Long-Range Structural Correlations Behind the Aggregration Phenomena of Polyglutamine Diseases?

PubMed Central

Moradi, Mahmoud; Babin, Volodymyr; Roland, Christopher; Sagui, Celeste

2012-01-01

We have characterized the conformational ensembles of polyglutamine peptides of various lengths (ranging from to ), both with and without the presence of a C-terminal polyproline hexapeptide. For this, we used state-of-the-art molecular dynamics simulations combined with a novel statistical analysis to characterize the various properties of the backbone dihedral angles and secondary structural motifs of the glutamine residues. For (i.e., just above the pathological length for Huntington's disease), the equilibrium conformations of the monomer consist primarily of disordered, compact structures with non-negligible -helical and turn content. We also observed a relatively small population of extended structures suitable for forming aggregates including - and -strands, and - and -hairpins. Most importantly, for we find that there exists a long-range correlation (ranging for at least residues) among the backbone dihedral angles of the Q residues. For polyglutamine peptides below the pathological length, the population of the extended strands and hairpins is considerably smaller, and the correlations are short-range (at most residues apart). Adding a C-terminal hexaproline to suppresses both the population of these rare motifs and the long-range correlation of the dihedral angles. We argue that the long-range correlation of the polyglutamine homopeptide, along with the presence of these rare motifs, could be responsible for its aggregation phenomena. PMID:22577357
Myricetin Reduces Toxic Level of CAG Repeats RNA in Huntington's Disease (HD) and Spino Cerebellar Ataxia (SCAs).

PubMed

Khan, Eshan; Tawani, Arpita; Mishra, Subodh Kumar; Verma, Arun Kumar; Upadhyay, Arun; Kumar, Mohit; Sandhir, Rajat; Mishra, Amit; Kumar, Amit

2018-01-19

Huntington's disease (HD) is a neurodegenerative disorder that is caused by abnormal expansion of CAG repeats in the HTT gene. The transcribed mutant RNA contains expanded CAG repeats that translate into a mutant huntingtin protein. This expanded CAG repeat also causes mis-splicing of pre-mRNA due to sequestration of muscle blind like-1 splicing factor (MBNL1), and thus both of these elicit the pathogenesis of HD. Targeting the onset as well as progression of HD by small molecules could be a potent therapeutic approach. We have screened a set of small molecules to target this transcript and found Myricetin, a flavonoid, as a lead molecule that interacts with the CAG motif and thus prevents the translation of mutant huntingtin protein as well as sequestration of MBNL1. Here, we report the first solution structure of the complex formed between Myricetin and RNA containing the 5'CAG/3'GAC motif. Myricetin interacts with this RNA via base stacking at the AA mismatch. Moreover, Myricetin was also found reducing the proteo-toxicity generated due to the aggregation of polyglutamine, and further, its supplementation also improves neurobehavioral deficits in the HD mouse model. Our study provides the structural and mechanistic basis of Myricetin as an effective therapeutic candidate for HD and other polyQ related disorders.
Similar Progression of Morphological and Metabolic Phenotype in R6/2 Mice with Different CAG Repeats Revealed by In Vivo Magnetic Resonance Imaging and Spectroscopy.

PubMed

Sawiak, Stephen J; Wood, Nigel I; Morton, A Jennifer

2016-10-01

Huntington's disease (HD) is caused by an unstable polyglutamine (CAG) repeat in the HD gene, whereby a CAG repeat length greater than ∼36 leads to the disease. In HD patients, longer repeats correlate with more severe disease and earlier death. This is also seen in R6/2 mice carrying repeat lengths up to ∼200. Paradoxically, R6/2 mice with repeat lengths >300 have a less aggressive phenotype and longer lifespan than those with shorter repeats. The mechanism underlying this phenomenon is unknown. We investigated the consequences of longer repeat lengths on structural changes in the brains of R6/2 mice, especially with regard to progressive atrophy. We used longitudinal in vivo magnetic resonance imaging (MRI) and spectroscopy (MRS) to compare pathological changes in two strains of R6/2 mice, one with a rapidly progressing disease (250 CAG repeats), and the other with a less aggressive phenotype (350 CAG repeats). We found significant progressive brain atrophy in both 250 and 350 CAG repeat mice, as well as changes in metabolites (glutamine/glutamate, choline and aspartate). Although similar in magnitude, atrophy in the brains of 350 CAG R6/2 mice progressed more slowly than that seen in 250 CAG mice, in line with the milder phenotype and longer lifespan. Interestingly, significant atrophy was detectable in 350 CAG mice as early as 8-12 weeks of age, although behavioural abnormalities in these mice are not apparent before 25-30 weeks. This finding fits well with human data from the PREDICT-HD and TRACK-HD projects, where reductions in brain volume were found 10 years in advance of the onset of symptoms. The similar brain atrophy, together with the mismatch between onset of brain atrophy and behavioural phenotype, in HD mice with 350 repeats will make this mouse particularly useful for modelling early stages of HD pathology.
Generation of induced pluripotent stem cells from a patient with spinocerebellar ataxia type 3.

PubMed

Soong, Bing-Wen; Syu, Shih-Han; Wen, Cheng-Hao; Ko, Hui-Wen; Wu, Mei-Ling; Hsieh, Patrick C H; Hwang, Shiaw-Min; Lu, Huai-En

2017-01-01

Spinocerebellar ataxia type 3 (SCA3) is a dominantly inherited neurodegenerative disease caused by a trinucleotide repeat (CAG) expansion in the coding region of ATXN3 gene resulting in production of ataxin-3 with an elongated polyglutamine tract. Here, we generated induced pluripotent stem cells (iPSCs) from the peripheral blood mononuclear cells of a male patient with SCA3 by using the Sendai-virus delivery system. The resulting iPSCs had a normal karyotype, retained the disease-causing ATXN3 mutation, expressed pluripotent markers and could differentiate into the three germ layers. Potentially, the iPSCs could be a useful tool for the investigation of disease mechanisms of SCA3. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Exercise and Genetic Rescue of SCA1 via the Transcriptional Repressor Capicua

PubMed Central

Fryer, John D.; Yu, Peng; Kang, Hyojin; Mandel-Brehm, Caleigh; Carter, Angela N.; Crespo-Barreto, Juan; Gao, Yan; Flora, Adriano; Shaw, Chad; Orr, Harry T.; Zoghbi, Huda Y.

2011-01-01

Spinocerebellar ataxia type 1 (SCA1) is a fatal neurodegenerative disease caused by expansion of a translated CAG repeat in Ataxin-1 (ATXN1). To determine the long-term effects of exercise, we implemented a mild exercise regimen in a mouse model of SCA1 and found a considerable improvement in survival accompanied by upregulation of epidermal growth factor and consequential downregulation of Capicua, an ATXN1 interactor. Offspring of Capicua mutant mice bred to SCA1 mice showed significant improvement of all disease phenotypes. Although polyglutamine-expanded Atxn1 caused some loss of Capicua function, further reducing Capicua levels, either genetically or by exercise, mitigated the disease phenotypes. Thus, exercise might have long-term beneficial effects in other ataxias and neurodegenerative diseases. PMID:22053053
[Personal genome research and neurological diseases: overview].

PubMed

Toda, Tatsushi

2013-03-01

Neurological diseases include those caused by a single defective gene, e.g., Huntington's disease, other polyglutamine diseases, and muscular dystrophies, and those that are mostly sporadic but rarely show Mendelian inheritance in some families, e.g., Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and epilepsy. The latter diseases are considered polygenic disorders. Both sporadic and Mendelian cases of these diseases are believed to share some common pathological mechanisms. Since the detection of causal genes for the Mendelian cases, studies have been initiated on disease pathology. SNPs and rare gene variants play important roles in common neurological diseases. From a technological perspective, next-generation sequencers have become widely available and have contributed to the advancement of research based on individual genome sequences (personal genome). This paper presents an overview, as well as a historical context, of the contribution of personal genome research to neurological disease studies.
Expression levels of DNA replication and repair genes predict regional somatic repeat instability in the brain but are not altered by polyglutamine disease protein expression or age.

PubMed

Mason, Amanda G; Tomé, Stephanie; Simard, Jodie P; Libby, Randell T; Bammler, Theodor K; Beyer, Richard P; Morton, A Jennifer; Pearson, Christopher E; La Spada, Albert R

2014-03-15

Expansion of CAG/CTG trinucleotide repeats causes numerous inherited neurological disorders, including Huntington's disease (HD), several spinocerebellar ataxias and myotonic dystrophy type 1. Expanded repeats are genetically unstable with a propensity to further expand when transmitted from parents to offspring. For many alleles with expanded repeats, extensive somatic mosaicism has been documented. For CAG repeat diseases, dramatic instability has been documented in the striatum, with larger expansions noted with advancing age. In contrast, only modest instability occurs in the cerebellum. Using microarray expression analysis, we sought to identify the genetic basis of these regional instability differences by comparing gene expression in the striatum and cerebellum of aged wild-type C57BL/6J mice. We identified eight candidate genes enriched in cerebellum, and validated four--Pcna, Rpa1, Msh6 and Fen1--along with a highly associated interactor, Lig1. We also explored whether expression levels of mismatch repair (MMR) proteins are altered in a line of HD transgenic mice, R6/2, that is known to show pronounced regional repeat instability. Compared with wild-type littermates, MMR expression levels were not significantly altered in R6/2 mice regardless of age. Interestingly, expression levels of these candidates were significantly increased in the cerebellum of control and HD human samples in comparison to striatum. Together, our data suggest that elevated expression levels of DNA replication and repair proteins in cerebellum may act as a safeguard against repeat instability, and may account for the dramatically reduced somatic instability present in this brain region, compared with the marked instability observed in the striatum.
Contribution of ATXN2 intermediary polyQ expansions in a spectrum of neurodegenerative disorders.

PubMed

Lattante, Serena; Millecamps, Stéphanie; Stevanin, Giovanni; Rivaud-Péchoux, Sophie; Moigneu, Carine; Camuzat, Agnès; Da Barroca, Sandra; Mundwiller, Emeline; Couarch, Philippe; Salachas, François; Hannequin, Didier; Meininger, Vincent; Pasquier, Florence; Seilhean, Danielle; Couratier, Philippe; Danel-Brunaud, Véronique; Bonnet, Anne-Marie; Tranchant, Christine; LeGuern, Eric; Brice, Alexis; Le Ber, Isabelle; Kabashi, Edor

2014-09-09

The aim of this study was to establish the frequency of ATXN2 polyglutamine (polyQ) expansion in large cohorts of patients with amyotrophic lateral sclerosis (ALS), frontotemporal dementia (FTD), and progressive supranuclear palsy (PSP), and to evaluate whether ATXN2 could act as a modifier gene in patients carrying the C9orf72 expansion. We screened a large cohort of French patients (1,144 ALS, 203 FTD, 168 FTD-ALS, and 109 PSP) for ATXN2 CAG repeat length. We included in our cohort 322 carriers of the C9orf72 expansion (202 ALS, 63 FTD, and 57 FTD-ALS). We found a significant association with intermediate repeat size (≥29 CAG) in patients with ALS (both familial and sporadic) and, for the first time, in patients with familial FTD-ALS. Of interest, we found the co-occurrence of pathogenic C9orf72 expansion in 23.2% of ATXN2 intermediate-repeat carriers, all in the FTD-ALS and familial ALS subgroups. In the cohort of C9orf72 carriers, 3.1% of patients also carried an intermediate ATXN2 repeat length. ATXN2 repeat lengths in patients with PSP and FTD were found to be similar to the controls. ATXN2 intermediary repeat length is a strong risk factor for ALS and FTD-ALS. Furthermore, we propose that ATXN2 polyQ expansions could act as a strong modifier of the FTD phenotype in the presence of a C9orf72 repeat expansion, leading to the development of clinical signs featuring both FTD and ALS. © 2014 American Academy of Neurology.
Contribution of ATXN2 intermediary polyQ expansions in a spectrum of neurodegenerative disorders

PubMed Central

Lattante, Serena; Millecamps, Stéphanie; Stevanin, Giovanni; Rivaud-Péchoux, Sophie; Moigneu, Carine; Camuzat, Agnès; Da Barroca, Sandra; Mundwiller, Emeline; Couarch, Philippe; Salachas, François; Hannequin, Didier; Meininger, Vincent; Pasquier, Florence; Seilhean, Danielle; Couratier, Philippe; Danel-Brunaud, Véronique; Bonnet, Anne-Marie; Tranchant, Christine; LeGuern, Eric; Brice, Alexis; Le Ber, Isabelle

2014-01-01

Objective: The aim of this study was to establish the frequency of ATXN2 polyglutamine (polyQ) expansion in large cohorts of patients with amyotrophic lateral sclerosis (ALS), frontotemporal dementia (FTD), and progressive supranuclear palsy (PSP), and to evaluate whether ATXN2 could act as a modifier gene in patients carrying the C9orf72 expansion. Methods: We screened a large cohort of French patients (1,144 ALS, 203 FTD, 168 FTD-ALS, and 109 PSP) for ATXN2 CAG repeat length. We included in our cohort 322 carriers of the C9orf72 expansion (202 ALS, 63 FTD, and 57 FTD-ALS). Results: We found a significant association with intermediate repeat size (≥29 CAG) in patients with ALS (both familial and sporadic) and, for the first time, in patients with familial FTD-ALS. Of interest, we found the co-occurrence of pathogenic C9orf72 expansion in 23.2% of ATXN2 intermediate-repeat carriers, all in the FTD-ALS and familial ALS subgroups. In the cohort of C9orf72 carriers, 3.1% of patients also carried an intermediate ATXN2 repeat length. ATXN2 repeat lengths in patients with PSP and FTD were found to be similar to the controls. Conclusions: ATXN2 intermediary repeat length is a strong risk factor for ALS and FTD-ALS. Furthermore, we propose that ATXN2 polyQ expansions could act as a strong modifier of the FTD phenotype in the presence of a C9orf72 repeat expansion, leading to the development of clinical signs featuring both FTD and ALS. PMID:25098532
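The cohort figures in the abstract above can be cross-checked with a few lines of arithmetic. The counts are copied from the abstract; the variable names are hypothetical, and the head count derived from the 3.1% figure is an inference of this sketch, not a number the abstract reports.

```python
# Counts quoted in the abstract.
cohort = {"ALS": 1144, "FTD": 203, "FTD-ALS": 168, "PSP": 109}
c9_carriers = {"ALS": 202, "FTD": 63, "FTD-ALS": 57}

total_patients = sum(cohort.values())
total_carriers = sum(c9_carriers.values())  # the abstract states 322

# Derived, not reported: carriers implied by "3.1%" of the C9orf72 cohort.
implied_double_carriers = round(0.031 * total_carriers)

print(total_patients, total_carriers, implied_double_carriers)
```

The subgroup counts of C9orf72 carriers sum to the 322 stated in the abstract, and 3.1% of 322 corresponds to about 10 patients.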
Transgenic Monkey Model of the Polyglutamine Diseases Recapitulating Progressive Neurological Symptoms

PubMed Central

Ishibashi, Hidetoshi; Minakawa, Eiko N.; Motohashi, Hideyuki H.; Takayama, Osamu; Popiel, H. Akiko; Puentes, Sandra; Owari, Kensuke; Nakatani, Terumi; Nogami, Naotake; Yamamoto, Kazuhiro; Yonekawa, Takahiro; Tanaka, Yoko; Fujita, Naoko; Suzuki, Hikaru; Aizawa, Shu; Nagano, Seiichi; Yamada, Daisuke; Wada, Keiji; Kohsaka, Shinichi

2017-01-01

Age-associated neurodegenerative diseases, such as Alzheimer’s disease, Parkinson’s disease, and the polyglutamine (polyQ) diseases, are becoming prevalent as a consequence of elongation of the human lifespan. Although various rodent models have been developed to study and overcome these diseases, they have limitations in their translational research utility owing to differences from humans in brain structure and function and in drug metabolism. Here, we generated a transgenic marmoset model of the polyQ diseases, showing progressive neurological symptoms including motor impairment. Seven transgenic marmosets were produced by lentiviral introduction of the human ataxin 3 gene with 120 CAG repeats encoding an expanded polyQ stretch. Although all offspring showed no neurological symptoms at birth, three marmosets with higher transgene expression developed neurological symptoms of varying degrees at 3–4 months after birth, followed by gradual decreases in body weight gain, spontaneous activity, and grip strength, indicating time-dependent disease progression. Pathological examinations revealed neurodegeneration and intranuclear polyQ protein inclusions accompanied by gliosis, which recapitulate the neuropathological features of polyQ disease patients. Consistent with neuronal loss in the cerebellum, brain MRI analyses in one living symptomatic marmoset detected enlargement of the fourth ventricle, which suggests cerebellar atrophy. Notably, successful germline transgene transmission was confirmed in the second-generation offspring derived from the symptomatic transgenic marmoset gamete. Because the accumulation of abnormal proteins is a shared pathomechanism among various neurodegenerative diseases, we suggest that this new marmoset model will contribute toward elucidating the pathomechanisms of and developing clinically applicable therapies for neurodegenerative diseases. PMID:28374014
Study of the aggregation mechanism of polyglutamine peptides using replica exchange molecular dynamics simulations.

PubMed

Nakano, Miki; Ebina, Kuniyoshi; Tanaka, Shigenori

2013-04-01

Polyglutamine (polyQ, a peptide) with an abnormal repeat length is the causative agent of polyQ diseases, such as Huntington's disease. Although glutamine is a polar residue, polyQ peptides form insoluble aggregates in water, and the mechanism for this aggregation is still unclear. To elucidate the detailed mechanism for the nucleation and aggregation of polyQ peptides, replica exchange molecular dynamics simulations were performed for monomers and dimers of polyQ peptides with several chain lengths. Furthermore, to determine how the aggregation mechanism of polyQ differs from those of other peptides, we compared the results for polyQ with those of polyasparagine and polyleucine. The energy barrier between the monomeric and dimeric states of polyQ was found to be relatively low, and it was observed that polyQ dimers strongly favor the formation of antiparallel β-sheet structures. We also found a characteristic behavior of the monomeric polyQ peptide: a turn at the eighth residue is always present, even when the chain length is varied. We previously showed that a structure including more than two sets of β-turns is stable, so a long monomeric polyQ chain can act as an aggregation nucleus by forming several pairs of antiparallel β-sheet structures within a single chain. Since the aggregation of polyQ peptides has some features in common with an amyloid fibril, our results shed light on the mechanism for the aggregation of polyQ peptides as well as the mechanism for the formation of general amyloid fibrils, which cause the onset of amyloid diseases.
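For readers unfamiliar with replica exchange molecular dynamics, the sketch below shows the standard Metropolis criterion for swapping configurations between two temperature replicas, with acceptance probability min(1, exp[(β_i − β_j)(E_i − E_j)]). It assumes the textbook temperature-exchange scheme; the abstract does not say which variant or parameters the authors used, and the function name, energies, and temperatures here are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping two replicas.

    e_i, e_j: potential energies (kcal/mol) of the configurations currently
    held at temperatures t_i and t_j (kelvin).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    if delta >= 0.0:
        return 1.0  # the swap is always accepted; also avoids exp overflow
    return math.exp(delta)


# Hypothetical energies: the colder replica (300 K) holds the higher energy,
# so the swap is always accepted; the reverse case is accepted only sometimes.
print(exchange_probability(-90.0, -100.0, 300.0, 350.0))
print(exchange_probability(-100.0, -90.0, 300.0, 350.0))
```

In a full simulation the returned probability would be compared with a uniform random number at each exchange attempt.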
Caffeine alleviates progressive motor deficits in a transgenic mouse model of spinocerebellar ataxia.

PubMed

Gonçalves, Nélio; Simões, Ana T; Prediger, Rui D; Hirai, Hirokazu; Cunha, Rodrigo A; Pereira de Almeida, Luís

2017-03-01

Machado-Joseph disease (MJD) is a neurodegenerative spinocerebellar ataxia (SCA) associated with an expanded polyglutamine tract within ataxin-3 for which there is currently no available therapy. We previously showed that caffeine, a nonselective adenosine receptor antagonist, delays the appearance of striatal damage resulting from expression of full-length mutant ataxin-3. Here we investigated the ability of caffeine to alleviate behavioral deficits and cerebellar neuropathology in transgenic mice with a severe ataxia resulting from expression of a truncated fragment of polyglutamine-expanded ataxin-3 in Purkinje cells. Control and transgenic c57Bl6 mice expressing in the mouse cerebella a truncated form of human ataxin-3 with 69 glutamine repeats were allowed to freely drink water or caffeinated water (1g/L). Treatments began at 7 weeks of age, when motor and ataxic phenotype emerges in MJD mice, and lasted up to 20 weeks. Mice were tested in a panel of locomotor behavioral paradigms, namely rotarod, beam balance and walking, pole, and water maze cued-platform version tests, and then sacrificed for cerebellar histology. Caffeine consumption attenuated the progressive loss of general and fine-tuned motor function, balance, and grip strength, in parallel with preservation of cerebellar morphology through decreasing the loss of Purkinje neurons and the thinning of the molecular layer in different folia. Caffeine also rescued the putative striatal-dependent executive and cognitive deficiencies in MJD mice. Our findings provide the first in vivo demonstration that caffeine intake alleviates behavioral disabilities in a severely impaired animal model of SCA. Ann Neurol 2017;81:407-418. © 2016 American Neurological Association.

Cell-to-cell Transmission of Polyglutamine Aggregates in C. elegans

PubMed Central

Kim, Dong-Kyu; Cho, Kyu-Won; Ahn, Woo Jung; Perez-Acuña, Dayana; Jeong, Hyunsu; Lee, He-Jin

2017-01-01

Huntington disease (HD) is an inherited neurodegenerative disorder characterized by motor and cognitive dysfunction caused by expansion of the polyglutamine (polyQ) repeat in exon 1 of huntingtin (HTT). In patients, the number of glutamine residues in the polyQ tract is over 35, and it is correlated with age of onset, severity, and disease progression. Expansion of polyQ increases the propensity for HTT protein aggregation, a process known to be implicated in neurodegeneration. These pathological aggregates can be transmitted from one neuron to another, and this process may explain the pathological spreading of polyQ aggregates. Here, we developed an in vivo model for studying transmission of polyQ aggregates in a highly quantitative manner in real time. HTT exon 1 with expanded polyQ was fused with either N-terminal or C-terminal fragments of Venus fluorescent protein and expressed in pharyngeal muscles and associated neurons, respectively, of C. elegans. Transmission of polyQ proteins was detected using bimolecular fluorescence complementation (BiFC). Mutant polyQ (Q97) was transmitted much more efficiently than wild-type polyQ (Q25) and also formed numerous inclusion bodies. The transmission of Q97 gradually increased as the animals aged. The animals with polyQ transmission exhibited degenerative phenotypes, such as nerve degeneration, impaired pharyngeal pumping behavior, and reduced life span. The C. elegans model presented here would be a useful in vivo model system for the study of polyQ aggregate propagation and might be applied to the screening of genetic and chemical modifiers of the propagation. PMID:29302199
The insulin-like growth factor pathway is altered in Spinocerebellar ataxia type 1 and type 7

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gatchel, Jennifer R.; Watase, Kei; Thaller, Christina

2008-01-29

Polyglutamine diseases are inherited neurodegenerative disorders caused by expansion of CAG trinucleotide repeats encoding a polyglutamine tract in the disease-causing proteins. There are nine of these disorders, each having distinct features but also clinical and pathological similarities. In particular, spinocerebellar ataxia type 1 and 7 (SCA1 and SCA7) patients manifest cerebellar ataxia with corresponding degeneration of Purkinje cells. Given this common phenotype, we asked whether the two disorders share common molecular pathogenic events. To address this question we studied two genetically accurate mouse models of SCA1 and SCA7, Sca1^154Q/2Q and Sca7^266Q/5Q knock-in mice, that express the glutamine-expanded proteins from the respective endogenous loci. We found common transcriptional changes in early symptomatic mice, with downregulation of Insulin-like growth factor binding protein 5 (Igfbp5) representing one of the most robust transcriptional changes that closely correlates with disease state. Interestingly, down-regulation of Igfbp5 occurred in granule neurons through a non-cell autonomous mechanism and was concomitant with activation of the Insulin-like growth factor I (Igf-I) pathway, and, in particular, the Igf-I receptor, expressed in part on Purkinje cells (PC). These data define a possible common pathogenic response in SCA1 and SCA7 and reveal the importance of neuron-neuron interactions in SCA1 and SCA7 pathogenesis. The sensitivity of Igfbp5 levels to disease state could render it and other components of its effector pathway useful as biomarkers in this class of diseases.
Synergistic Toxicity of Polyglutamine-Expanded TATA-Binding Protein in Glia and Neuronal Cells: Therapeutic Implications for Spinocerebellar Ataxia 17

PubMed Central

Yang, Yang; Cui, Yiting; Tang, Beisha

2017-01-01

Spinocerebellar ataxia 17 (SCA17) is caused by polyglutamine (polyQ) repeat expansion in the TATA-binding protein (TBP) and is among a family of neurodegenerative diseases in which polyQ expansion leads to preferential neuronal loss in the brain. Although previous studies have demonstrated that expression of polyQ-expanded proteins in glial cells can cause neuronal injury via noncell-autonomous mechanisms, these studies investigated animal models that overexpress transgenic mutant proteins. Since glial cells are particularly reactive to overexpressed mutant proteins, it is important to investigate the in vivo role of glial dysfunction in neurodegeneration when mutant polyQ proteins are endogenously expressed. In the current study, we generated two conditional TBP-105Q knock-in mouse models that specifically express mutant TBP at the endogenous level in neurons or in astrocytes. We found that mutant TBP expression in neuronal cells or astrocytes alone only caused mild neurodegeneration, whereas severe neuronal toxicity requires the expression of mutant TBP in both neuronal and glial cells. Coculture of neurons and astrocytes further validated that mutant TBP in astrocytes promoted neuronal injury. We identified activated inflammatory signaling pathways in mutant TBP-expressing astrocytes, and blocking nuclear factor κB (NF-κB) signaling in astrocytes ameliorated neurodegeneration. Our results indicate that the synergistic toxicity of mutant TBP in neuronal and glial cells plays a critical role in SCA17 pathogenesis and that targeting glial inflammation could be a potential therapeutic approach for SCA17 treatment. SIGNIFICANCE STATEMENT Mutant TBP with polyglutamine expansion preferentially affects neuronal viability in SCA17 patients. Whether glia, the cells that support and protect neurons, contribute to neurodegeneration in SCA17 remains mostly unexplored. 
In this study, we provide both in vivo and in vitro evidence arguing that endogenous expression of mutant TBP in neurons and glia synergistically impacts neuronal survival. Hyperactivated inflammatory signaling pathways, particularly the NF-κB pathway, underlie glia-mediated neurotoxicity. Moreover, blocking NF-κB activity with small chemical inhibitors alleviated such neurotoxicity. Our study establishes glial dysfunction as an important component of SCA17 pathogenesis and suggests targeting glial inflammation as a potential therapeutic approach for SCA17 treatment. PMID:28821675
ATF3 plays a protective role against toxicity by N-terminal fragment of mutant huntingtin in stable PC12 cell line

PubMed Central

Liang, Yideng; Jiang, Haibing; Ratovitski, Tamara; Jie, Chunfa; Nakamura, Masayuki; Hirschhorn, Ricky R.; Wang, Xiaofang; Smith, Wanli W.; Hai, Tsonwin; Poirier, Michelle A.; Ross, Christopher A.

2009-01-01

Huntington's disease is a progressive neurodegenerative disorder caused by a polyglutamine expansion near the N-terminus of huntingtin. The mechanisms of polyglutamine neurotoxicity, and cellular responses are not fully understood. We have studied gene expression profiles by cDNA array using an inducible PC12 cell model expressing an N-terminal huntingtin fragment with expanded polyglutamine (Htt-N63-148Q). Mutant huntingtin Htt-N63 induced cell death and increased the mRNA and protein levels of activating transcription factor 3 (ATF3). Mutant Htt-N63 also significantly enhanced ATF3 transcriptional activity by a promoter-based reporter assay. Overexpression of ATF3 protects against mutant Htt-N63 toxicity and knocking down ATF3 expression reduced Htt-N63 toxicity in a stable PC12 cell line. These results indicated that ATF3 plays a critical role in toxicity induced by mutant Htt-N63 and may lead to a useful therapeutic target. PMID:19559011
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Transcriptional control of amino acid homeostasis is disrupted in Huntington’s disease

PubMed Central

Sbodio, Juan I.; Snyder, Solomon H.; Paul, Bindu D.

2016-01-01

Disturbances in amino acid metabolism, which have been observed in Huntington’s disease (HD), may account for the profound inanition of HD patients. HD is triggered by an expansion of polyglutamine repeats in the protein huntingtin (Htt), impacting diverse cellular processes, ranging from transcriptional regulation to cognitive and motor functions. We show here that the master regulator of amino acid homeostasis, activating transcription factor 4 (ATF4), is dysfunctional in HD because of oxidative stress contributed by aberrant cysteine biosynthesis and transport. Consistent with these observations, antioxidant supplementation reverses the disordered ATF4 response to nutrient stress. Our findings establish a molecular link between amino acid disposition and oxidative stress leading to cytotoxicity. This signaling cascade may be relevant to other diseases involving redox imbalance and deficits in amino acid metabolism. PMID:27436896
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen Min; Mikecz, Anna von

Despite their exponentially growing use, little is known about the cell biological effects of nanoparticles. Here, we report uptake of silica (SiO2) nanoparticles into the cell nucleus, where they induce aberrant clusters of topoisomerase I (topo I) in the nucleoplasm that additionally contain signature proteins of nuclear domains and of protein aggregation, such as ubiquitin, proteasomes, cellular glutamine repeat (polyQ) proteins, and huntingtin. Formation of intranuclear protein aggregates (1) inhibits replication, transcription, and cell proliferation; (2) does not significantly alter proteasomal activity or cell viability; and (3) is reversible by Congo red and trehalose. Since SiO2 nanoparticles trigger a subnuclear pathology resembling the one occurring in expanded polyglutamine neurodegenerative disorders, we suggest that integrity of the functional architecture of the cell nucleus should be used as a read out for cytotoxicity and considered in the development of safe nanotechnology.
Polyglutamine Disease Modeling: Epitope Based Screen for Homologous Recombination using CRISPR/Cas9 System.

PubMed

An, Mahru C; O'Brien, Robert N; Zhang, Ningzhe; Patra, Biranchi N; De La Cruz, Michael; Ray, Animesh; Ellerby, Lisa M

2014-04-15

We have previously reported the genetic correction of Huntington's disease (HD) patient-derived induced pluripotent stem cells using traditional homologous recombination (HR) approaches. To extend this work, we have adopted a CRISPR-based genome editing approach to improve the efficiency of recombination in order to generate allelic isogenic HD models in human cells. Incorporation of a rapid antibody-based screening approach to measure recombination provides a powerful method to determine relative efficiency of genome editing for modeling polyglutamine diseases or understanding factors that modulate CRISPR/Cas9 HR.
The cryo-electron microscopy structure of huntingtin

NASA Astrophysics Data System (ADS)

Guo, Qiang; Huang, Bin; Cheng, Jingdong; Seefelder, Manuel; Engler, Tatjana; Pfeifer, Günter; Oeckl, Patrick; Otto, Markus; Moser, Franziska; Maurer, Melanie; Pautsch, Alexander; Baumeister, Wolfgang; Fernández-Busnadiego, Rubén; Kochanek, Stefan

2018-03-01

Huntingtin (HTT) is a large (348 kDa) protein that is essential for embryonic development and is involved in diverse cellular activities such as vesicular transport, endocytosis, autophagy and the regulation of transcription. Although an integrative understanding of the biological functions of HTT is lacking, the large number of identified HTT interactors suggests that it serves as a protein-protein interaction hub. Furthermore, Huntington’s disease is caused by a mutation in the HTT gene, resulting in a pathogenic expansion of a polyglutamine repeat at the amino terminus of HTT. However, only limited structural information regarding HTT is currently available. Here we use cryo-electron microscopy to determine the structure of full-length human HTT in a complex with HTT-associated protein 40 (HAP40; encoded by three F8A genes in humans) to an overall resolution of 4 Å. HTT is largely α-helical and consists of three major domains. The amino- and carboxy-terminal domains contain multiple HEAT (huntingtin, elongation factor 3, protein phosphatase 2A and lipid kinase TOR) repeats arranged in a solenoid fashion. These domains are connected by a smaller bridge domain containing different types of tandem repeats. HAP40 is also largely α-helical and has a tetratricopeptide repeat-like organization. HAP40 binds in a cleft and contacts the three HTT domains by hydrophobic and electrostatic interactions, thereby stabilizing the conformation of HTT. These data rationalize previous biochemical results and pave the way for improved understanding of the diverse cellular functions of HTT.
Bifunctional Anti-Huntingtin Proteasome-Directed Intrabodies Mediate Efficient Degradation of Mutant Huntingtin Exon 1 Protein Fragments

PubMed Central

Butler, David C.; Messer, Anne

2011-01-01

Huntington's disease (HD) is a fatal autosomal dominant neurodegenerative disorder caused by a trinucleotide (CAG)n repeat expansion in the coding sequence of the huntingtin gene, and an expanded polyglutamine (>37Q) tract in the protein. This results in misfolding and accumulation of huntingtin protein (htt), formation of neuronal intranuclear and cytoplasmic inclusions, and neuronal dysfunction/degeneration. Single-chain Fv antibodies (scFvs), expressed as intrabodies that bind htt and prevent aggregation, show promise as immunotherapeutics for HD. Intrastriatal delivery of anti-N-terminal htt scFv-C4 using an adeno-associated virus vector (AAV2/1) significantly reduces the size and number of aggregates in HD R6/1 transgenic mice; however, this protective effect diminishes with age and time after injection. We therefore explored enhancing intrabody efficacy via fusions to heterologous functional domains. Proteins containing a PEST motif are often targeted for proteasomal degradation and generally have a short half-life. In ST14A cells, fusion of the C-terminal PEST region of mouse ornithine decarboxylase (mODC) to scFv-C4 reduces htt exon 1 protein fragments with 72 glutamine repeats (httex1-72Q) by ∼80–90% when compared to scFv-C4 alone. Proteasomal targeting was verified by either scrambling the mODC-PEST motif, or via proteasomal inhibition with epoxomicin. For these constructs, the proteasomal degradation of the scFv intrabody proteins themselves was reduced <25% by the addition of the mODC-PEST motif, with or without antigens. The remaining intrabody levels were amply sufficient to target N-terminal httex1-72Q protein fragment turnover. Critically, scFv-C4-PEST prevents aggregation and toxicity of httex1-72Q fragments at significantly lower doses than scFv-C4. Fusion of the mODC-PEST motif to intrabodies is a valuable general approach to specifically target toxic antigens to the proteasome for degradation. PMID:22216210
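
The relationship between a (CAG)n tract in a coding sequence and the length of the polyglutamine stretch it encodes can be illustrated with a short sketch. This is only an illustration and is not part of the cited study: the function names `longest_polyq_tract` and `is_expanded` are hypothetical, the sequence is assumed to be read in frame from its first base, and both CAG and CAA are counted because both codons encode glutamine. The default threshold of 37 is taken from the ">37Q" figure quoted in the abstract above and has no diagnostic meaning here.

```python
def longest_polyq_tract(cds):
    """Length, in codons, of the longest uninterrupted run of glutamine
    codons (CAG or CAA), reading the sequence in frame from position 0."""
    cds = cds.upper()
    # Split into complete codons; a trailing partial codon is ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    best = run = 0
    for codon in codons:
        run = run + 1 if codon in ("CAG", "CAA") else 0
        best = max(best, run)
    return best


def is_expanded(cds, threshold=37):
    """True if the longest glutamine run exceeds `threshold` residues.
    The default of 37 is an assumption borrowed from the abstract."""
    return longest_polyq_tract(cds) > threshold
```

For example, `longest_polyq_tract("ATG" + "CAG" * 72 + "TAA")` counts 72 glutamine codons, matching the 72Q length of the httex1-72Q fragment mentioned above.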
Structure-function analysis of mouse Sry reveals dual essential roles of the C-terminal polyglutamine tract in sex determination.

PubMed

Zhao, Liang; Ng, Ee Ting; Davidson, Tara-Lynne; Longmuss, Enya; Urschitz, Johann; Elston, Marlee; Moisyadi, Stefan; Bowles, Josephine; Koopman, Peter

2014-08-12

The mammalian sex-determining factor SRY comprises a conserved high-mobility group (HMG) box DNA-binding domain and poorly conserved regions outside the HMG box. Mouse Sry is unusual in that it includes a C-terminal polyglutamine (polyQ) tract that is absent in nonrodent SRY proteins, and yet, paradoxically, is essential for male sex determination. To dissect the molecular functions of this domain, we generated a series of Sry mutants, and studied their biochemical properties in cell lines and transgenic mouse embryos. Sry protein lacking the polyQ domain was unstable, due to proteasomal degradation. Replacing this domain with irrelevant sequences stabilized the protein but failed to restore Sry's ability to up-regulate its key target gene SRY-box 9 (Sox9) and its sex-determining function in vivo. These functions were restored only when a VP16 transactivation domain was substituted. We conclude that the polyQ domain has important roles in protein stabilization and transcriptional activation, both of which are essential for male sex determination in mice. Our data disprove the hypothesis that the conserved HMG box domain is the only functional domain of Sry, and highlight an evolutionary paradox whereby mouse Sry has evolved a novel bifunctional module to activate Sox9 directly, whereas SRY proteins in other taxa, including humans, seem to lack this ability, presumably making them dependent on partner proteins(s) to provide this function.
Activation of IGF-1 and insulin signaling pathways ameliorate mitochondrial function and energy metabolism in Huntington's Disease human lymphoblasts.

PubMed

Naia, Luana; Ferreira, I Luísa; Cunha-Oliveira, Teresa; Duarte, Ana I; Ribeiro, Márcio; Rosenstock, Tatiana R; Laço, Mário N; Ribeiro, Maria J; Oliveira, Catarina R; Saudou, Frédéric; Humbert, Sandrine; Rego, A Cristina

2015-02-01

Huntington's disease (HD) is an inherited neurodegenerative disease caused by a polyglutamine repeat expansion in the huntingtin protein. Mitochondrial dysfunction associated with energy failure plays an important role in this untreated pathology. In the present work, we used lymphoblasts obtained from HD patients or unaffected parentally related individuals to study the protective role of insulin-like growth factor 1 (IGF-1) versus insulin (at low nM) on signaling and metabolic and mitochondrial functions. Deregulation of intracellular signaling pathways linked to activation of insulin and IGF-1 receptors (IR, IGF-1R), Akt, and ERK was largely restored by IGF-1 and, to a lesser extent, by insulin in HD human lymphoblasts. Importantly, both neurotrophic factors stimulated huntingtin phosphorylation at Ser421 in HD cells. IGF-1 and insulin also rescued energy levels in HD peripheral cells, as evaluated by increased ATP and phosphocreatine, and decreased lactate levels. Moreover, IGF-1 effectively ameliorated O2 consumption and mitochondrial membrane potential (Δψm) in HD lymphoblasts, which occurred concomitantly with increased levels of cytochrome c. Indeed, constitutive phosphorylation of huntingtin was able to restore the Δψm in lymphoblasts expressing an abnormal expansion of polyglutamines. HD lymphoblasts further exhibited increased intracellular Ca2+ levels before and after exposure to hydrogen peroxide (H2O2), and decreased mitochondrial Ca2+ accumulation, the latter being recovered by IGF-1 and insulin in HD lymphoblasts pre-exposed to H2O2. In summary, the data support an important role for IR/IGF-1R mediated activation of signaling pathways and improved mitochondrial and metabolic function in HD human lymphoblasts.
Sleep disorders in spinocerebellar ataxia type 2 patients.

PubMed

Velázquez-Pérez, Luis; Voss, Ursula; Rodríguez-Labrada, Roberto; Auburger, Georg; Canales Ochoa, Nalia; Sánchez Cruz, Gilberto; Galicia Polo, Lourdes; Haro Valencia, Reyes; Aguilera Rodríguez, Raúl; Medrano Montero, Jacqueline; Laffita Mesa, Jose M; Tuin, Inka

2011-01-01

Sleep disturbances are common features in spinocerebellar ataxias (SCAs). Nevertheless, sleep data on SCA2 come from scarce studies including few patients, limiting the evaluation of the prevalence and determinants of sleep disorders. The aim of this study was to assess the frequency and possible determinants of sleep disorders in the large and homogeneous SCA2 Cuban population. Thirty-two SCA2 patients and their age- and sex-matched controls were studied by video-polysomnography and sleep interviews. The most striking video-polysomnography features were rapid eye movement (REM) sleep pathology and periodic leg movements (PLMs). REM sleep abnormalities included a consistent reduction of the REM sleep percentage and REM density as well as an increase in REM sleep without atonia (RWA). REM sleep and REM density decreases were closely related to the increase in ataxia scores, whereas the RWA percentage was influenced by the cytosine-adenine-guanine (CAG) repeats. PLMs were observed in 37.5% of cases. The PLM index showed a significant association with the ataxia score and disease duration but not with CAG repeats. REM sleep pathology and PLMs are closely related to SCA2 severity, suggesting their usefulness as disease progression markers. The RWA percentage is influenced by the CAG repeats and might thus be a sensitive parameter for reflecting polyglutamine toxicity. Finally, as PLMs are sensitive to drug treatment, they represent a new therapeutic target for the symptomatic treatment of SCA2. Copyright © 2011 S. Karger AG, Basel.
Spinocerebellar ataxia type 6.

PubMed

Solodkin, Ana; Gomez, Christopher M

2012-01-01

The autosomal dominant spinocerebellar ataxias (SCA) are a genetically heterogeneous group of neurodegenerative disorders characterized by progressive motor incoordination, in some cases with ataxia alone and in others in association with additional progressive neurological deficits. Spinocerebellar ataxia type 6 (SCA6) is the prototype of a pure cerebellar ataxia, associated with a severe form of progressive ataxia and cerebellar dysfunction. SCA6, originally classified as such by Zhuchenko et al. (1997), is caused by a CAG repeat expansion in the CACNA1A gene which encodes the α1A subunit of the P/Q-type voltage-gated calcium channel. SCA6 is one of ten polyglutamine-encoding CAG nucleotide repeat expansion disorders comprising other neurodegenerative disorders such as Huntington's disease. The present review describes clinical, genetic, and pathological manifestations associated with this illness. Currently, there is no treatment for this neurodegenerative disease. Successful therapeutic strategies must target a valid pathological mechanism; thus, understanding the underlying mechanisms of disease is crucial to finding a proper treatment. Hence, this chapter will discuss as well the molecular mechanisms possibly associated with SCA6 pathology and their implication for the development of future treatment. 2012 Elsevier B.V. All rights reserved.
Molecular mechanism of Spinocerebellar Ataxia type 6: glutamine repeat disorder, channelopathy and transcriptional dysregulation. The multifaceted aspects of a single mutation

PubMed Central

Giunti, Paola; Mantuano, Elide; Frontali, Marina; Veneziano, Liana

2015-01-01

Spinocerebellar Ataxia type 6 (SCA6) is an autosomal dominant neurodegenerative disease characterized by late onset, slowly progressive, mostly pure cerebellar ataxia. It is one of three allelic disorders associated with the CACNA1A gene, which codes for the Alpha1 A subunit of the P/Q type calcium channel Cav2.1 expressed in the brain, particularly in the cerebellum. The other two disorders are Episodic Ataxia type 2 (EA2) and Familial Hemiplegic Migraine type 1 (FHM1). These disorders show distinct phenotypes that often overlap but have different pathogenic mechanisms. EA2 and FHM1 are due to mutations causing, respectively, a loss and a gain of channel function. SCA6, instead, is associated with short expansions of a polyglutamine stretch located in the cytoplasmic C-terminal tail of the protein. This domain has a relevant role in channel regulation, as well as in transcription regulation of other neuronal genes; thus the SCA6 CAG repeat expansion results in complex pathogenic molecular mechanisms reflecting the complex Cav2.1 C-terminus activity. We provide a short review to update the SCA6 molecular mechanism. PMID:25762895
Cellular Models: HD Patient-Derived Pluripotent Stem Cells.

PubMed

Geater, Charlene; Hernandez, Sarah; Thompson, Leslie; Mattis, Virginia B

2018-01-01

Huntington's disease (HD) is an autosomal dominant neurodegenerative disorder caused by expanded polyglutamine (polyQ)-encoding repeats in the Huntingtin (HTT) gene. Traditionally, HD cellular models consisted of either patient cells not affected by disease or rodent neurons expressing expanded polyQ repeats in HTT. As these models can be limited in their disease manifestation or proper genetic context, respectively, human HD pluripotent stem cells (PSCs) are currently under investigation as a way to model disease in patient-derived neurons and other neural cell types. This chapter reviews embryonic stem cell (ESC) and induced pluripotent stem cell (iPSC) models of disease, including published differentiation paradigms for neurons and their associated phenotypes, as well as current challenges to the field such as validation of the PSCs and PSC-derived cells. Highlighted are potential future technical advances to HD PSC modeling, including transdifferentiation, complex in vitro multiorgan/system reconstruction, and personalized medicine. Using a human HD patient model of the central nervous system, hopefully one day researchers can tease out the consequences of mutant HTT (mHTT) expression on specific cell types within the brain in order to identify and test novel therapies for disease.
Polyglutamine aggregation in Huntington and related diseases.

PubMed

Polling, Saskia; Hill, Andrew F; Hatters, Danny M

2012-01-01

Polyglutamine (polyQ)-expansions in different proteins cause nine neurodegenerative diseases. While polyQ aggregation is a key pathological hallmark of these diseases, how aggregation relates to pathogenesis remains contentious. In this chapter, we review what is known about the aggregation process and how cells respond and interact with the polyQ-expanded proteins. We cover detailed biophysical and structural studies to uncover the intrinsic features of polyQ aggregates and concomitant effects in the cellular environment. We also examine the functional consequences of polyQ aggregation and how cells may attempt to intervene and guide the aggregation process.
[MicroRNA in neurodegenerative disorders].

PubMed

Sobue, Gen

2013-01-01

MicroRNAs (miRNAs) bind to the 3'-untranslated region of mRNA and thereby suppress gene expression. Recent studies suggest that miRNAs modify the pathogenesis of cancer and neurodegeneration. Our study demonstrated that the expression level of miR-196a is increased in a mouse model of spinal and bulbar muscular atrophy (SBMA), a neurodegenerative disease caused by the expansion of polyglutamine in the androgen receptor (AR). In cultured neuronal cells, miR-196a enhanced the decay of the mutant AR mRNA by silencing CUG triplet repeat RNA binding protein 2, a target of miR-196a that contributes to stabilizing the mutant AR mRNA. Adeno-associated virus vector-mediated delivery of this miRNA attenuates the expression of the mutant AR, resulting in the mitigation of motor neuron degeneration in the SBMA mice. Introduction of miRNA appears to be a novel therapeutic strategy for devastating neurodegenerative diseases.
Short poly-glutamine repeat in the androgen receptor in New World monkeys.

PubMed

Hiramatsu, Chihiro; Paukner, Annika; Kuroshima, Hika; Fujita, Kazuo; Suomi, Stephen J; Inoue-Murayama, Miho

2017-12-01

The androgen receptor mediates various physiological and developmental functions and is highly conserved in mammals. Although great intraspecific length polymorphisms in the polyglutamine (poly-Q) and polyglycine (poly-G) regions of the androgen receptor in humans, apes and several Old World monkeys have been reported, little is known about the characteristics of these regions in New World monkeys. In this study, we surveyed 17 species of New World monkeys and found length polymorphisms in these regions in three species (common squirrel monkeys, tufted capuchin monkeys and owl monkeys). We found that the poly-Q region in New World monkeys is shorter than that in catarrhines (humans, apes and Old World monkeys). In addition, we observed that codon usage for the poly-G region in New World monkeys is unique among primates. These results suggest that the length of the polymorphic regions in androgen receptor genes has evolved uniquely in New World monkeys.
Sequence repeats and protein structure

NASA Astrophysics Data System (ADS)

Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos

2012-11-01

Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.

Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
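The di- to penta-nucleotide scan described in this record can be illustrated with a minimal sketch. It is not the SPUTNIK program or its scoring scheme; it only finds perfect tandem repeats with a regular expression. The function name `find_ssrs` and the `min_copies` threshold are assumptions made for the illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=5, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Illustrative only: units of 2-5 nt, no mismatches allowed.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # ([ACGT]{k}) captures one unit; \1{n,} demands at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves built from a shorter motif
            # (for example "AA" or "CGCG"), so each repeat is reported once.
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // k))
    return sorted(hits)

print(find_ssrs("TTCGCGCGCGAATTAGGAGGAGGAGGTT"))
```

Running the same function separately on coding and non-coding regions would give the kind of per-region counts the abstract compares, assuming those regions have already been extracted from an annotation.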
[Mutation Analysis of 19 STR Loci in 20 723 Cases of Paternity Testing].

PubMed

Bi, J; Chang, J J; Li, M X; Yu, C Y

2017-06-01

To observe and analyze the confirmed cases of paternity testing, and to explore the mutation rules of STR loci. The mutant STR loci were screened from 20 723 confirmed cases of paternity testing by the Goldeneye 20A system. The mutation rates, and the sources, fragment length, steps and increased or decreased repeat sequences of mutant alleles were counted for the analysis of the characteristics of mutation-related factors. A total of 548 mutations were found on 19 STR loci, and 557 mutation events were observed. The loci mutation rate was 0.07‰-2.23‰. The ratio of paternal to maternal mutant events was 3.06:1. One-step mutations were the main type, and the number of increased repeat sequences was almost the same as that of decreased repeat sequences. The repeat sequences were more likely to decrease in mutations of two or more steps. Mutation mainly occurred in medium alleles, and the number of increased repeat sequences was almost the same as that of decreased repeat sequences. In long allele mutations, the decreased repeat sequences were significantly more numerous than the increased repeat sequences. The number of increased repeat sequences was almost the same as that of decreased repeat sequences in paternal mutations, while the decreased repeat sequences outnumbered the increased ones in maternal mutations. There are significant differences in the mutation rate of each locus. When one or two loci do not conform to the genetic law, an additional detection system should be used, and the PI value should be calculated in combination with the information on the mutant STR loci in order to further clarify the identification opinions. Copyright© by the Editorial Department of Journal of Forensic Medicine
CAMELOT: A machine learning approach for coarse-grained simulations of aggregation of block-copolymeric protein sequences

PubMed Central

Ruff, Kiersten M.; Harmon, Tyler S.; Pappu, Rohit V.

2015-01-01

We report the development and deployment of a coarse-graining method that is well suited for computer simulations of aggregation and phase separation of protein sequences with block-copolymeric architectures. Our algorithm, named CAMELOT for Coarse-grained simulations Aided by MachinE Learning Optimization and Training, leverages information from converged all-atom simulations that is used to determine a suitable resolution and parameterize the coarse-grained model. To parameterize a system-specific coarse-grained model, we use a combination of Boltzmann inversion, non-linear regression, and a Gaussian process Bayesian optimization approach. The accuracy of the coarse-grained model is demonstrated through direct comparisons to results from all-atom simulations. We demonstrate the utility of our coarse-graining approach using the block-copolymeric sequence from the exon 1 encoded sequence of the huntingtin protein. This sequence comprises 17 residues from the N-terminal end of huntingtin (N17) followed by a polyglutamine (polyQ) tract. Simulations based on the CAMELOT approach are used to show that the adsorption and unfolding of the wild type N17 and its sequence variants on the surface of polyQ tracts engender a patchy colloid-like architecture that promotes the formation of linear aggregates. These results provide a plausible explanation for experimental observations, which show that N17 accelerates the formation of linear aggregates in block-copolymeric N17-polyQ sequences. The CAMELOT approach is versatile and is generalizable for simulating the aggregation and phase behavior of a range of block-copolymeric protein sequences. PMID:26723608
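Of the three ingredients named in this record, Boltzmann inversion is the simplest to illustrate: a sampled distribution P(x) of some coordinate is converted into an effective potential U(x) = -kT ln P(x). The sketch below is a generic textbook version, not the CAMELOT code; the function name, the binning and the temperature are assumptions. For a radial distance one would normally also divide the histogram by the r^2 volume factor, which is omitted here.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def boltzmann_invert(samples, bin_edges, temperature=298.0):
    """Histogram `samples` and return (bin_centers, U), with U = -kT ln P.

    U is shifted so that its minimum is zero. Empty bins are skipped,
    because their energy is undefined (formally infinite).
    """
    counts = [0] * (len(bin_edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            if bin_edges[i] <= x < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside the bin range")
    kT = K_B * temperature
    centers, energies = [], []
    for i, c in enumerate(counts):
        if c == 0:
            continue
        centers.append(0.5 * (bin_edges[i] + bin_edges[i + 1]))
        energies.append(-kT * math.log(c / total))
    u_min = min(energies)
    return centers, [u - u_min for u in energies]

# Toy data: a coordinate sampled three times in one bin and once in the next.
centers, potential = boltzmann_invert([1.5, 1.5, 1.5, 2.5], [1.0, 2.0, 3.0])
print(centers, potential)
```

In the toy call the second bin is three times less populated than the first, so its energy sits kT ln 3 above the minimum.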
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Huntingtin processing in pathogenesis of Huntington disease.

PubMed

Qin, Zheng-Hong; Gu, Zhen-Lun

2004-10-01

Huntington's disease (HD) is caused by an expansion of the polyglutamine tract in the protein named huntingtin. The expansion of the polyglutamine tract induces selective degeneration of striatal projection neurons and cortical pyramidal neurons. The bio-hallmark of HD is the formation of intranuclear inclusions and cytoplasmic aggregates in association with other cellular proteins in vulnerable neurons. Accumulation of N-terminal mutant huntingtin in HD brains is prominent. These pathological features are related to protein misfolding and impairments in protein processing and degradation in neurons. This review focused on the role of proteases in huntingtin cleavage and degradation and the contribution of altered processing of mutant huntingtin to HD pathogenesis. Copyright 2004 Acta Pharmacologica Sinica
Suppression of polyglutamine toxicity by a Drosophila homolog of myeloid leukemia factor 1.

PubMed

Kazemi-Esfarjani, Parsa; Benzer, Seymour

2002-10-01

The toxicity of an abnormally long polyglutamine [poly(Q)] tract within specific proteins is the molecular lesion shared by Huntington's disease (HD) and several other hereditary neurodegenerative disorders. By a genetic screen in Drosophila, devised to uncover genes that suppress poly(Q) toxicity, we discovered a Drosophila homolog of human myeloid leukemia factor 1 (MLF1). Expression of the Drosophila homolog (dMLF) ameliorates the toxicity of poly(Q) expressed in the eye and central nervous system. In the retina, whether endogenously or ectopically expressed, dMLF co-localized with aggregates, suggesting that dMLF alone, or through an intermediary molecular partner, may suppress toxicity by sequestering poly(Q) and/or its aggregates.
Sequences characterization of microsatellite DNA sequences in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
The role of polyglutamine expansion and protein context in disease-related huntingtin/lipid interactions

NASA Astrophysics Data System (ADS)

Burke, Kathleen Anne

Huntington's Disease (HD) is a neurodegenerative disorder that is defined by the accumulation of nanoscale aggregates comprised of the huntingtin (htt) protein. Aggregation is directly caused by an expanded polyglutamine (polyQ) domain in htt, leading to a diverse population of aggregate species, such as oligomers, fibrils, and annular aggregates. Furthermore, the length of this polyQ domain is directly related to onset and severity of disease. The first 17 amino acids on the N-terminus (N17) and the polyproline domain on the C-terminal side of the polyQ domain have been shown to further modulate the aggregation process. Additionally, N17 appears to have lipid binding properties as htt interacts with a variety of membrane-containing structures present in cells, such as organelles, and interactions with these membrane surfaces may further modulate htt aggregation. To investigate the interaction between htt exon1 and lipid bilayers, in situ atomic force microscopy (AFM) was used to directly monitor the aggregation of htt exon1 constructs with varying Q-length (35Q, 46Q, 51Q, and myc-53Q) or synthetic peptides with different polyQ domain flanking sequences (KK-Q35-KK, KK-Q35-P10-KK, N17-Q35-KK, and N17-Q35-P10-KK) on supported lipid membranes comprised of total brain lipid extract. The exon1 fragments accumulated on the lipid membranes, causing disruption of the membrane, in a polyQ dependent manner. By adding N-terminal tags to the htt exon1 fragments, the interaction with the lipid bilayer was impeded. The KK-Q35-KK and KK-Q35-P10-KK peptides had no appreciable interaction with lipid bilayers. Interestingly, polyQ peptides with the N17 flanking sequence interacted with the bilayer. N17-Q35-KK formed discrete aggregates on the bilayer, but there was minimal membrane disruption. The N17-Q35-P10-KK peptide interacted more aggressively with the lipid bilayer in a manner reminiscent of the htt exon1 proteins.
EMQN/CMGS best practice guidelines for the molecular genetic testing of Huntington disease.

PubMed

Losekoot, Monique; van Belzen, Martine J; Seneca, Sara; Bauer, Peter; Stenhouse, Susan A R; Barton, David E

2013-05-01

Huntington disease (HD) is caused by the expansion of an unstable polymorphic trinucleotide (CAG)n repeat in exon 1 of the HTT gene, which translates into an extended polyglutamine tract in the protein. Laboratory diagnosis of HD involves estimation of the number of CAG repeats. Molecular genetic testing for HD is offered in a wide range of laboratories both within and outside the European community. In order to measure the quality and raise the standard of molecular genetic testing in these laboratories, the European Molecular Genetics Quality Network has organized a yearly external quality assessment (EQA) scheme for molecular genetic testing of HD for over 10 years. EQA compares a laboratory's output with a fixed standard both for genotyping and reporting of the results to the referring physicians. In general, the standard of genotyping is very high but the clarity of interpretation and reporting of the test result varies more widely. This emphasizes the need for best practice guidelines for this disorder. We have therefore developed these best practice guidelines for genetic testing for HD to assist in testing and reporting of results. The analytical methods and the potential pitfalls of molecular genetic testing are highlighted and the implications of the different test outcomes for the consultand and his or her family members are discussed.
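The quantity that the testing estimates, the number of CAG units, can be illustrated on a sequence string. The sketch below simply reports the longest uninterrupted CAG run in a DNA string; it is not the laboratory sizing procedure covered by the guidelines, and the function name and toy sequence are assumptions. It does not enforce a reading frame, and any interruption (for example a CAA codon) ends the run.

```python
import re

def longest_cag_tract(dna):
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)

# Toy sequence: five CAG units, then CAA CAG, then unrelated flanking bases.
toy = "GCC" + "CAG" * 5 + "CAACAG" + "CCG"
print(longest_cag_tract(toy))
```

Whether interrupting codons should be counted towards the reported repeat number is a reporting decision that this sketch deliberately leaves out.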
The phasor-FLIM fingerprints reveal shifts from OXPHOS to enhanced glycolysis in Huntington Disease

PubMed Central

Sameni, Sara; Syed, Adeela; Marsh, J. Lawrence; Digman, Michelle A.

2016-01-01

Huntington disease (HD) is an autosomal neurodegenerative disorder caused by the expansion of Polyglutamine (polyQ) in exon 1 of the Huntingtin protein. Glutamine repeats below 36 are considered normal while repeats above 40 lead to HD. Impairment in energy metabolism is a common trend in Huntington pathogenesis; however, this effect is not fully understood. Here, we used the phasor approach and Fluorescence Lifetime Imaging Microscopy (FLIM) to measure changes between free and bound fractions of NADH as an indirect measure of metabolic alteration in living cells. Using Phasor-FLIM, pixel maps of metabolic alteration in HEK293 cell lines and in transgenic Drosophila expressing expanded and unexpanded polyQ HTT exon1 in the eye disc were developed. We found a significant shift towards increased free NADH, indicating an increased glycolytic state for cells and tissues expressing the expanded polyQ compared to unexpanded control. In the nucleus, a further lifetime shift occurs towards higher free NADH suggesting a possible synergism between metabolic dysfunction and transcriptional regulation. Our results indicate that metabolic dysfunction in HD shifts to increased glycolysis leading to oxidative stress and cell death. This powerful label-free method can be used to screen native HD tissue samples and for potential drug screening. PMID:27713486
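The phasor approach mentioned in this record maps each fluorescence decay to a point (g, s). For a mono-exponential decay with lifetime tau measured at angular frequency omega, g = 1/(1 + (omega tau)^2) and s = omega tau/(1 + (omega tau)^2), and a mixture of two species falls on the straight line between their phasors. The sketch below illustrates that relationship only; it is not the authors' analysis pipeline. The 80 MHz frequency and the free and bound NADH lifetimes (0.4 ns and 3.4 ns) are assumed illustrative values, not figures from the abstract.

```python
import math

def phasor_single_exp(tau_ns, freq_mhz=80.0):
    """Phasor coordinates (g, s) of a mono-exponential decay."""
    wt = 2.0 * math.pi * freq_mhz * 1e6 * tau_ns * 1e-9  # omega * tau, dimensionless
    return 1.0 / (1.0 + wt * wt), wt / (1.0 + wt * wt)

def phasor_mixture(f_free, tau_free_ns=0.4, tau_bound_ns=3.4, freq_mhz=80.0):
    """Phasor of a two-species mixture.

    f_free is the intensity fraction contributed by the short-lifetime
    (free) species; the lifetimes are assumed example values.
    """
    g1, s1 = phasor_single_exp(tau_free_ns, freq_mhz)
    g2, s2 = phasor_single_exp(tau_bound_ns, freq_mhz)
    return (f_free * g1 + (1.0 - f_free) * g2,
            f_free * s1 + (1.0 - f_free) * s2)

# A larger free fraction moves the phasor towards the short-lifetime end (larger g).
for fraction in (0.2, 0.5, 0.8):
    print(fraction, phasor_mixture(fraction))
```

Under these assumptions, a shift of a pixel's phasor towards larger g along that line corresponds to the "increased free NADH" reading described in the abstract.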
Conformational Switching in PolyGln Amyloid Fibrils Resulting from a Single Amino Acid Insertion

PubMed Central

Huang, Rick K.; Baxa, Ulrich; Aldrian, Gudrun; Ahmed, Abdullah B.; Wall, Joseph S.; Mizuno, Naoko; Antzutkin, Oleg; Steven, Alasdair C.; Kajava, Andrey V.

2014-01-01

The established correlation between neurodegenerative disorders and intracerebral deposition of polyglutamine aggregates motivates attempts to better understand their fibrillar structure. We designed polyglutamines with a few lysines inserted to overcome the hindrance of extreme insolubility and two D-lysines to limit the lengths of β-strands. One is 33 amino acids long (PolyQKd-33) and the other has one fewer glutamine (PolyQKd-32). Both form well-dispersed fibrils suitable for analysis by electron microscopy. Electron diffraction confirmed cross-β structures in both fibrils. Remarkably, the deletion of just one glutamine residue from the middle of the peptide leads to substantially different amyloid structures. PolyQKd-32 fibrils are consistently 10–20% wider than PolyQKd-33, as measured by negative staining, cryo-electron microscopy, and scanning transmission electron microscopy. Scanning transmission electron microscopy analysis revealed that the PolyQKd-32 fibrils have 50% higher mass-per-length than PolyQKd-33. This distinction can be explained by a superpleated β-structure model for PolyQKd-33 and a model with two β-solenoid protofibrils for PolyQKd-32. These data provide evidence for β-arch-containing structures in polyglutamine fibrils and open future possibilities for structure-based drug design. PMID:24853742
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

PubMed

Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

2013-01-30

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

PubMed Central

2013-01-01

Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

PubMed Central

Glunčić, Matko; Paar, Vladimir

2013-01-01

The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of the complete set of a K-string ensemble, which enables a new method of direct mapping of symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of Period 3 pattern in exons, implementation of ‘magnifying glass’ effect, identification of complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). PMID:22977183
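The idea of reading repeat periods off as peaks can be illustrated with a much simpler stand-in than GRM itself: a histogram of the distances between consecutive occurrences of each k-mer. This is explicitly not the GRM algorithm or its K-string ensemble mapping, whose details are not given in the abstract; the function name and the choice k=4 are assumptions for the illustration.

```python
from collections import Counter

def kmer_spacing_histogram(seq, k=4):
    """Count distances between consecutive occurrences of every k-mer.

    A tandem repeat with period p contributes many counts at distance p,
    so p shows up as a peak in the returned histogram.
    """
    last_seen = {}
    spacing = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spacing[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spacing

# Toy tandem array: six copies of an 8-nt unit, so the peak is at distance 8.
histogram = kmer_spacing_histogram("ACGTTGCA" * 6)
print(histogram.most_common(3))
```

Unlike the method in the record, this exact-match version makes no attempt to tolerate substitutions or insertions/deletions; mutated copies would only weaken the peak.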
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of potential functional sites varied markedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
Brain pathology of spinocerebellar ataxias.

PubMed

Seidel, Kay; Siswanto, Sonny; Brunt, Ewout R P; den Dunnen, Wilfred; Korf, Horst-Werner; Rüb, Udo

2012-07-01

The autosomal dominant cerebellar ataxias (ADCAs) represent a heterogeneous group of neurodegenerative diseases with progressive ataxia and cerebellar degeneration. The current classification of this disease group is based on the underlying genetic defects and their typical disease courses. According to this categorization, ADCAs are divided into the spinocerebellar ataxias (SCAs) with a progressive disease course, and the episodic ataxias (EA) with episodic occurrences of ataxia. The prominent disease symptoms of the currently known and genetically defined 31 SCA types result from damage to the cerebellum and interconnected brain grays and are often accompanied by more specific extra-cerebellar symptoms. In the present review, we report the genetic and clinical background of the known SCAs and present the state of neuropathological investigations of brain tissue from SCA patients in the final disease stages. Recent findings show that the brain is commonly seriously affected in the polyglutamine SCAs (i.e. SCA1, SCA2, SCA3, SCA6, SCA7, and SCA17) and that the patterns of brain damage in these diseases overlap considerably in patients suffering from advanced disease stages. In the more rarely occurring non-polyglutamine SCAs, post-mortem neuropathological data currently are scanty and investigations have been primarily performed in vivo by means of MRI brain imaging. Only a minority of SCAs exhibit symptoms and degenerative patterns allowing for a clear and unambiguous diagnosis of the disease, e.g. retinal degeneration in SCA7, tau aggregation in SCA11, dentate calcification in SCA20, protein depositions in the Purkinje cell layer in SCA31, azoospermia in SCA32, and neurocutaneous phenotype in SCA34. The disease proteins of polyglutamine ataxias and some non-polyglutamine ataxias aggregate as cytoplasmic or intranuclear inclusions and serve as morphological markers. Although inclusions may impair axonal transport, bind transcription factors, and block protein quality control, detailed molecular and pathogenetic consequences remain to be determined.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
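The component sizes given in this record can be checked against the stated 7 kb total with simple arithmetic. The sketch below sums the three repeat sets and the two unique segments; the variable names are assumptions, the copy number of the 102 ntp repeat is only approximate in the source, and the flanking tRNA genes are left out because their lengths are not given.

```python
# (copies, unit length in ntp) for each tandem repeat set, as listed in the record.
repeat_sets = {
    "102 ntp repeat": (36, 102),  # "approximately 36 copies" in the source
    "63 ntp repeat": (11, 63),
    "8 ntp repeat": (5, 8),
}
unique_segments = [1528, 1068]  # the two unique segments, in ntp

def segment_length(repeat_sets, unique_segments):
    """Total length of repeat arrays plus unique segments, in ntp."""
    repeats = sum(copies * unit for copies, unit in repeat_sets.values())
    return repeats + sum(unique_segments)

total = segment_length(repeat_sets, unique_segments)
print(total)  # 3672 + 693 + 40 + 1528 + 1068
```

The sum comes to 7,001 ntp, which agrees with the description of the region as a 7 kb segment.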
HD CAG-correlated gene expression changes support a simple dominant gain of function

PubMed Central

Jacobsen, Jessie C.; Gregory, Gillian C.; Woda, Juliana M.; Thompson, Morgan N.; Coser, Kathryn R.; Murthy, Vidya; Kohane, Isaac S.; Gusella, James F.; Seong, Ihn Sik; MacDonald, Marcy E.; Shioda, Toshi; Lee, Jong-Min

2011-01-01

Huntington's disease is initiated by the expression of a CAG repeat-encoded polyglutamine region in full-length huntingtin, with dominant effects that vary continuously with CAG size. The mechanism could involve a simple gain of function or a more complex gain of function coupled to a loss of function (e.g. dominant negative-graded loss of function). To distinguish these alternatives, we compared genome-wide gene expression changes correlated with CAG size across an allelic series of heterozygous CAG knock-in mouse embryonic stem (ES) cell lines (HdhQ20/7, HdhQ50/7, HdhQ91/7, HdhQ111/7), to genes differentially expressed between Hdhex4/5/ex4/5 huntingtin null and wild-type (HdhQ7/7) parental ES cells. The set of 73 genes whose expression varied continuously with CAG length had minimal overlap with the 754-member huntingtin-null gene set but the two were not completely unconnected. Rather, the 172 CAG length-correlated pathways and 238 huntingtin-null significant pathways clustered into 13 shared categories at the network level. A closer examination of the energy metabolism and the lipid/sterol/lipoprotein metabolism categories revealed that CAG length-correlated genes and huntingtin-null-altered genes either were different members of the same pathways or were in unique, but interconnected pathways. Thus, varying the polyglutamine size in full-length huntingtin produced gene expression changes that were distinct from, but related to, the effects of lack of huntingtin. These findings support a simple gain-of-function mechanism acting through a property of the full-length huntingtin protein and point to CAG-correlative approaches to discover its effects. Moreover, for therapeutic strategies based on huntingtin suppression, our data highlight processes that may be more sensitive to the disease trigger than to decreased huntingtin levels. PMID:21536587

A Novel mouse model of enhanced proteostasis: Full-length human heat shock factor 1 transgenic mice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pierce, Anson, E-mail: piercea2@uthscsa.edu; Barshop Institute for Longevity and Aging Studies, The University of Texas Health Science Center at San Antonio, San Antonio, Texas, 78229; The Department of Veteran's Affairs, South Texas Veterans Health Care System, San Antonio, Texas, 78284

2010-11-05

Research highlights: Development of mouse overexpressing native human HSF1 in all tissues including CNS. HSF1 overexpression enhances heat shock response at whole-animal and cellular level. HSF1 overexpression protects from polyglutamine toxicity and favors aggresomes. HSF1 overexpression enhances proteostasis at the whole-animal and cellular level. Abstract: The heat shock response (HSR) is controlled by the master transcriptional regulator heat shock factor 1 (HSF1). HSF1 maintains proteostasis and resistance to stress through production of heat shock proteins (HSPs). No transgenic model exists that overexpresses HSF1 in tissues of the central nervous system (CNS). We generated a transgenic mouse overexpressing full-length non-mutant HSF1 and observed a 2-4-fold increase in HSF1 mRNA and protein expression in all tissues studied of HSF1 transgenic (HSF1+/0) mice compared to wild type (WT) littermates, including several regions of the CNS. Basal expression of HSP70 and 90 showed only mild tissue-specific changes; however, in response to forced exercise, the skeletal muscle HSR was more elevated in HSF1+/0 mice compared to WT littermates and in fibroblasts following heat shock, as indicated by levels of inducible HSP70 mRNA and protein. HSF1+/0 cells elicited a significantly more robust HSR in response to expression of the 82 repeat polyglutamine-YFP fusion construct (Q82YFP) and maintained proteasome-dependent processing of Q82YFP compared to WT fibroblasts. Overexpression of HSF1 was associated with fewer, but larger Q82YFP aggregates resembling aggresomes in HSF1+/0 cells, and increased viability. Therefore, our data demonstrate that tissues and cells from mice overexpressing full-length non-mutant HSF1 exhibit enhanced proteostasis.
Bicistronic CACNA1A Gene Expression in Neurons Derived from Spinocerebellar Ataxia Type 6 Patient-Induced Pluripotent Stem Cells

PubMed Central

Bavassano, Carlo; Eigentler, Andreas; Stanika, Ruslan; Obermair, Gerald J.; Boesch, Sylvia; Dechant, Georg

2017-01-01

Spinocerebellar ataxia type 6 (SCA6) is an autosomal-dominant neurodegenerative disorder that is caused by a CAG trinucleotide repeat expansion in the CACNA1A gene. As one of the few bicistronic genes discovered in the human genome, CACNA1A encodes not only the α1A subunit of the P/Q type voltage-gated Ca2+ channel CaV2.1 but also the α1ACT protein, a 75 kDa transcription factor sharing the sequence of the cytoplasmic C-terminal tail of the α1A subunit. Isoforms of both proteins contain the polyglutamine (polyQ) domain that is expanded in SCA6 patients. Although certain SCA6 phenotypes appear to be specific for Purkinje neurons, other pathogenic effects of the SCA6 polyQ mutation can affect a broad spectrum of central nervous system (CNS) neuronal subtypes. We investigated the expression and function of CACNA1A gene products in human neurons derived from induced pluripotent stem cells from two SCA6 patients. Expression levels of CACNA1A encoding α1A subunit were similar between SCA6 and control neurons, and no differences were found in the subcellular distribution of CaV2.1 channel protein. The α1ACT immunoreactivity was detected in the majority of cell nuclei of SCA6 and control neurons. Although no SCA6 genotype-dependent differences in CaV2.1 channel function were observed, they were found in the expression levels of the α1ACT target gene Granulin (GRN) and in glutamate-induced cell vulnerability. PMID:28946818
Huntington’s Disease: The Past, Present, and Future Search for Disease Modifiers

PubMed Central

Clabough, Erin B.D.

2013-01-01

Huntington’s disease (HD) is an autosomal dominant genetic disorder that specifically causes neurodegeneration of striatal neurons, resulting in a triad of symptoms that includes emotional, cognitive, and motor disturbances. The HD mutation causes a polyglutamine repeat expansion within the N-terminal of the huntingtin (Htt) protein. This expansion causes aggregate formation within the cytosol and nucleus due to the presence of misfolded mutant Htt, as well as altered interactions with Htt’s multiple binding partners, and changes in post-translational Htt modifications. The present review charts efforts toward a therapy that delays age of onset or slows symptom progression in patients affected by HD, as there is currently no effective treatment. Although silencing Htt expression appears promising as a disease modifying treatment, it should be attempted with caution in light of Htt’s essential roles in neural maintenance and development. Other therapeutic targets include those that boost aggregate dissolution, target excitotoxicity and metabolic issues, and supplement growth factors. PMID:23766742
Variation, Repetition, And Choice

PubMed Central

Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
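The lag criterion of the VARY alternative can be read as a small rule over a sliding window of recent response sequences. The following is a minimal sketch under that reading, not the authors' procedure: the function names are hypothetical, and it assumes that every emitted sequence (reinforced or not) enters the window, which the abstract does not specify.

```python
from collections import deque

def make_vary_contingency(lag):
    """Return a checker for a lag-n variability rule (hypothetical helper)."""
    # Window holding the `lag` most recent sequences; older ones drop out.
    history = deque(maxlen=lag)

    def is_reinforced(sequence):
        # Reinforce only if the sequence differs from all of the last `lag` sequences.
        ok = sequence not in history
        # Assumption: every sequence is recorded, whether or not it was reinforced.
        history.append(sequence)
        return ok

    return is_reinforced

check = make_vary_contingency(lag=2)
for seq in ["LLRR", "LLRR", "LRLR", "RRLL", "LLRR"]:
    print(seq, check(seq))
```

With a lag of 2, the second "LLRR" is not reinforced because it repeats the previous sequence, while the last "LLRR" is reinforced because two different sequences have intervened.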
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed Central

Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.

2015-01-01

In prokaryotes, simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular, G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4-forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms, repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed, with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
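As a rough illustration of the unit counting described in this abstract, the sketch below scans a DNA string for tandem runs of the heptamer GGGNATC and reports how many units each run contains. The function name and the toy sequence are hypothetical, only the given strand is scanned, and the authors' actual search pipeline is not described in this record.

```python
import re

# One GGGNATC unit, with N matching any of the four bases.
UNIT = "GGG[ACGT]ATC"
# A run is two or more directly adjacent units (the abstract reports 2 to 26).
RUN = re.compile(f"(?:{UNIT}){{2,}}")

def find_g_rich_runs(seq):
    """Return (start, unit_count) for each tandem run of GGGNATC in seq."""
    seq = seq.upper()
    # Every unit is exactly 7 nt, so the run length divided by 7 is the unit count.
    return [(m.start(), len(m.group()) // 7) for m in RUN.finditer(seq)]

# Toy sequence (invented): a four-unit run and a two-unit run.
toy = "TT" + "GGGAATC" * 4 + "ACGT" + "GGGTATCGGGCATC" + "A"
print(find_g_rich_runs(toy))  # [(2, 4), (34, 2)]
```

A four-unit run is the case the abstract highlights, since four consecutive G-tracts are the minimum for G4 formation.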
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because these endophytes are known for the beneficial characteristics they endow upon their grass hosts, their identification has been of great interest agronomically and scientifically. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, or of where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

PubMed

Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur

2006-02-01

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
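To make the annotation step concrete, here is a minimal sketch that turns tandem repeat records into GFF3 feature lines. It is not TRAP itself (which is a Perl program) and it does not parse real Tandem Repeats Finder output: the whitespace-separated row layout (start, end, period, copy number, consensus), the function name, and the "sketch" source label are all assumptions made for illustration.

```python
def rows_to_gff3(seq_id, rows):
    """Convert simplified repeat rows into GFF3 lines (hypothetical helper)."""
    lines = ["##gff-version 3"]
    for row in rows:
        # Assumed row layout: start end period copies consensus
        start, end, period, copies, consensus = row.split()
        attrs = f"period={period};copies={copies};consensus={consensus}"
        # GFF3 columns: seqid, source, type, start, end, score, strand, phase, attributes
        fields = [seq_id, "sketch", "tandem_repeat", start, end, ".", ".", ".", attrs]
        lines.append("\t".join(fields))
    return lines

# Invented example record on a hypothetical sequence named scaffold1.
for line in rows_to_gff3("scaffold1", ["101 128 7 4.0 GGGAATC"]):
    print(line)
```

The same records could be written as comma-separated values by joining the fields with commas instead of tabs.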
Suppression of polyglutamine protein toxicity by co-expression of a heat-shock protein 40 and a heat-shock protein 110

PubMed Central

Kuo, Y; Ren, S; Lao, U; Edgar, B A; Wang, T

2013-01-01

A network of heat-shock proteins mediates cellular protein homeostasis, and has a fundamental role in preventing aggregation-associated neurodegenerative diseases. In a Drosophila model of polyglutamine (polyQ) disease, the HSP40 family protein, DNAJ-1, is a superior suppressor of toxicity caused by the aggregation of polyQ containing proteins. Here, we demonstrate that one specific HSP110 protein, 70 kDa heat-shock cognate protein cb (HSC70cb), interacts physically and genetically with DNAJ-1 in vivo, and that HSC70cb is necessary for DNAJ-1 to suppress polyglutamine-induced cell death in Drosophila. Expression of HSC70cb together with DNAJ-1 significantly enhanced the suppressive effects of DNAJ-1 on polyQ-induced neurodegeneration, whereas expression of HSC70cb alone did not suppress neurodegeneration in Drosophila models of either general polyQ disease or Huntington's disease. Furthermore, expression of a human HSP40, DNAJB1, together with a human HSP110, APG-1, protected cells from polyQ-induced neural degeneration in flies, whereas expression of either component alone had little effect. Our data provide a functional link between HSP40 and HSP110 in suppressing the cytotoxicity of aggregation-prone proteins, and suggest that HSP40 and HSP110 function together in protein homeostasis control. PMID:24091676
Regions of conservation and divergence in the 3' untranslated sequences of genomic RNA from Ross River virus isolates.

PubMed

Faragher, S G; Dalgarno, L

1986-07-20

The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
[Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

PubMed

Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

2015-04-01

This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and that the flanking sequences of upstream CRISPRs could be classified into the same groups as those of downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
Salivary testosterone and a trinucleotide (CAG) length polymorphism in the androgen receptor gene predict amygdala reactivity in men.

PubMed

Manuck, Stephen B; Marsland, Anna L; Flory, Janine D; Gorka, Adam; Ferrell, Robert E; Hariri, Ahmad R

2010-01-01

In studies employing functional magnetic resonance imaging (fMRI), reactivity of the amygdala to threat-related sensory cues (viz., facial displays of negative emotion) has been found to correlate positively with interindividual variability in testosterone levels of women and young men and to increase on acute administration of exogenous testosterone. Many of the biological actions of testosterone are mediated by intracellular androgen receptors (ARs), which exert transcriptional control of androgen-dependent genes and are expressed in various regions of the brain, including the amygdala. Transactivation potential of the AR decreases (yielding relative androgen insensitivity) with expansion of a polyglutamine stretch in the N-terminal domain of the AR protein, as encoded by a trinucleotide (CAG) repeat polymorphism in exon 1 of the X-chromosome AR gene. Here we examined whether amygdala reactivity to threat-related facial expressions (fear, anger) differs as a function of AR CAG length variation and endogenous (salivary) testosterone in a mid-life sample of 41 healthy men (mean age=45.6 years, range: 34-54 years; CAG repeats, range: 19-29). Testosterone correlated inversely with participant age (r=-0.39, p=0.012) and positively with number of CAG repeats (r=0.45, p=0.003). In partial correlations adjusted for testosterone level, reactivity in the ventral amygdala was lowest among men with the largest number of CAG repeats. This inverse association was seen in both the right (r(p)=-0.34, p<0.05) and left (r(p)=-0.32, p<0.05) hemisphere. Activation of the dorsal amygdala correlated positively with individual differences in salivary testosterone, also in the right (r=0.40, p<0.02) and left (r=0.32, p<0.05) hemisphere, but was not affected by the number of CAG repeats. Hence, androgenic influences on threat-related reactivity in the ventral amygdala may be moderated partially by CAG length variation in the AR gene. Because individual differences in salivary testosterone also predicted dorsal amygdala reactivity and did so independently of CAG repeats, it is suggested that androgenic influences within this anatomically distinct region may be mediated, in part, by non-genomic or AR-independent mechanisms.
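The abstract reports partial correlations adjusted for testosterone but does not say how they were computed. One common choice is the first-order partial correlation, r_xy.z = (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The sketch below implements that textbook formula as an assumption about what "adjusted for" means here; the function names are hypothetical and the numbers are toy values, not data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy values only: x could stand for repeat number, y for reactivity, z for testosterone.
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, -1, -1, 1]
print(partial_corr(x, y, z))
```

In this toy case z is uncorrelated with both x and y, so the partial correlation equals the plain correlation between x and y.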
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background: Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results: This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion: We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze, genome-wide, the CRISPRs of extremely halophilic archaea for which whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and that two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and that the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.

Anhydrous trifluoroacetic acid pretreatment converts insoluble polyglutamine peptides to soluble monomers.

PubMed

Burra, Gunasekhar; Thakur, Ashwani Kumar

2015-12-01

The data provided in this article are related to the research article entitled "Unaided trifluoroacetic acid pretreatment solubilizes polyglutamine (polyGln) peptides and retains their biophysical properties of aggregation" by Burra and Thakur (in press) [1]. This article reports data from size exclusion chromatography (SEC), reversed phase-high performance liquid chromatography (RP-HPLC) and mass spectrometry (MS) assays. These data show that trifluoroacetic acid (TFA) can convert insoluble polyGln peptides to soluble monomers. The data also clarify the possibility of trifluoroacetylation modification caused by TFA. We hope the data presented here will enhance the understanding of polyGln disaggregation and solubilization. For more insightful and useful discussions, see the research article published in Analytical Biochemistry: Methods in the Biological Sciences (Burra and Thakur, in press [1]).
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
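The position-specific bias described above can be illustrated with a minimal sketch that tallies, for each of the six positions, how many hexamers in a read differ from the canonical TTAGGG at exactly that one base. This is not the authors' pipeline: the function name `variant_positions`, the toy read, and the assumption that the read is already extracted and in frame with the repeat are all hypothetical.

```python
from collections import Counter

CANONICAL = "TTAGGG"  # canonical human telomere repeat


def variant_positions(read, canonical=CANONICAL):
    """Count single-base variant repeats by the 1-based position that differs.

    Assumes (hypothetically) that `read` starts in frame with the repeat.
    """
    k = len(canonical)
    counts = Counter()
    for i in range(0, len(read) - k + 1, k):
        unit = read[i:i + k]
        # 1-based positions at which this hexamer differs from the canonical unit
        diffs = [p + 1 for p, (a, b) in enumerate(zip(unit, canonical)) if a != b]
        if len(diffs) == 1:
            counts[diffs[0]] += 1
    return counts


# Toy read: canonical units interleaved with a position-1 and a position-3 variant
toy_read = "TTAGGG" + "GTAGGG" + "TTAGGG" + "TTGGGG" + "TTAGGG"
print(variant_positions(toy_read))
```

A real analysis would also need to handle reads that are out of frame, reverse-complement (CCCTAA) reads, and sequencing error, none of which this sketch attempts.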
Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Basal angiosperms, or Magnoliids, form a clade of commercially important plants that mainly includes spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to the clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are short stretches of DNA made up of tandemly repeated motifs 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

PubMed

Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

1999-10-01

This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammals. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

PubMed

Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

1997-12-01

Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis

PubMed Central

Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting

2013-01-01

Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using a selectively amplified microsatellite assay, 86 sequences were generated from a pineapple genomic library. Of the 94 Simple Sequence Repeat (SSR) loci, 91 (96.8%) were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, respectively), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found to contain 1110 SSR loci (217 of them contained more than one SSR locus). The frequency of SSRs in pineapple EST sequences is 1 SSR/3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant types of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed for 30 randomly selected sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen pairs of primers that were obtained by one or the other of the two methods above and showed polymorphism were selected to carry out germplasm genetic diversity analysis for 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and they can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced; the corresponding repeat loci were found, and locus mutations were mainly in copy number of repeats and base mutations in the flanking region. PMID:24024187
Androgen receptor polyglutamine repeat length (AR-CAGn) modulates the effect of testosterone on androgen-associated somatic traits in Filipino young adult men.

PubMed

Ryan, Calen P; Georgiev, Alexander V; McDade, Thomas W; Gettler, Lee T; Eisenberg, Dan T A; Rzhetskaya, Margarita; Agustin, Sonny S; Hayes, M Geoffrey; Kuzawa, Christopher W

2017-06-01

The androgen receptor (AR) mediates expression of androgen-associated somatic traits such as muscle mass and strength. Within the human AR is a highly variable glutamine short-tandem repeat (AR-CAGn), and CAG repeat number has been inversely correlated to AR transcriptional activity in vitro. However, evidence for an attenuating effect of long AR-CAGn on androgen-associated somatic traits has been inconsistent in human populations. One possible explanation for this lack of consistency is that the effect of AR-CAGn on AR bioactivity in target tissues likely varies in relation to circulating androgen levels. We tested whether relationships between AR-CAGn and several androgen-associated somatic traits (waist circumference, lean mass, arm muscle area, and grip strength) were modified by salivary (waking and pre-bed) and circulating (total) testosterone (T) levels in young adult males living in metropolitan Cebu, Philippines (n = 675). When men's waking T was low, they had a reduction in three out of four androgen-associated somatic traits with lengthening AR-CAGn (p < .1), consistent with in vitro research. However, when waking T was high, we observed the opposite effect: lengthening AR-CAGn was associated with an increase in these same somatic traits. Our finding that longer AR-CAGn predicts greater androgen-associated trait expression among high-T men runs counter to in vitro work, but is generally consistent with the few prior studies to evaluate similar interactions in human populations. Collectively, these results raise questions about the applicability of findings derived from in vitro AR-CAGn studies to the receptor's role in maintaining androgen-associated somatic traits in human populations. © 2017 Wiley Periodicals, Inc.
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, where Huntington disease and other diseases have been mapped, we isolated a cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. A database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog, in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori

PubMed Central

Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.

2005-01-01

We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
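The mining criteria stated above (mononucleotide runs of at least 15 bases, and at least 5 repeat units for the other classes) can be sketched with regular-expression backreferences. This is an illustration only, not the software used in the study; the function name `find_ssrs`, its parameter names, and the toy sequence are hypothetical, and only perfect (uninterrupted) repeats are detected.

```python
import re


def find_ssrs(seq, min_mono_len=15, min_units=5, max_motif=6):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in `seq`.

    Thresholds follow the criteria quoted in the abstract; everything else
    (names, perfect-repeat restriction) is an assumption of this sketch.
    """
    hits = []
    # Mononucleotide runs: one base followed by at least (min_mono_len - 1) copies
    for m in re.finditer(r"([ACGT])\1{%d,}" % (min_mono_len - 1), seq):
        hits.append((m.start(), m.group(1), len(m.group(0))))
    # Motifs of 2..max_motif bases repeated at least min_units times in tandem
    for k in range(2, max_motif + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_units - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves tandem copies of a shorter unit
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence: (AT)5, a 15-base A run, and (CAG)6 separated by short spacers
toy_seq = "GG" + "AT" * 5 + "CC" + "A" * 15 + "C" + "CAG" * 6 + "T"
print(find_ssrs(toy_seq))
```

Because the scan for each motif length is non-overlapping, a longer-motif repeat that begins inside a skipped lower-order tract could be missed; dedicated SSR tools handle such cases more carefully.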
Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

NASA Astrophysics Data System (ADS)

Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

2015-12-01

Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus produce nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and, consequently, in improving seismic hazard assessment.
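The waveform-similarity criterion can be illustrated with a minimal sketch of the correlation-coefficient test, using the 95% threshold quoted in the abstract. It is a simplification under stated assumptions: the two traces are taken to be equal-length and already aligned (zero lag), spectral coherency is not computed, and the names `correlation_coefficient` and `is_repeating_pair` are hypothetical rather than taken from the study.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation between two equal-length waveforms."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("waveforms must have the same length (>= 2 samples)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    if den == 0:
        raise ValueError("a constant trace has no defined correlation")
    return num / den


def is_repeating_pair(x, y, cc_min=0.95):
    """Apply the cc threshold only; the coherency criterion is omitted here."""
    return correlation_coefficient(x, y) >= cc_min


# Toy traces: a scaled copy correlates perfectly, a phase-shifted one does not
trace_a = [math.sin(0.2 * i) for i in range(200)]
trace_b = [2.0 * v for v in trace_a]
trace_c = [math.cos(0.2 * i) for i in range(200)]
print(is_repeating_pair(trace_a, trace_b), is_repeating_pair(trace_a, trace_c))
```

In practice the coefficient is usually maximized over a range of time lags after band-pass filtering, which this sketch leaves out.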
Mutant Ataxin-1 Inhibits Neural Progenitor Cell Proliferation in SCA1

PubMed Central

Cvetanovic, Marija; Hu, Yuan-Shih; Opal, Puneet

2017-01-01

Spinocerebellar ataxia type 1 (SCA1) is a dominantly inherited neurodegenerative disease caused by the expansion of a polyglutamine (Q) repeat tract in the protein ataxin-1 (ATXN1). Beginning as a cerebellar ataxic disorder, SCA1 progresses to involve the cerebral cortex, hippocampus, and brainstem. Using SCA1 knock-in mice that mirror the complexity of the human disease, we report a significant decrease in the capacity of adult neuronal progenitor cells (NPCs) to proliferate. Remarkably, a decrease in NPC proliferation can be observed in vitro, outside the degenerative milieu of surrounding neurons or glia, demonstrating that mutant ATXN1 acting cell autonomously within progenitor cells interferes with their ability to proliferate. Our findings suggest that compromised adult neurogenesis contributes to the progressive pathology of the disease, particularly in areas such as the hippocampus and cerebral cortex where stem cells provide neurotropic factors and participate in adult neurogenesis. These findings not only shed light on the biology of the disease but also have therapeutic implications in any future stem cell-based clinical trials. PMID:27306906
Mutant Ataxin-1 Inhibits Neural Progenitor Cell Proliferation in SCA1.

PubMed

Cvetanovic, Marija; Hu, Yuan-Shih; Opal, Puneet

2017-04-01

Spinocerebellar ataxia type 1 (SCA1) is a dominantly inherited neurodegenerative disease caused by the expansion of a polyglutamine (Q) repeat tract in the protein ataxin-1 (ATXN1). Beginning as a cerebellar ataxic disorder, SCA1 progresses to involve the cerebral cortex, hippocampus, and brainstem. Using SCA1 knock-in mice that mirror the complexity of the human disease, we report a significant decrease in the capacity of adult neuronal progenitor cells (NPCs) to proliferate. Remarkably, a decrease in NPC proliferation can be observed in vitro, outside the degenerative milieu of surrounding neurons or glia, demonstrating that mutant ATXN1 acting cell autonomously within progenitor cells interferes with their ability to proliferate. Our findings suggest that compromised adult neurogenesis contributes to the progressive pathology of the disease, particularly in areas such as the hippocampus and cerebral cortex where stem cells provide neurotropic factors and participate in adult neurogenesis. These findings not only shed light on the biology of the disease but also have therapeutic implications in any future stem cell-based clinical trials.
Cystathionine γ-lyase deficiency mediates neurodegeneration in Huntington’s disease

PubMed Central

Paul, Bindu D.; Sbodio, Juan I.; Xu, Risheng; Vandiver, M. Scott; Cha, Jiyoung Y.; Snowman, Adele M.; Snyder, Solomon H.

2015-01-01

Huntington’s disease is an autosomal dominant disease associated with a mutation in the gene encoding huntingtin (Htt) leading to expanded polyglutamine repeats of mutant Htt (mHtt) that elicit oxidative stress, neurotoxicity, and motor and behavioural changes [1]. Huntington’s disease is characterized by highly selective and profound damage to the corpus striatum, which regulates motor function. Striatal selectivity of Huntington’s disease may reflect the striatally selective small G protein Rhes binding to mHtt and enhancing its neurotoxicity [2]. Specific molecular mechanisms by which mHtt elicits neurodegeneration have been hard to determine. Here we show a major depletion of cystathionine γ-lyase (CSE), the biosynthetic enzyme for cysteine, in Huntington’s disease tissues, which may mediate Huntington’s disease pathophysiology. The defect occurs at the transcriptional level and seems to reflect influences of mHtt on specificity protein 1, a transcriptional activator for CSE. Consistent with the notion of loss of CSE as a pathogenic mechanism, supplementation with cysteine reverses abnormalities in cultures of Huntington’s disease tissues and in intact mouse models of Huntington’s disease, suggesting therapeutic potential. PMID:24670645
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would make it possible to relate the repeats to their function and three-dimensional structure over the course of evolution. Further, it sheds light on the diversity of duplication events during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. The scores from the Point Accepted Mutation (PAM) matrix have been used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
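The general idea of scoring substitutions while searching a single sequence for repeated segments can be sketched as below. This is not the published algorithm: it is a brute-force window comparison, and `toy_score` is a hypothetical stand-in for a real PAM lookup (identity scores 2, residues in the same rough physicochemical group score 1, anything else scores -1). The names `find_repeats`, `k`, and `min_score` are likewise assumptions made for illustration.

```python
# Rough residue groupings used only by the toy scoring function below
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "GP"]


def toy_score(a, b):
    """Hypothetical substitute for a PAM matrix entry (not real PAM values)."""
    if a == b:
        return 2
    if any(a in g and b in g for g in GROUPS):
        return 1
    return -1


def find_repeats(seq, k=5, min_score=8, score=toy_score):
    """Return (i, j, total) for non-overlapping length-k windows at i < j
    whose summed substitution score is at least `min_score`."""
    hits = []
    for i in range(len(seq) - 2 * k + 1):
        for j in range(i + k, len(seq) - k + 1):
            total = sum(score(seq[i + t], seq[j + t]) for t in range(k))
            if total >= min_score:
                hits.append((i, j, total))
    return hits


# Toy protein: KLDEF and RLDEF differ by one conservative K/R substitution
toy_protein = "KLDEF" + "GGPGG" + "RLDEF"
print(find_repeats(toy_protein))
```

Allowing conservative substitutions is what lets the second copy be found even though it is not identical to the first; with a real PAM matrix the threshold would need to be chosen on that matrix's scale.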
Genetics Home Reference: dentatorubral-pallidoluysian atrophy

MedlinePlus

... Kawashima M, Tanaka F, Adachi H, Sobue G. Molecular genetics and biomarkers of polyglutamine diseases. Curr Mol Med. ... PubMed Tsuji S. Dentatorubral-pallidoluysian atrophy: clinical aspects and molecular genetics. Adv Neurol. 2002;89:231-9. Review. Citation ...
Search for Length Dependent Stable Structures of Polyglutamine Proteins with Replica Exchange Molecular Dynamics

NASA Astrophysics Data System (ADS)

Kluber, Alexander; Hayre, Robert; Cox, Daniel

2012-02-01

Motivated by the need to find beta-structure aggregation nuclei for the polyQ diseases such as Huntington's, we have undertaken a search for length dependent structure in model polyglutamine proteins. We use the Onufriev-Bashford-Case (OBC) generalized Born implicit solvent GPU based AMBER11 molecular dynamics with the parm96 force field coupled with a replica exchange method to characterize monomeric strands of polyglutamine as a function of chain length and temperature. This force field and solvation method has been shown among other methods to accurately reproduce folded metastability in certain small peptides, and to yield accurately de novo folded structures in a millisecond time-scale protein. Using GPU molecular dynamics we can sample out into the microsecond range. Additionally, explicit solvent runs will be used to verify results from the implicit solvent runs. We will assess order using measures of secondary structure and hydrogen bond content.
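For readers unfamiliar with replica exchange, the standard Metropolis criterion for swapping configurations between two temperatures can be sketched as follows. This is the textbook acceptance rule, not code from AMBER or from the study; the function name `swap_probability`, the example energies, and the choice of kcal/mol units are assumptions of this sketch.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(e_i, e_j, t_i, t_j, k_b=K_B):
    """Metropolis acceptance probability for exchanging two replicas.

    e_i, e_j: potential energies of the configurations currently held at
    temperatures t_i and t_j (kelvin). Accept with
    min(1, exp[(beta_i - beta_j) * (e_i - e_j)]).
    """
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


# Colder replica holds the higher-energy configuration: always accepted
print(swap_probability(-100.0, -120.0, 300.0, 320.0))
# Colder replica holds the lower-energy configuration: accepted only sometimes
print(swap_probability(-120.0, -100.0, 300.0, 320.0))
```

The rule always moves a lower-energy configuration to the colder replica when one is available, and otherwise accepts the swap with a Boltzmann-weighted probability, which is what allows trapped configurations to escape through the high-temperature replicas.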
Characterization of (CA)n microsatellite repeats from large-insert clones.

PubMed

Litt, M; Browne, D

2001-05-01

The most laborious part of developing (CA)n microsatellite repeats as genetic markers is constructing DNA clones to permit determination of sequences flanking the microsatellites. When cosmids or large-insert phage clones are used as primary sources of (CA)n repeat markers, they have traditionally been subcloned into plasmid vectors such as pUC18 or M13 mp 18/19 cloning vectors to obtain fragments of suitable size for DNA sequencing. This unit presents an alternative approach whereby a set of degenerate sequencing primers that anneal directly to (CA)n microsatellites can be used to determine sequences that are inaccessible with vector-derived primers. Because the primers anneal to the repeat and not to the vector, they can be used with subclones containing inserts of several kilobases and should, in theory, always give sequence in the regions directly flanking the repeat. Degeneracy at the 3' end of each of these primers prevents elongation of primers that have annealed out-of-register.

Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families: RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size, and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species (namely, B. nigra, B. campestris, and B. oleracea) was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other, as the repeat families in both showed high sequence homology with each other. Second, the repeat elements in both species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea.
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Interstitial telomeric sequences in vertebrate chromosomes: Origin, function, instability and evolution.

PubMed

Bolzán, Alejandro D

2017-07-01

By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According to their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes. Copyright © 2017 Elsevier B.V. All rights reserved.
Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
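The repeat/spacer decomposition described above can be sketched in a few lines once the direct repeat is known. This is an illustrative simplification rather than the tools the authors describe: it assumes every copy of the repeat is identical, and the function name `extract_spacers` and all sequences below are made-up toy values.

```python
def extract_spacers(array_seq, repeat):
    """Split a CRISPR array on exact copies of the direct repeat.

    Returns the sequences lying between consecutive repeats. The text before
    the first repeat (leader side) and after the last repeat is discarded.
    """
    parts = array_seq.split(repeat)
    return [p for p in parts[1:-1] if p]


# Toy array: a 24-bp made-up repeat flanking two made-up spacers
toy_repeat = "CTGACCGTAAGGTCAGTTCAAGAC"
spacer_1 = "AAACCCGGGTTTAAACCCGGGTTTAAACCCGG"
spacer_2 = "TTGGCCAATTGGCCAATTGGCCAATTGGCCAA"
toy_array = ("TTTT" + toy_repeat + spacer_1 + toy_repeat
             + spacer_2 + toy_repeat + "GGGG")

print(extract_spacers(toy_array, toy_repeat))
```

Comparing the resulting spacer lists between strains is the basis of the spacer-organization diagrams mentioned in the abstract; real arrays often contain degenerate repeat copies, which an exact-match split like this one would not recognize.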
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus).

PubMed

Cech, Jennifer N; Peichel, Catherine L

2015-12-01

Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Repeatless and repeat-based centromeres in potato: implications for centromere evolution.

PubMed

Gong, Zhiyun; Wu, Yufeng; Koblízková, Andrea; Torres, Giovana A; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C Robin; Macas, Jirí; Jiang, Jiming

2012-09-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains.
Repeatless and Repeat-Based Centromeres in Potato: Implications for Centromere Evolution

PubMed Central

Gong, Zhiyun; Wu, Yufeng; Koblížková, Andrea; Torres, Giovana A.; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C. Robin; Macas, Jiří; Jiang, Jiming

2012-01-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains. PMID:22968715
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.; (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer and; (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast AhDR sequences are dispersed over all chromosomes as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, H.U.G.; Gray, J.W.

1995-06-27

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, Heinz-Ulrich G.; Gray, Joe W.

1995-01-01

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
De novo identification of highly diverged protein repeats by probabilistic consistency.

PubMed

Biegert, A; Söding, J

2008-03-15

An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Mutant Huntingtin Gene-Dose Impacts on Aggregate Deposition, DARPP32 Expression and Neuroinflammation in HdhQ150 Mice

PubMed Central

Young, Douglas; Mayer, Franziska; Vidotto, Nella; Schweizer, Tatjana; Berth, Ramon; Abramowski, Dorothee; Shimshek, Derya R.; van der Putten, P. Herman; Schmid, Peter

2013-01-01

Huntington's disease (HD) is an autosomal dominant, progressive and fatal neurological disorder caused by an expansion of CAG repeats in exon-1 of the huntingtin gene. The encoded poly-glutamine stretch renders mutant huntingtin prone to aggregation. HdhQ150 mice genocopy a pathogenic repeat (∼150 CAGs) in the endogenous mouse huntingtin gene and model predominantly pre-manifest HD. Treating early is likely important to prevent or delay HD, and HdhQ150 mice may be useful to assess therapeutic strategies targeting pre-manifest HD. This requires appropriate markers and here we demonstrate, that pre-symptomatic HdhQ150 mice show several dramatic mutant huntingtin gene-dose dependent pathological changes including: (i) an increase of neuronal intra-nuclear inclusions (NIIs) in brain, (ii) an increase of extra-nuclear aggregates in dentate gyrus, (iii) a decrease of DARPP32 protein and (iv) an increase in glial markers of neuroinflammation, which curiously did not correlate with local neuronal mutant huntingtin inclusion-burden. HdhQ150 mice developed NIIs also in all retinal neuron cell-types, demonstrating that retinal NIIs are not specific to human exon-1 R6 HD mouse models. Taken together, the striking and robust mutant huntingtin gene-dose related changes in aggregate-load, DARPP32 levels and glial activation markers should greatly facilitate future testing of therapeutic strategies in the HdhQ150 HD mouse model. PMID:24086450
Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

NASA Astrophysics Data System (ADS)

Tepp, G.; Haney, M. M.; Wech, A.

2017-12-01

A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. 
During the Redoubt eruption, earthquake sequences were observed in the months leading up to the eruptive activity beginning in March 2009 as well as immediately preceding 7 of the 19 explosive events. In contrast to Bogoslof, Redoubt has a local monitoring network which allows for better detection and more detailed analysis of the repeating earthquake sequences.
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPRs related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator.
CRISPRdb is accessible at PMID:17521438
Assembly of a phased diploid Candida albicans genome facilitates allele-specific measurements and provides a simple model for repeat and indel structure

PubMed Central

2013-01-01

Background Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. Results We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. Conclusions The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution. PMID:24025428
Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies.

PubMed Central

Wincker, P; Jubier-Maurin, V; Roizès, G

1987-01-01

Some full length members of the mouse long interspersed repeated DNA family L1Md have been shown to be associated at their 5' end with a variable number of tandem repetitions, the A repeats, that have been suggested to be transcription controlling elements. We report that the other type of repeat, named F, found at the 5' end of a few L1 elements is also an integral part of full length L1 copies. Sequencing shows that the F repeats are GC rich, and organized in tandem. The L1 copies associated with either A or F repeats can be correlated with two different subsets of L1 sequences distinguished by a series of variant nucleotides specific to each and by unassociated but frequent restriction sites. These findings suggest that sequence replacement has occurred at least once in 5' of L1Md, and is related to the generation of specific subfamilies. PMID:3684566
Plant chromosomes from end to end: telomeres, heterochromatin and centromeres.

PubMed

Lamb, Jonathan C; Yu, Weichang; Han, Fangpu; Birchler, James A

2007-04-01

Recent evidence indicates that heterochromatin in plants is composed of heterogeneous sequences, which are usually composed of transposable elements or tandem repeat arrays. These arrays are associated with chromatin modifications that produce a closed configuration that limits transcription. Centromere sequences in plants are usually composed of tandem repeat arrays that are homogenized across the genome. Analysis of such arrays in closely related taxa suggests a rapid turnover of the repeat unit that is typical of a particular species. In addition, two lines of evidence for an epigenetic component of centromere specification have been reported, namely an example of a neocentromere formed over sequences without the typical repeat array and examples of centromere inactivation. Although the telomere repeat unit is quite prevalent in the plant kingdom, unusual repeats have been found in some families. Recently, it was demonstrated that the introduction of telomere sequences into plants cells causes truncation of the chromosomes, and that this technique can be used to produce artificial chromosome platforms.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
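To make the idea of reporting exact SSR positions concrete, here is a minimal Python sketch using a back-referencing regular expression. It is not SSRscanner itself (which the record says is written in PERL and available from the authors); the function name `find_ssrs`, its parameters and the default thresholds are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re


def find_ssrs(seq: str, min_unit: int = 1, max_unit: int = 3,
              min_repeats: int = 6) -> list[tuple[int, int, str, int]]:
    """Return (start, end, unit, n_repeats) for perfect tandem repeats.

    Positions are 1-based and inclusive.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # ([ACGT]{k}) captures one unit; \1{n-1,} requires n or more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are just a shorter motif repeated (e.g. "AA").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            n_repeats = (m.end() - m.start()) // unit_len
            hits.append((m.start() + 1, m.end(), unit, n_repeats))
    return sorted(hits)


# Toy sequence: a (CAG)8 tract and an (A)7 tract in short flanks.
demo = "GGC" + "CAG" * 8 + "TT" + "A" * 7 + "GC"
for start, end, unit, n in find_ssrs(demo):
    print(f"({unit}){n} at {start}-{end}")
```

Counting hits per unit gives the frequency table, and the start and end coordinates give the exact locations that the abstract says most programs omit.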
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840

A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

PubMed

Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
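One simple way to look for approximate tandem repeats of a given period, without requiring similarity to any known protein, is to compare a sequence with a shifted copy of itself. The Python sketch below illustrates that idea only; it is not the authors' PERL screen, the function names are hypothetical, and the toy protein is built from a 34-residue unit loosely modelled on TALE repeats with two variable positions per unit.

```python
def shift_identity(seq: str, period: int) -> float:
    """Fraction of positions i where seq[i] equals seq[i + period]."""
    pairs = len(seq) - period
    if pairs <= 0:
        return 0.0
    matches = sum(seq[i] == seq[i + period] for i in range(pairs))
    return matches / pairs


def best_period(seq: str, min_period: int = 30,
                max_period: int = 43) -> tuple[float, int]:
    """Return (identity, period) for the best-scoring period in range."""
    return max((shift_identity(seq, p), p)
               for p in range(min_period, max_period + 1))


# Toy protein: six 34-residue units differing only at two positions.
variable_pairs = ["NG", "HD", "NI", "NN", "HD", "NG"]
toy_protein = "".join(
    "LTPEQVVAIAS" + pair + "GGKQALETVQRLLPVLCQAHG" for pair in variable_pairs
)

identity, period = best_period(toy_protein)
print(period, round(identity, 3))
```

A high identity at one period, combined with mismatches concentrated at a few fixed positions within the unit, is the kind of signature the abstract describes as "sequence variability in conserved positions"; the additional scoring steps (localization, secondary structure, covariation) are outside this sketch.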
A TALE-inspired computational screen for proteins that contain approximate tandem repeats

PubMed Central

Krwawicz, Joanna

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing.

PubMed

Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav

2010-09-16

Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing

PubMed Central

2010-01-01

Background Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. Results In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Conclusion A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection. PMID:20846365
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, consisting of tandem copies of specific sequences no longer than six bases, that are distributed throughout the genome. SSRs have been used as molecular markers because they are easy to detect and serve a range of applications, including genetic diversity studies, genome mapping, and marker-assisted selection. They are also very mutable because of DNA polymerase slippage during DNA replication. This distinctive mutation mechanism raises the insertion/deletion (INDEL) mutation frequency to a high rate, higher than that of other types of molecular markers such as single nucleotide polymorphisms (SNPs). Overall, however, SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without treating microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was implemented as a Perl script with a Tk graphical interface. The program aligns sequences after first determining the repeated units and the positions of the last SSR nucleotides. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
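The first step described in this record, locating the repeated units of an SSR before alignment, can be illustrated with a short sketch. This is not the authors' Perl/Tk program; it is a minimal Python illustration under stated assumptions: the function name `find_ssrs`, the six-base unit limit taken from the definition above, and the default threshold of three tandem copies are all choices made here for the example.

```python
import re

def find_ssrs(seq, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each tandem repeat whose unit is 1..max_unit bases.

    Hypothetical helper for illustration; thresholds are assumptions, not from the paper.
    """
    seq = seq.upper()
    # ([ACGT]{1,6}?) lazily captures the shortest candidate unit;
    # \1{n,} then requires at least n further back-to-back copies of it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

# A dinucleotide (CA)4 tract starting at 0-based position 2.
print(find_ssrs("GGCACACACATT"))
```

Once the unit and the tract boundaries are known, an aligner can shift gaps in whole-unit steps rather than base by base, which is the idea the abstract describes.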
A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

PubMed

Do, Hoang Dang Khoa; Kim, Joo-Hwan

2017-01-01

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, the use of different small repeated sequences in SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to fill the gap in molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of the rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Further observation of this region shows that small conserved sequences exist and flank the tandem repeats across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate new, distinct types of repeats. Occasionally, prior to the SSM process, a point mutation event and double-strand break repair occurred and induced the formation of initial repeat units, which are indispensable in the SSM process. SSM likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes, which can be useful for further evolutionary studies in angiosperms. Additionally, genomic events in cpDNA are potential resources for mining molecular markers in Liliales.
Molecular and bioinformatic analysis of the FB-NOF transposable element.

PubMed

Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

2006-04-12

The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing, with high precision and reliability, of the 2 kb long FB inverted repeats of a complete FB-NOF element. This achievement was made possible by a new map of the FB repetitive region, which unambiguously identifies each repeat with new features that can be used as landmarks. With this new view of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous work that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T-rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.
The repetitive landscape of the chicken genome.

PubMed

Wicker, Thomas; Robertson, Jon S; Schulze, Stefan R; Feltus, F Alex; Magrini, Vincent; Morrison, Jason A; Mardis, Elaine R; Wilson, Richard K; Peterson, Daniel G; Paterson, Andrew H; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7 x coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available.
The repetitive landscape of the chicken genome

PubMed Central

Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

PubMed Central

Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

2017-01-01

During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence-matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
Changes in the folding landscape of the WW domain provide a molecular mechanism for an inherited genetic syndrome.

PubMed

Pucheta-Martinez, Encarna; D'Amelio, Nicola; Lelli, Moreno; Martinez-Torrecuadrada, Jorge L; Sudol, Marius; Saladino, Giorgio; Gervasio, Francesco Luigi

2016-07-26

WW domains are small domains present in many human proteins with a wide array of functions and acting through the recognition of proline-rich sequences. The WW domain belonging to polyglutamine tract-binding protein 1 (PQBP1) is of particular interest due to its direct involvement in several X chromosome-linked intellectual disabilities, including Golabi-Ito-Hall (GIH) syndrome, where a single point mutation (Y65C) correlates with the development of the disease. The mutant cannot bind to its natural ligand WBP11, which regulates mRNA processing. In this work we use high-field high-resolution NMR and enhanced sampling molecular dynamics simulations to gain insight into the molecular causes of the disease. We find that the wild type protein is partially unfolded, exchanging among multiple beta-strand-like conformations in solution. The Y65C mutation further destabilizes the residual fold and primes the protein for the formation of a disulphide bridge, which could be at the origin of the loss of function.
Changes in the folding landscape of the WW domain provide a molecular mechanism for an inherited genetic syndrome

NASA Astrophysics Data System (ADS)

Pucheta-Martinez, Encarna; D'Amelio, Nicola; Lelli, Moreno; Martinez-Torrecuadrada, Jorge L.; Sudol, Marius; Saladino, Giorgio; Gervasio, Francesco Luigi

2016-07-01

WW domains are small domains present in many human proteins with a wide array of functions and acting through the recognition of proline-rich sequences. The WW domain belonging to polyglutamine tract-binding protein 1 (PQBP1) is of particular interest due to its direct involvement in several X chromosome-linked intellectual disabilities, including Golabi-Ito-Hall (GIH) syndrome, where a single point mutation (Y65C) correlates with the development of the disease. The mutant cannot bind to its natural ligand WBP11, which regulates mRNA processing. In this work we use high-field high-resolution NMR and enhanced sampling molecular dynamics simulations to gain insight into the molecular causes of the disease. We find that the wild type protein is partially unfolded, exchanging among multiple beta-strand-like conformations in solution. The Y65C mutation further destabilizes the residual fold and primes the protein for the formation of a disulphide bridge, which could be at the origin of the loss of function.
The Influence of Primary and Secondary DNA Structure in Deletion and Duplication between Direct Repeats in Escherichia Coli

PubMed Central

Trinh, T. Q.; Sinden, R. R.

1993-01-01

We describe a system to measure the frequency of both deletions and duplications between direct repeats. Short 17- and 18-bp palindromic and nonpalindromic DNA sequences were cloned into the EcoRI site within the chloramphenicol acetyltransferase gene of plasmids pBR325 and pJT7. This creates an insert between direct repeated EcoRI sites and results in a chloramphenicol-sensitive phenotype. Selection for chloramphenicol resistance was utilized to select chloramphenicol resistant revertants that included those with precise deletion of the insert from plasmid pBR325 and duplication of the insert in plasmid pJT7. The frequency of deletion or duplication varied more than 500-fold depending on the sequence of the short sequence inserted into the EcoRI site. For the nonpalindromic inserts, multiple internal direct repeats and the length of the direct repeats appear to influence the frequency of deletion. Certain palindromic DNA sequences with the potential to form DNA hairpin structures that might stabilize the misalignment of direct repeats had a high frequency of deletion. Other DNA sequences with the potential to form structures that might destabilize misalignment of direct repeats had a very low frequency of deletion. Duplication mutations occurred at the highest frequency when the DNA between the direct repeats contained no direct or inverted repeats. The presence of inverted repeats dramatically reduced the frequency of duplications. The results support the slippage-misalignment model, suggesting that misalignment occurring during DNA replication leads to deletion and duplication mutations. The results also support the idea that the formation of DNA secondary structures during DNA replication can facilitate and direct specific mutagenic events. PMID:8325478
Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules

PubMed Central

2014-01-01

Background DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to lower memory usage (at least a 120-fold decrease on average) and reduces the computation time of the module detection process in large sequences at least fourfold. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability The Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
The androgen receptor gene CAG polymorphism is associated with the severity of coronary artery disease in men.

PubMed

Alevizaki, M; Cimponeriu, A T; Garofallaki, M; Sarika, H L; Alevizaki, C C; Papamichael, C; Philippou, G; Anastasiou, E A; Lekakis, J P; Mavrikakis, M

2003-12-01

The role of androgens in the pathogenesis of coronary artery disease (CAD) remains controversial. The length of the polyglutamine stretch of the transactivation domain (CAG repeat) of the androgen receptor (AR) inversely affects androgen activity. The aim of this study was to investigate the effect of this polymorphism of the AR gene on the extent of CAD in male patients. The relationship between the length of the AR gene CAG repeat and the severity of CAD was examined in 131 men (36-86 years old) undergoing coronary angiography. The severity of CAD was assessed by the number (0-3) of coronary vessels with >50% reduction in the luminal diameter. The interaction of the AR gene polymorphism with the intima media thickness (IMT) of peripheral arteries and serum levels of sex steroids, insulin and biochemical parameters were also studied. The upper quartile of CAG length (range 9-30) was ≥23 repeats (longAR). The mean body mass index (BMI) of patients with shorter repeats (<23; shortAR) was significantly lower than in men with longAR (26.1 vs. 27.6, respectively; P = 0.043, Mann-Whitney rank test). There was no correlation between the AR gene repeat length and serum testosterone. Oestradiol levels were significantly higher in longAR (0.19 ± 0.08 nmol/l vs. 0.14 ± 0.07 in shortAR, P = 0.031). This difference was independent of BMI. Men with shortAR had significant CAD (i.e. one to three arteries with stenosis) more frequently (79.5%) than men with longAR (20.5%); of the subjects with stenosis in no arteries, 56.5% had shortAR and 43.5% longAR (χ² = 4.3, P = 0.038). This association was independent of age and BMI. The IMT of peripheral arteries, lipid parameters, basal insulin resistance, blood pressure and family history of early CAD did not differ according to AR length. The shorter CAG repeat of the AR gene is associated with more severe CAD, which suggests a role for the sensitivity to androgens in the increased frequency of CAD in males. In addition, a protective role of endogenous oestrogen, which is higher in the longAR subgroup, can contribute to the observed difference.
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-09-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.
An Out-of-frame Overlapping Reading Frame in the Ataxin-1 Coding Sequence Encodes a Novel Ataxin-1 Interacting Protein

PubMed Central

Bergeron, Danny; Lapointe, Catherine; Bissonnette, Cyntia; Tremblay, Guillaume; Motard, Julie; Roucou, Xavier

2013-01-01

Spinocerebellar ataxia type 1 is an autosomal dominant cerebellar ataxia associated with the expansion of a polyglutamine tract within the ataxin-1 (ATXN1) protein. Recent studies suggest that understanding the normal function of ATXN1 in cellular processes is essential to decipher the pathogenesis mechanisms in spinocerebellar ataxia type 1. We found an alternative translation initiation ATG codon in the +3 reading frame of human ATXN1 starting 30 nucleotides downstream of the initiation codon for ATXN1 and ending at nucleotide 587. This novel overlapping open reading frame (ORF) encodes a 21-kDa polypeptide termed Alt-ATXN1 (Alternative ATXN1) with a completely different amino acid sequence from ATXN1. We introduced a hemagglutinin tag in-frame with Alt-ATXN1 in ATXN1 cDNA and showed in cell culture the co-expression of both ATXN1 and Alt-ATXN1. Remarkably, Alt-ATXN1 colocalized and interacted with ATXN1 in nuclear inclusions. In contrast, in the absence of ATXN1 expression, Alt-ATXN1 displays a homogenous nucleoplasmic distribution. Alt-ATXN1 interacts with poly(A)+ RNA, and its nuclear localization is dependent on RNA transcription. Polyclonal antibodies raised against Alt-ATXN1 confirmed the expression of Alt-ATXN1 in human cerebellum expressing ATXN1. These results demonstrate that human ATXN1 gene is a dual coding sequence and that ATXN1 interacts with and controls the subcellular distribution of Alt-ATXN1. PMID:23760502
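The idea of a second open reading frame overlapping the main one in a shifted frame can be made concrete with a small scanner. The sketch below is only an illustration under stated assumptions: `find_orfs` is a hypothetical helper, the 18-nt input is a toy sequence built for this example (not ATXN1), and only the three forward frames and the standard stop codons are considered.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons

def find_orfs(seq, min_codons=2):
    """Return (frame, start, end) for ATG..stop ORFs in the three forward frames.

    frame is 1-based (+1, +2, +3); start/end are 0-based, end exclusive and
    including the stop codon. Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open an ORF at the first ATG
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame + 1, start, i + 3))
                start = None                   # close it at the first in-frame stop
    return orfs

# Toy sequence: a +1 ORF spanning the whole sequence and a short +3 ORF nested inside it.
toy = "ATGGCATGCCTTAAGTGA"
print(find_orfs(toy))
```

On the toy input the scanner reports one ORF in frame +1 and a second, shorter ORF in frame +3 that lies entirely within it, which mirrors the dual-coding arrangement described in the abstract.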
Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

USDA-ARS's Scientific Manuscript database

Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Microsatellite analysis in the genome of Acanthaceae: An in silico approach.

PubMed

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites present in complete genome sequences may play a direct role in genome organization, recombination, gene regulation, quantitative genetic variation, and the evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites, and its built-in Primer3 module was used for primer design. In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats. The identified microsatellites and primers might be useful for future breeding and genetic studies of plants that belong to the Acanthaceae family.
Isolation and mapping of telomeric pentanucleotide (TAACC)n repeats of the Pacific whiteleg shrimp, Penaeus vannamei, using fluorescence in situ hybridization.

PubMed

Alcivar-Warren, Acacia; Meehan-Meola, Dawn; Wang, Yongping; Guo, Ximing; Zhou, Linghua; Xiang, Jianhai; Moss, Shaun; Arce, Steve; Warren, William; Xu, Zhenkang; Bell, Kireina

2006-01-01

To develop genetic and physical maps for shrimp, accurate information on the actual number of chromosomes and a large number of genetic markers is needed. Previous reports have shown two different chromosome numbers for the Pacific whiteleg shrimp, Penaeus vannamei, the most important penaeid shrimp species cultured in the Western hemisphere. Preliminary results obtained by direct sequencing of clones from a Sau3A-digested genomic library of P. vannamei ovary identified a large number of (TAACC/GGTTA)-containing SSRs. The objectives of this study were to (1) examine the frequency of (TAACC)n repeats in 662 P. vannamei genomic clones that were directly sequenced, and perform homology searches of these clones, (2) confirm the number of chromosomes in testis of P. vannamei, and (3) localize the TAACC repeats in P. vannamei chromosome spreads using fluorescence in situ hybridization (FISH). Results for objective 1 showed that 395 out of the 662 clones sequenced contained single or multiple SSRs with three or more repeat motifs, 199 of which contained variable tandem repeats of the pentanucleotide (TAACC/GGTTA)n, with 3 to 14 copies per sequence. The frequency of (TAACC)n repeats in P. vannamei is 4.68 kb for SSRs with five or more repeat motifs. Sequence comparisons using the BLASTN nonredundant and expressed sequence tag (EST) databases indicated that most of the TAACC-containing clones were similar to either the core pentanucleotide repeat in PVPENTREP locus (GenBank accession no. X82619) or portions of 28S rRNA. Transposable elements (transposase for Tn1000 and reverse transcriptase family members), hypothetical or unnamed protein products, and genes of known function such as 18S and 28S rRNAs, heat shock protein 70, and thrombospondin were identified in non-TAACC-containing clones. For objective 2, the meiotic chromosome number of P. vannamei was confirmed as N = 44. 
For objective 3, four FISH probes (P1 to P4) containing different numbers of TAACC repeats produced positive signals on telomeres of P. vannamei chromosomes. A few chromosomes had positive signals interstitially. Probe signal strength and chromosome coverage differed in the general order of P1>P2>P3>P4, which correlated with the length of TAACC repeats within the probes: 83, 66, 35, and 30 bp, respectively, suggesting that the TAACC repeats, and not the flanking sequences, produced the TAACC signals at chromosome ends and TAACC is likely the telomere sequence for P. vannamei.

Complete mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae).

PubMed

Kim, Min Jee; Im, Hyun Hwak; Lee, Kwang Youll; Han, Yeon Soo; Kim, Iksoo

2014-06-01

The complete nucleotide sequence of the mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae), was determined. The 20,319-bp long circular genome is the longest among completely sequenced Coleoptera. As is typical in animals, the P. brevitarsis genome consisted of two ribosomal RNAs, 22 transfer RNAs, 13 protein-coding genes and one A + T-rich region. Although the size of the coding genes was typical, the non-coding A + T-rich region was 5654 bp, which is the longest in insects. The extraordinary length of this region was due to 28 copies of a 117-bp tandem repeat and seven copies of an 82-bp tandem repeat. These repeat sequences were encompassed by three non-repeat sequences constituting 1804 bp.
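The lengths quoted in this record can be cross-checked with simple arithmetic. The sketch below assumes the repeat figures are read as 28 copies of a 117-bp unit and 7 copies of an 82-bp unit; that reading is an interpretation of the abstract text, adopted here because it is the one that reproduces the stated 5654-bp total.

```python
# Assumed reading of the abstract: 28 copies x 117 bp and 7 copies x 82 bp.
repeat_blocks = {"117-bp repeat": (28, 117), "82-bp repeat": (7, 82)}
non_repeat_bp = 1804       # three non-repeat sequences, as stated in the abstract
at_rich_region_bp = 5654   # total A+T-rich region length, as stated in the abstract

# Sum the tandem-repeat contribution, then add the non-repeat sequence.
repeat_bp = sum(copies * unit for copies, unit in repeat_blocks.values())
total_bp = repeat_bp + non_repeat_bp

print(repeat_bp, total_bp, total_bp == at_rich_region_bp)  # 3850 5654 True
```

The repeats contribute 3276 + 574 = 3850 bp, and adding the 1804 bp of non-repeat sequence gives exactly 5654 bp.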
Cellular protein quality control and the evolution of aggregates in spinocerebellar ataxia type 3 (SCA3).

PubMed

Seidel, K; Meister, M; Dugbartey, G J; Zijlstra, M P; Vinet, J; Brunt, E R P; van Leeuwen, F W; Rüb, U; Kampinga, H H; den Dunnen, W F A

2012-10-01

A characteristic of polyglutamine diseases is the increased propensity of disease proteins to aggregate, which is thought to be a major contributing factor to the underlying neurodegeneration. Healthy cells contain mechanisms for handling protein damage, the protein quality control, which must be impaired or inefficient to permit proteotoxicity under pathological conditions. We used a quantitative analysis of immunohistochemistry of the pons of eight patients with the polyglutamine disorder spinocerebellar ataxia type 3. We employed the anti-polyglutamine antibody 1C2, antibodies against p62 that is involved in delivering ubiquitinated protein aggregates to autophagosomes, antibodies against the chaperones HSPA1A and DNAJB1 and the proteasomal stress marker UBB⁺¹. The 1C2 antibody stained neuronal nuclear inclusions (NNIs), diffuse nuclear staining (DNS), granular cytoplasmic staining (GCS) and combinations, with reproducible distribution. P62 always co-localized with 1C2 in NNI. DNS and GCS co-stained with a lower frequency. UBB⁺¹ was present in a subset of neurones with NNI. A subset of UBB⁺¹-containing neurones displayed increased levels of HSPA1A, while DNAJB1 was sequestered into the NNI. Based on our results, we propose a model for the aggregation-associated pathology of spinocerebellar ataxia type 3: GCS and DNS aggregation likely represents early stages of pathology, which progresses towards formation of p62-positive NNI. A fraction of NNI exhibits UBB⁺¹ staining, implying proteasomal overload at a later stage. Subsequently, the stress-inducible HSPA1A is elevated while DNAJB1 is recruited into NNIs. This indicates that the stress response is only induced late when all endogenous protein quality control systems have failed. © 2011 The Authors. Neuropathology and Applied Neurobiology © 2011 British Neuropathological Society.
The de-ubiquitinating enzyme ataxin-3 does not modulate disease progression in a knock-in mouse model of Huntington disease.

PubMed

Zeng, Li; Tallaksen-Greene, Sara J; Wang, Bo; Albin, Roger L; Paulson, Henry L

2013-01-01

Ataxin-3 is a deubiquitinating enzyme (DUB) that participates in ubiquitin-dependent protein quality control pathways and, based on studies in model systems, may be neuroprotective against toxic polyglutamine proteins such as the Huntington's disease (HD) protein, huntingtin (htt). HD is one of at least nine polyglutamine neurodegenerative diseases in which disease-causing proteins accumulate in ubiquitin-positive inclusions within neurons. In studies crossing mice null for ataxin-3 to an established HD knock-in mouse model (HdhQ200), we tested whether loss of ataxin-3 alters disease progression, perhaps by impairing the clearance of mutant htt or the ubiquitination of inclusions. While loss of ataxin-3 mildly exacerbated age-dependent motor deficits, it did not alter inclusion formation, ubiquitination of inclusions or levels of mutant or normal htt. Ataxin-3, itself a polyglutamine-containing protein with multiple ubiquitin binding domains, was not observed to localize to htt inclusions. Changes in neurotransmitter receptor binding known to occur in HD knock-in mice also were not altered by the loss of ataxin-3, although we unexpectedly observed increased GABAA receptor binding in the striatum of HdhQ200 mice, which has not previously been noted. Finally, we confirmed that CNS levels of hsp70 are decreased in HD mice as has been reported in other HD mouse models, regardless of the presence or absence of ataxin-3. We conclude that while ataxin-3 may participate in protein quality control pathways, it does not critically regulate the handling of mutant htt or contribute to major features of disease pathogenesis in HD.
Focused cerebellar laser light induced hyperthermia improves symptoms and pathology of polyglutamine disease SCA1 in a mouse model.

PubMed

Hearst, Scoty M; Shao, Qingmei; Lopez, Mariper; Raucher, Drazen; Vig, Parminder J S

2014-10-01

Spinocerebellar ataxia 1 (SCA1) results from pathologic glutamine expansion in the ataxin-1 protein (ATXN1). This misfolded ATXN1 causes severe Purkinje cell (PC) loss and cerebellar ataxia in both humans and mice with the SCA1 disease. The molecular chaperone heat-shock proteins (HSPs) are known to modulate polyglutamine protein aggregation and are neuroprotective. Since HSPs are induced under stress, we explored the effects of focused laser light induced hyperthermia (HT) on HSP-mediated protection against ATXN1 toxicity. We first tested the effects of HT in a cell culture model and found that HT induced Hsp70 and increased its localization to nuclear inclusions in HeLa cells expressing GFP-ATXN1[82Q]. HT treatment decreased ATXN1 aggregation by making GFP-ATXN1[82Q] inclusions smaller and more numerous compared to non-treated cells. Further, we tested our HT approach in vivo using a transgenic (Tg) mouse model of SCA1. We found that our laser method increased cerebellar temperature from 38 to 40 °C without causing any neuronal damage or inflammatory response. Interestingly, mild cerebellar HT stimulated the production of Hsp70 to a significant level. Furthermore, repeated exposure of heterozygous SCA1 Tg mice to focused cerebellar laser light induced HT significantly suppressed the SCA1 phenotype as compared to sham-treated control animals. Moreover, in treated SCA1 Tg mice, the levels of the PC calcium signaling/buffering protein calbindin-D28k markedly increased, followed by a reduction in PC neurodegenerative morphology. Taken together, our data suggest that laser light induced HT is a novel non-invasive approach to treat SCA1 and possibly other polyglutamine disorders.
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

PubMed Central

Anwar, Tamanna; Khan, Asad U

2006-01-01

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker-assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and produces output giving their exact position on the chromosome and their frequency of occurrence in the sequence. Availability: This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com PMID:17597863
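To illustrate the kind of report this abstract describes (repeat unit, copy number and exact position), here is a minimal Python sketch. It is not SSRscanner itself, which is written in PERL and available from its authors. The function name `find_ssrs` and the parameters `max_unit` and `min_repeats` are assumptions made for this example, and only perfect (uninterrupted) repeats are reported.

```python
import re

def find_ssrs(seq, max_unit=3, min_repeats=6):
    """Return (start, end, unit, copies) for each perfect SSR tract in seq.

    Positions are 1-based and inclusive. Hypothetical helper, not SSRscanner.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # One unit of unit_len bases followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter unit (e.g. "AA", "AAA").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            copies = len(m.group(0)) // unit_len
            hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)
```

Because the regular expression reports the leftmost match, a tract may be reported under a rotated unit (for example GCA rather than CAG) when the base just before it happens to extend the repeat. A production tool would need to normalize such rotations.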
Structure and stability of the ankyrin domain of the Drosophila Notch receptor.

PubMed

Zweifel, Mark E; Leahy, Daniel J; Hughson, Frederick M; Barrick, Doug

2003-11-01

The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 angstroms resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal effect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
Molecular architecture of classical cytological landmarks: Centromeres and telomeres

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meyne, J.

1994-11-01

Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library is screened. A problem with this approach is that many tandem repeats don't have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to C₀t 50. Theoretically only repetitive DNA sequences should renature under C₀t 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.
Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus

PubMed Central

Wei, Yunzhou; Chesne, Megan T.; Terns, Rebecca M.; Terns, Michael P.

2015-01-01

CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100–500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems. PMID:25589547
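The array layout described in this abstract (a leader, then spacers sandwiched between copies of a direct repeat, with new spacers added immediately downstream of the leader) can be sketched as a toy data structure. This is only an illustrative model. The class name `CrisprArray` and its methods are hypothetical, and the strings in the usage example are placeholders, not real S. thermophilus sequences.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CrisprArray:
    """Toy model: leader, then repeat-spacer units, closed by a final repeat."""
    leader: str
    repeat: str
    spacers: List[str] = field(default_factory=list)  # index 0 is leader-proximal

    def acquire(self, new_spacer: str) -> None:
        # New spacers enter at the leader-proximal end, so the newest comes first.
        self.spacers.insert(0, new_spacer)

    def sequence(self) -> str:
        # Every spacer sits between two repeats, so n spacers need n + 1 repeats.
        units = "".join(self.repeat + s for s in self.spacers)
        return self.leader + units + self.repeat

array = CrisprArray(leader="LEADER", repeat="RR")
array.acquire("older")
array.acquire("newer")
print(array.sequence())
```

With no spacers the model reduces to a leader plus a single repeat, which the abstract reports as the minimal locus sufficient for adaptation in this system.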
Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

PubMed

Waye, J S; Willard, H F

1986-09-01

The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.
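The figures quoted in this abstract imply an average monomer length and a total array size, which the short calculation below makes explicit. The 2.7 kb value is rounded in the abstract, so the results are approximate, and the variable names are chosen only for this example.

```python
HOR_BP = 2700          # 2.7-kilobase-pair higher-order repeat (from the abstract)
MONOMERS_PER_HOR = 16  # alphoid monomers per repeat unit (from the abstract)
COPIES = (500, 1000)   # copies per chromosome 17 (from the abstract)

monomer_bp = HOR_BP / MONOMERS_PER_HOR        # average monomer length
array_bp = tuple(n * HOR_BP for n in COPIES)  # total span of the tandem array
short_hor_bp = HOR_BP - monomer_bp            # 15-monomer repeat: one monomer removed

print(f"monomer ~ {monomer_bp:.0f} bp")
print(f"array   ~ {array_bp[0] / 1e6:.2f} to {array_bp[1] / 1e6:.2f} Mb")
print(f"15-mer  ~ {short_hor_bp:.0f} bp")
```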
Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

PubMed

Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

1989-09-01

A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different sized Dictyostelium DNA restriction fragments indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts are reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
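The point that a single (AAC)n tract can encode three different homopolymers, depending on reading frame, can be checked with a short sketch. The codon assignments come from the standard genetic code. The function name `translate_frames` is an assumption for this example, and only the three codons that occur in an (AAC)n tract are tabulated.

```python
# Only the codons that occur in an (AAC)n tract, from the standard genetic code.
CODONS = {"AAC": "N", "ACA": "T", "CAA": "Q"}  # Asn, Thr, Gln

def translate_frames(dna):
    """Translate dna in forward frames 0, 1 and 2; other codons become 'X'."""
    frames = []
    for offset in range(3):
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames.append("".join(CODONS.get(c, "X") for c in codons))
    return frames

tract = "AAC" * 6
for offset, protein in enumerate(translate_frames(tract)):
    print(offset, protein)
```

Frame 0 yields an asparagine run, frame 1 a threonine run and frame 2 a glutamine run, matching the three cluster types listed in the abstract.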
Design, production and molecular structure of a new family of artificial alpha-helicoidal repeat proteins (αRep) based on thermostable HEAT-like repeats.

PubMed

Urvoas, Agathe; Guellouz, Asma; Valerio-Lepiniec, Marie; Graille, Marc; Durand, Dominique; Desravines, Danielle C; van Tilbeurgh, Herman; Desmadril, Michel; Minard, Philippe

2010-11-26

Repeat proteins have a modular organization and a regular architecture that make them attractive models for design and directed evolution experiments. HEAT repeat proteins, although very common, have not been used as a scaffold for artificial proteins, probably because they are made of long and irregular repeats. Here, we present and validate a consensus sequence for artificial HEAT repeat proteins. The sequence was defined from the structure-based sequence analysis of a thermostable HEAT-like repeat protein. Appropriate sequences were identified for the N- and C-caps. A library of genes coding for artificial proteins based on this sequence design, named αRep, was assembled using new and versatile methodology based on circular amplification. Proteins picked randomly from this library are expressed as soluble proteins. The biophysical properties of proteins with different numbers of repeats and different combinations of side chains in hypervariable positions were characterized. Circular dichroism and differential scanning calorimetry experiments showed that all these proteins are folded cooperatively and are very stable (T(m) >70 °C). Stability of these proteins increases with the number of repeats. Detailed gel filtration and small-angle X-ray scattering studies showed that the purified proteins form either monomers or dimers. The X-ray structure of a stable dimeric variant structure was solved. The protein is folded with a highly regular topology and the repeat structure is organized, as expected, as pairs of alpha helices. In this protein variant, the dimerization interface results directly from the variable surface enriched in aromatic residues located in the randomized positions of the repeats. The dimer was crystallized both in an apo and in a PEG-bound form, revealing a very well defined binding crevice and some structure flexibility at the interface. 
This fortuitous binding site could later prove to be a useful binding site for other low molecular mass partners. Copyright © 2010 Elsevier Ltd. All rights reserved.
Fluorescence lifetime dynamics of enhanced green fluorescent protein in protein aggregates with expanded polyglutamine

NASA Astrophysics Data System (ADS)

Ghukasyan, Vladimir; Hsu, Chih-Chun; Liu, Chia-Rung; Kao, Fu-Jen; Cheng, Tzu-Hao

2010-01-01

Protein aggregation is one of the characteristic steps in a number of neurodegenerative diseases, eventually leading to neuronal death, and thorough study of aggregation is required for the development of effective therapy. We apply fluorescence lifetime imaging to characterize the fluorescence dynamics of the enhanced green fluorescent protein (eGFP) fused to an expanded polyglutamine (polyQ) stretch. When expanded beyond 39 residues, polyQ has an inherent propensity to form amyloid-like fibrils and aggregates, and is responsible for Huntington's disease. The results of the experiments show that expression of eGFP in fusion with the 97Q protein decreases the eGFP fluorescence lifetime by ~300 ps. This phenomenon does not appear in Hsp104-deficient cells, where polyQ aggregation is prevented. We demonstrate that the observed lifetime decrease is related to the aggregation per se and discuss the possible role of refractive index and homo-FRET in these dynamics.
Unbiased screen identifies aripiprazole as a modulator of abundance of the polyglutamine disease protein, ataxin-3

PubMed Central

Costa, Maria do Carmo; Ashraf, Naila S.; Fischer, Svetlana; Yang, Yemen; Schapka, Emily; Joshi, Gnanada; McQuade, Thomas J.; Dharia, Rahil M.; Dulchavsky, Mark; Ouyang, Michelle; Cook, David; Sun, Duxin; Larsen, Martha J.; Gestwicki, Jason E.; Todi, Sokol V.; Ivanova, Magdalena I.; Paulson, Henry L.

2016-01-01

No disease-modifying treatment exists for the fatal neurodegenerative polyglutamine disease known both as Machado-Joseph disease and spinocerebellar ataxia type 3. As a potential route to therapy, we identified small molecules that reduce levels of the mutant disease protein, ATXN3. Screens of a small molecule collection, including 1250 Food and Drug Administration-approved drugs, in a novel cell-based assay, followed by secondary screens in brain slice cultures from transgenic mice expressing the human disease gene, identified the atypical antipsychotic aripiprazole as one of the hits. Aripiprazole increased longevity in a Drosophila model of Machado-Joseph disease and effectively reduced aggregated ATXN3 species in flies and in brains of transgenic mice treated for 10 days. The aripiprazole-mediated decrease in ATXN3 abundance may reflect a complex response culminating in the modulation of specific components of cellular protein homeostasis. Aripiprazole represents a potentially promising therapeutic drug for Machado-Joseph disease and possibly other neurological proteinopathies. PMID:27645800
Stem Cell-Based Therapies for Polyglutamine Diseases.

PubMed

Mendonça, Liliana S; Onofre, Isabel; Miranda, Catarina Oliveira; Perfeito, Rita; Nóbrega, Clévio; de Almeida, Luís Pereira

2018-01-01

Polyglutamine (polyQ) diseases are a family of neurodegenerative disorders with very heterogeneous clinical presentations, although with common features such as progressive neuronal death. Thus, at the time of diagnosis patients might present an extensive and irreversible neuronal death demanding cell replacement or support provided by cell-based therapies. For this purpose stem cells, which include diverse populations ranging from embryonic stem cells (ESCs) to fetal stem cells, mesenchymal stromal cells (MSCs) or induced pluripotent stem cells (iPSCs), have remarkable potential to promote extensive brain regeneration and recovery in neurodegenerative disorders. This regenerative potential has been demonstrated in exciting preclinical and clinical assays. However, despite these promising results, several drawbacks are hampering their successful clinical implementation. These include ethical issues, quality control of the cells used and the lack of reliable models for the efficacy assessment of human stem cells. In this chapter the main advantages and disadvantages of the available sources of stem cells as well as their efficacy and potential to improve disease outcomes are discussed.
PML clastosomes prevent nuclear accumulation of mutant ataxin-7 and other polyglutamine proteins

PubMed Central

Janer, Alexandre; Martin, Elodie; Muriel, Marie-Paule; Latouche, Morwena; Fujigasaki, Hiroto; Ruberg, Merle; Brice, Alexis; Trottier, Yvon; Sittler, Annie

2006-01-01

The pathogenesis of spinocerebellar ataxia type 7 and other neurodegenerative polyglutamine (polyQ) disorders correlates with the aberrant accumulation of toxic polyQ-expanded proteins in the nucleus. Promyelocytic leukemia protein (PML) nuclear bodies are often present in polyQ aggregates, but their relation to pathogenesis is unclear. We show that expression of PML isoform IV leads to the formation of distinct nuclear bodies enriched in components of the ubiquitin-proteasome system. These bodies recruit soluble mutant ataxin-7 and promote its degradation by proteasome-dependent proteolysis, thus preventing the aggregate formation. Inversely, disruption of the endogenous nuclear bodies with cadmium increases the nuclear accumulation and aggregation of mutant ataxin-7, demonstrating their role in ataxin-7 turnover. Interestingly, β-interferon treatment, which induces the expression of endogenous PML IV, prevents the accumulation of transiently expressed mutant ataxin-7 without affecting the level of the endogenous wild-type protein. Therefore, clastosomes represent a potential therapeutic target for preventing polyQ disorders. PMID:16818720
Large-scale microfluidics providing high-resolution and high-throughput screening of Caenorhabditis elegans poly-glutamine aggregation model

NASA Astrophysics Data System (ADS)

Mondal, Sudip; Hegarty, Evan; Martin, Chris; Gökçe, Sertan Kutal; Ghorashian, Navid; Ben-Yakar, Adela

2016-10-01

Next generation drug screening could benefit greatly from in vivo studies, using small animal models such as Caenorhabditis elegans for hit identification and lead optimization. Current in vivo assays can operate either at low throughput with high resolution or with low resolution at high throughput. To enable both high-throughput and high-resolution imaging of C. elegans, we developed an automated microfluidic platform. This platform can image 15 z-stacks of ~4,000 C. elegans from 96 different populations using a large-scale chip with a micron resolution in 16 min. Using this platform, we screened ~100,000 animals of the poly-glutamine aggregation model on 25 chips. We tested the efficacy of ~1,000 FDA-approved drugs in improving the aggregation phenotype of the model and identified four confirmed hits. This robust platform now enables high-content screening of various C. elegans disease models at the speed and cost of in vitro cell-based assays.
TRedD—A database for tandem repeats over the edit distance

PubMed Central

Sokol, Dina; Atagun, Firat

2010-01-01

A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of ‘evolutive tandem repeats’. In addition, we have developed a tool, called TandemGraph, to graphically depict the repeats occurring in a sequence. This tool can be coupled with any repeat finding software, and it should greatly facilitate analysis of results. Database URL: http://tandem.sci.brooklyn.cuny.edu/ PMID:20624712
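One way to make "approximate copies" concrete is the edit (Levenshtein) distance. The sketch below checks whether a list of consecutive copies forms a repeat in which each copy lies within k edits of the copy before it. Comparing each copy with its predecessor is an assumption about what "evolutive" means here, and the code is not TRedD's algorithm. The function names are hypothetical, and the copies are supplied already split rather than located in a genome.

```python
def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]

def is_approximate_tandem_repeat(copies, k):
    """True if there are at least two copies and each is within k edits of its predecessor."""
    if len(copies) < 2:
        return False
    return all(edit_distance(x, y) <= k for x, y in zip(copies, copies[1:]))
```

Under this reading, the copies ACGT, ACGA, ACCA qualify with k = 1 even though the first and last copies differ by two edits, which is how gradual drift along a repeat can be accommodated.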
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae.

PubMed

Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta

2012-11-07

Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. 
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.
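The shared hexapeptide (E/D)FEPRP can be written as a regular expression, which gives a simple way to locate candidate repeat units in a protein sequence. This is a sketch only. The function name `find_motif` is an assumption, and it matches the hexapeptide alone, not the following partially conserved residues or the conserved tyrosine.

```python
import re

# (E/D)FEPRP from the abstract, written as a regular-expression character class.
ST_MOTIF = re.compile(r"[ED]FEPRP")

def find_motif(protein):
    """Return 1-based start positions of every (E/D)FEPRP match in protein."""
    return [m.start() + 1 for m in ST_MOTIF.finditer(protein.upper())]

# Made-up toy sequence for illustration, not a real ST protein.
toy = "MKAEFEPRPAAAAYDFEPRPGGGG"
print(find_motif(toy))
```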
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae

PubMed Central

2012-01-01

Background: Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results: ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions: We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. 
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found. PMID:23134664

A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. The sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the polyalanine repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
A theory that may explain the Hayflick limit--a means to delete one copy of a repeating sequence during each cell cycle in certain human cells such as fibroblasts.

PubMed

Naveilhan, P; Baudet, C; Jabbour, W; Wion, D

1994-09-01

A model that may explain the limited division potential of certain cells such as human fibroblasts in culture is presented. The central postulate of this theory is that there exists, prior to certain key exons that code for materials needed for cell division, a unique sequence of specific repeating segments of DNA. One copy of such repeating segments is deleted during each cell cycle in cells that are not protected from such deletion through methylation of their cytosine residues. According to this theory, the means through which such repeated sequences are removed, one per cycle, is through the sequential action of enzymes that act much as bacterial restriction enzymes do--namely to produce scissions in both strands of DNA in areas that correspond to the DNA base sequence recognition specificities of such enzymes. After the first scission early in a replicative cycle, that enzyme becomes inhibited, but the cleavage of the first site exposes the closest site in the repetitive element to the action of a second restriction enzyme after which that enzyme also becomes inhibited. Then repair occurs, regenerating the original first site. Through this sequential activation and inhibition of two different restriction enzymes, only one copy of the repeating sequence is deleted during each cell cycle. In effect, the repeating sequence operates as a precise counter of the numbers of cell doubling that have occurred since the cells involved differentiated during development.
Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species.

PubMed

Kuipers, A G J; Kamstra, S A; de Jeu, M J; Visser, R G F

2002-01-01

Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragments with sizes varying from 68-127 bp, and constituted a larger HinfI repeat of approximately 400 bp. Southern hybridization showed a similar molecular organization of the tandem repeats in each of the Brazilian Alstroemeria species tested. None of the repeats hybridized with DNA from Chilean Alstroemeria species, which indicates that they are specific for the Brazilian species. In-situ localization studies revealed the tandem repeats to be localized in clusters on the chromosomes of A. inodora and A. psittacina: distal hybridization sites were found on chromosome arms 2PS, 6PL, 7PS, 7PL and 8PL, interstitial sites on chromosome arms 2PL, 3PL, 4PL and 5PL. The applicability of the tandem repeats for cytogenetic analysis of interspecific hybrids and their role in heterochromatin organization are discussed.
Accurate typing of short tandem repeats from genome-wide sequencing data and its applications.

PubMed

Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D

2015-05-01

Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.
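The idea of typing an STR from the sequence between its two flanks can be illustrated with a toy example. This is not the STR-FM pipeline, which relies on read-mapping algorithms and an error correction model; it is a minimal sketch assuming exact flank matches within a single read, and the function name, flank sequences and motif are hypothetical.

```python
def repeat_count_between_flanks(read, left_flank, right_flank, motif):
    """Return the number of whole motif copies between two flanks, or None.

    Illustrative only: uses exact string matching, whereas a real
    pipeline would align the flanks and tolerate sequencing errors.
    """
    start = read.find(left_flank)
    if start == -1:
        return None  # left flank not present in this read
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None  # read does not span the whole repeat tract
    tract = read[start:end]
    copies, remainder = divmod(len(tract), len(motif))
    # Accept only a pure, uninterrupted tract of the expected motif.
    if remainder != 0 or tract != motif * copies:
        return None
    return copies


read = "GGATCC" + "CAG" * 5 + "TTAACC"
print(repeat_count_between_flanks(read, "GGATCC", "TTAACC", "CAG"))
```

Requiring both flanks in the same read is what ties the longest typeable allele to the read length, consistent with the abstract's note that the required depth depends on repeat length and read length.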
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

USDA-ARS's Scientific Manuscript database

Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Are the TTAGG and TTAGGG telomeric repeats phylogenetically conserved in aculeate Hymenoptera?

NASA Astrophysics Data System (ADS)

Menezes, Rodolpho S. T.; Bardella, Vanessa B.; Cabral-de-Mello, Diogo C.; Lucena, Daercio A. A.; Almeida, Eduardo A. B.

2017-10-01

Although the (TTAGG)n telomeric repeat is thought to be the ancestral DNA motif of telomeres in insects, it was repeatedly lost within some insect orders. Notably, parasitoid hymenopterans and the social wasp Metapolybia decorata (Gribodo) lack the (TTAGG)n sequence, but the motif has been detected in other representatives of Hymenoptera, such as different ant species and the honeybee. These findings raise the question of whether the insect telomeric repeat is phylogenetically predominant in Hymenoptera. Thus, we evaluated the occurrence of both the (TTAGG)n sequence and the vertebrate telomere sequence (TTAGGG)n using dot-blotting hybridization in 25 aculeate species of Hymenoptera. Our results revealed the absence of (TTAGG)n sequence in all tested species, elevating the number of hymenopteran families lacking this telomeric sequence to 13 out of the 15 tested families so far. The (TTAGGG)n was not observed in any tested species. Based on our data and compiled information, we suggest that the (TTAGG)n sequence was putatively lost in the ancestor of Apocrita with at least two subsequent independent regains (in Formicidae and Apidae).
Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

PubMed Central

Macas, Jiří; Neumann, Pavel; Navrátilová, Alice

2007-01-01

Background Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. 
These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining. PMID:18031571
Molecular basis of length polymorphism in the human zeta-globin gene complex.

PubMed Central

Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

1983-01-01

The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
Complete mitochondrial genome of the larch hawk moth, Sphinx morio (Lepidoptera: Sphingidae).

PubMed

Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo

2013-12-01

The larch hawk moth, Sphinx morio, belongs to the lepidopteran family Sphingidae that has long been studied as a family of model insects in diverse fields. In this study, we describe the complete mitochondrial genome (mitogenome) sequences of the species in terms of general genomic features and characteristic short repetitive sequences found in the A + T-rich region. The 15,299-bp-long genome consisted of a typical set of genes (13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes) and one major non-coding A + T-rich region, with the typical arrangement found in Lepidoptera. The 316-bp-long A + T-rich region located between srRNA and tRNA(Met) harbored the conserved sequence blocks that are typically found in lepidopteran insects. Additionally, the A + T-rich region of S. morio contained three characteristic repeat sequences that are rarely found in Lepidoptera: two identical 12-bp repeats, three identical 5-bp-long tandem repeats, and six nearly identical 5-6 bp long repeat sequences.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variations have been well-studied in some bacteria, many other species of prokaryotes have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes and extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
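As a rough illustration of the SSR definition given above (tandem repeats of 1-6 bp motifs), the sketch below scans a nucleotide string with a regular expression. The function name `find_ssrs`, the `min_copies` threshold and the example sequence are illustrative assumptions and are not taken from PSSRdb or its extraction method.

```python
import re


def find_ssrs(seq, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for tandem repeats of 1..max_motif bp motifs.

    Hypothetical helper for illustration; the thresholds are arbitrary.
    """
    seq = seq.upper()
    # The motif group is lazy so the shortest repeating unit is tried first
    # (e.g. "A" rather than "AA" for a poly-A run).
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_ssrs("TTCAGCAGCAGCAGTT"))
```

Running the same scan on the corresponding locus in several strains and comparing the copy counts is one simple way to flag a length-polymorphic SSR of the kind the database catalogues.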
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed Central

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-01-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. PMID:3016521
Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage.

PubMed

Dvořáková, Zuzana; Vorlíčková, Michaela; Renčiuk, Daniel

2017-11-01

The DNA lesions, resulting from oxidative damage, were shown to destabilize human telomere four-repeat quadruplex and to alter its structure. Long telomere DNA, as a repetitive sequence, offers, however, other mechanisms of dealing with the lesion: extrusion of the damaged repeat into loop or shifting the quadruplex position by one repeat. Using circular dichroism and UV absorption spectroscopy and polyacrylamide electrophoresis, we studied consequences of lesions at different positions of the model five-repeat human telomere DNA sequences on the structure and stability of their quadruplexes in sodium and in potassium. The repeats affected by lesion are preferentially positioned as terminal overhangs of the core quadruplex structurally similar to the four-repeat one. Forced affecting of the inner repeats leads to presence of variety of more parallel folds in potassium. In sodium the designed models form mixture of two dominant antiparallel quadruplexes whose population varies with the position of the affected repeat. The shapes of quadruplex CD spectra, namely the height of dominant peaks, significantly correlate with melting temperatures. Lesion in one guanine tract of a more than four repeats long human telomere DNA sequence may cause re-positioning of its quadruplex arrangement associated with a shift of the structure to less common quadruplex conformations. The type of the quadruplex depends on the loop position and external conditions. The telomere DNA quadruplexes are quite resistant to the effect of point mutations due to the telomere DNA repetitive nature, although their structure and, consequently, function might be altered. Copyright © 2017. Published by Elsevier B.V.
Microsatellite analysis in the genome of Acanthaceae: An in silico approach

PubMed Central

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from National Center for Biotechnology Information database and screened for the presence of SSRs. SSR Locator tool was used to predict the microsatellites and inbuilt Primer3 module was used for primer designing. Results: Totally 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and the occurrence of dinucleotide repeats was found to be abundant in the genome sequences. The essential amino acid isoleucine was found rich in all the sequences. We also designed the SSR-based primers/markers for 59 sequences of this family that contains microsatellite repeats in their genome. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to Acanthaceae family in the future. PMID:25709226
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
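The notion of iteratively collapsing tandem repeats of increasing unit length can be sketched as follows. This is a simplified illustration and not the authors' compression scheme: the function name, the `max_unit` parameter and the greedy left-to-right scan are assumptions made for the example.

```python
def collapse_tandem_repeats(seq, max_unit=3):
    """Collapse adjacent duplicate units of length 1..max_unit, shortest first.

    Lossy by construction: copy numbers are discarded, so the original
    sequence cannot be recovered from the result.
    """
    for k in range(1, max_unit + 1):
        out = []
        i = 0
        while i < len(seq):
            unit = seq[i:i + k]
            if len(unit) == k and seq[i + k:i + 2 * k] == unit:
                # Keep one copy of the unit and skip every adjacent copy.
                j = i + k
                while seq[j:j + k] == unit:
                    j += k
                out.append(unit)
                i = j
            else:
                out.append(seq[i])
                i += 1
        seq = "".join(out)
    return seq


print(collapse_tandem_repeats("AAACAGCAGCAGTT"))
```

Two sequences that differ only in the copy number of a tandem unit collapse to the same string, so a standard alignment algorithm applied to the collapsed forms no longer penalizes the duplication event residue by residue.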
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.

PubMed

Brzuzan, P

2000-06-01

Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ding, Ying; Adachi, Hiroaki, E-mail: hadachi-ns@umin.org; Department of Neurology, University of Occupational and Environmental Health School of Medicine, 1-1 Iseigaoka, Yahata-nishi-ku, Kitakyushu 807-8555

Spinal and bulbar muscular atrophy (SBMA) is an inherited motor neuron disease caused by the expansion of a polyglutamine (polyQ)-encoding tract within the androgen receptor (AR) gene. The pathologic features of SBMA are motor neuron loss in the spinal cord and brainstem and diffuse nuclear accumulation and nuclear inclusions of mutant AR in residual motor neurons and certain visceral organs. Hepatocyte growth factor (HGF) is a polypeptide growth factor which has neuroprotective properties. To investigate whether HGF overexpression can affect disease progression in a mouse model of SBMA, we crossed SBMA transgenic model mice expressing an AR gene with an expanded CAG repeat with mice overexpressing HGF. Here, we report that high expression of HGF induces Akt phosphorylation and modestly ameliorates motor symptoms in an SBMA transgenic mouse model treated with or without castration. These findings suggest that HGF overexpression can provide a potential therapeutic avenue as a combination therapy with disease-modifying therapies in SBMA. - Highlights: • HGF overexpression ameliorates the motor phenotypes of the SBMA mouse model. • HGF overexpression induces Akt phosphorylation in the SBMA mouse model. • This is the first report of combination therapy in a mouse model of polyQ diseases.
Gas-Phase Folding of Small Glutamine Containing Peptides: Sidechain Hydrogen Bonding Stabilizes β-turns

NASA Astrophysics Data System (ADS)

Walsh, Patrick S.; Blodgett, Karl N.; McBurney, Carl; Gellman, Samuel H.; Zwier, Timothy S.

Glutamine is vitally important to a class of neurodegenerative diseases called poly-glutamine (poly-Q) repeat diseases such as Huntington's Disease (HD). Recent studies have revealed a pathogenic pathway that proceeds through misfolding of poly-Q regions into characteristic β-turn/ β-hairpin structures that are highly correlated with toxicity. The inherent conformational preferences of small glutamine containing peptides (Ac-Q-Q-NHBn and Ac-A-Q-NHBn) were studied using conformation-specific IR and UV spectroscopies, with the goal of probing the delicate interplay between three competitive hydrogen bonding motifs: backbone-backbone, sidechain-backbone, and sidechain-sidechain hydrogen bonds. Laser desorption, coupled with a supersonic expansion, was used to introduce the non-thermally labile sample into the gas-phase. Resonant ion-dip infrared (RIDIR) spectroscopy is a powerful tool for recording the vibrational spectra of single conformational isomers and was used here to reveal the innate structural preferences of the glutamine containing peptides. Experimental results are compared against density functional calculations to arrive at firm conformational assignments. Our results demonstrate a striking preference for β-turn formation in the non-polar environment of the gas-phase. Previous Affiliation: Purdue University, Department of Chemistry.

The role of oxidative stress in Huntington's disease: are antioxidants good therapeutic candidates?

PubMed

Gil-Mohapel, Joana; Brocardo, Patricia S; Christie, Brian R

2014-04-01

Huntington's disease (HD) is the most common polyglutamine neurodegenerative disorder in humans, and is caused by a mutation of an unstable expansion of CAG repeats within the coding region of the HD gene, which expresses the protein huntingtin. Although abnormal protein is ubiquitously expressed throughout the organism, cell degeneration occurs mainly in the brain, and there, predominantly in the striatum and cortex. The mechanisms that account for this selective neuronal death are multifaceted in nature and several lines of evidence suggest that mitochondrial dysfunction, overproduction of reactive oxygen species (ROS) and oxidative stress (an imbalance between pro-oxidant and antioxidant systems resulting in oxidative damage to proteins, lipids and DNA) might play important roles. Over time, this can result in the death of the affected neuronal populations. In this review article we present an overview of the preclinical and clinical studies that have indicated a link between oxidative stress, neurodegeneration, and cell death in HD. We also discuss how changes in ROS production affect neuronal survival, highlighting the evidence for the use of antioxidants including essential fatty acids, coenzyme Q10, and creatine, as potential therapeutic strategies for the treatment of this devastating neurodegenerative disorder.
Reprint of: Early Behavioural Facilitation by Temporal Expectations in Complex Visual-motor Sequences.

PubMed

Heideman, Simone G; van Ede, Freek; Nobre, Anna C

2018-05-24

In daily life, temporal expectations may derive from incidental learning of recurring patterns of intervals. We investigated the incidental acquisition and utilisation of combined temporal-ordinal (spatial/effector) structure in complex visual-motor sequences using a modified version of a serial reaction time (SRT) task. In this task, not only the series of targets/responses, but also the series of intervals between subsequent targets was repeated across multiple presentations of the same sequence. Each participant completed three sessions. In the first session, only the repeating sequence was presented. During the second and third session, occasional probe blocks were presented, where a new (unlearned) spatial-temporal sequence was introduced. We first confirm that participants not only got faster over time, but that they were slower and less accurate during probe blocks, indicating that they incidentally learned the sequence structure. Having established a robust behavioural benefit induced by the repeating spatial-temporal sequence, we next addressed our central hypothesis that implicit temporal orienting (evoked by the learned temporal structure) would have the largest influence on performance for targets following short (as opposed to longer) intervals between temporally structured sequence elements, paralleling classical observations in tasks using explicit temporal cues. We found that indeed, reaction time differences between new and repeated sequences were largest for the short interval, compared to the medium and long intervals, and that this was the case, even when comparing late blocks (where the repeated sequence had been incidentally learned), to early blocks (where this sequence was still unfamiliar). 
We conclude that incidentally acquired temporal expectations that follow a sequential structure can have a robust facilitatory influence on visually-guided behavioural responses and that, like more explicit forms of temporal orienting, this effect is most pronounced for sequence elements that are expected at short inter-element intervals. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
Heterogeneity of the Epstein-Barr Virus (EBV) Major Internal Repeat Reveals Evolutionary Mechanisms of EBV and a Functional Defect in the Prototype EBV Strain B95-8.

PubMed

Ba Abdullah, Mohammed M; Palermo, Richard D; Palser, Anne L; Grayson, Nicholas E; Kellam, Paul; Correia, Samantha; Szymula, Agnieszka; White, Robert E

2017-12-01

Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified through both coevolution with its host and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging because of the large number and lengths of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat 1 of EBV (IR1; also known as the BamW repeats) for more than 70 strains. The diversity of the latency protein EBV nuclear antigen leader protein (EBNA-LP) resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 open reading frame (ORF) is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp) and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as from spontaneous mutation, with interstrain recombination being more common in tumor-derived viruses. This genetic exchange often incorporates regions of <1 kb, and allelic gene conversion changes the frequency of small regions within the repeat but not close to the flanks. These observations suggest that IR1 (and, by extension, EBV) diversifies through both recombination and breakpoint repair, while concerted evolution of IR1 is driven by gene conversion of small regions. Finally, the prototype EBV strain B95-8 contains four nonconsensus variants within a single IR1 repeat unit, including a stop codon in the EBNA-LP gene. Repairing IR1 improves EBNA-LP levels and the quality of transformation by the B95-8 bacterial artificial chromosome (BAC). IMPORTANCE Epstein-Barr virus (EBV) infects the majority of the world population but causes illness in only a small minority of people.
Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity to see if different strains have different disease impacts have excluded regions of repeating sequence, as they are more technically challenging. Here we analyze the sequence of the largest repeat in EBV (IR1). We first characterized the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and we suggest that tumor-associated viruses may be more likely to contain DNA mixed from two strains. The patterns of this mixing suggest that sequences can spread between strains (and also within the repeat) by copying sequence from another strain (or repeat unit) to repair DNA damage. Copyright © 2017 Ba abdullah et al.
Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

PubMed

Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

2016-03-01

The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread in genome sequences and play many important roles in plants. To reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analyzing twelve sequenced plant genomes. First, in the twelve plant genomes studied, the main SSRs were those with repeat units of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With increasing SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and as the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, as SSR repeat number increased within plant species, different species favored different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provides new insights for the analysis of plant genome evolution.
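The kind of tally described in this abstract (SSRs grouped by unit length, then the share of A/T-only units) can be illustrated with a minimal Python sketch. The abstract does not give the detection criteria used, so the minimum of five repeats, the restriction to perfect repeats, and the function names `find_ssrs` and `at_share` are all assumptions made for illustration.

```python
import re

def find_ssrs(seq, max_unit=3, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect SSRs with unit length 1..max_unit.

    min_repeats is an assumed threshold, not one taken from the study.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A unit of unit_len bases followed by at least min_repeats - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units such as "AA" or "AAA", which are really mononucleotide runs.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

def at_share(hits, unit_len):
    """Fraction of SSRs of the given unit length whose unit contains only A/T."""
    units = [u for _, u, _ in hits if len(u) == unit_len]
    if not units:
        return None
    return sum(set(u) <= {"A", "T"} for u in units) / len(units)

# Toy sequence (made up): an A run, an AT tract, and a CAG tract.
toy = "GGC" + "A" * 8 + "CGC" + "AT" * 6 + "CC" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy)
```

On a real genome the same two functions would be applied per chromosome and the shares compared across species; imperfect repeats would need a different detector.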
Translocation and gross deletion breakpoints in human inherited disease and cancer II: Potential involvement of repetitive sequence elements in secondary structure formation between DNA ends.

PubMed

Chuzhanova, Nadia; Abeysinghe, Shaun S; Krawczak, Michael; Cooper, David N

2003-09-01

Translocations and gross deletions are responsible for a significant proportion of both cancer and inherited disease. Although such gene rearrangements are nonuniformly distributed in the human genome, the underlying mutational mechanisms remain unclear. We have studied the potential involvement of various types of repetitive sequence elements in the formation of secondary structure intermediates between the single-stranded DNA ends that recombine during rearrangements. Complexity analysis was used to assess the potential of these ends to form secondary structures, the maximum decrease in complexity consequent to a gross rearrangement being used as an indicator of the type of repeat and the specific DNA ends involved. A total of 175 pairs of deletion/translocation breakpoint junction sequences available from the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] were analyzed. Potential secondary structure was noted between the 5' flanking sequence of the first breakpoint and the 3' flanking sequence of the second breakpoint in 49% of rearrangements and between the 5' flanking sequence of the second breakpoint and the 3' flanking sequence of the first breakpoint in 36% of rearrangements. Inverted repeats, inversions of inverted repeats, and symmetric elements were found in association with gross rearrangements at approximately the same frequency. However, inverted repeats and inversions of inverted repeats accounted for the vast majority (83%) of deletions plus small insertions, symmetric elements for one-half of all antigen receptor-mediated translocations, while direct repeats appear only to be involved in mediating simple deletions. These findings extend our understanding of illegitimate recombination by highlighting the importance of secondary structure formation between single-stranded DNA ends at breakpoint junctions. Copyright 2003 Wiley-Liss, Inc.
Perceived empty duration between sounds of different lengths: Possible relation with repetition and rhythmic grouping.

PubMed

Kuroda, Tsuyoshi; Tomimatsu, Erika; Grondin, Simon; Miyazaki, Makoto

2016-11-01

We investigated how perceived duration of empty time intervals would be modulated by the length of sounds marking those intervals. Three sounds were successively presented in Experiment 1. Each sound was short (S) or long (L), and the temporal position of the middle sound's onset was varied. The lengthening of each sound resulted in delayed perception of the onset; thus, the middle sound's onset had to be presented earlier in the SLS than in the LSL sequence so that participants perceived the three sounds as presented at equal interonset intervals. In Experiment 2, a short sound and a long sound were alternated repeatedly, and the relative duration of the SL interval to the LS interval was varied. This repeated sequence was perceived as consisting of equal interonset intervals when the onsets of all sounds were aligned at physically equal intervals. If the same onset delay as in the preceding experiment had occurred, participants should have perceived equality between the interonset intervals in the repeated sequence when the SL interval was physically shortened relative to the LS interval. The effects of sound length seemed to be canceled out when the presentation of intervals was repeated. Finally, the perceived duration of the interonset intervals in the repeated sequence was not influenced by whether the participant's native language was French or Japanese, or by how the repeated sequence was perceptually segmented into rhythmic groups.
Genetic and DNA sequence analysis of the kanamycin resistance transposon Tn903.

PubMed Central

Grindley, N D; Joyce, C M

1980-01-01

The kanamycin resistance transposon Tn903 consists of a unique region of about 1000 base pairs bounded by a pair of 1050-base-pair inverted repeat sequences. Each repeat contains two Pvu II endonuclease cleavage sites separated by 520 base pairs. We have constructed derivatives of Tn903 in which this 520-base-pair fragment is deleted from one or both repeats. Those derivatives that lack both 520-base-pair fragments cannot transpose, whereas those that lack just one remain transposition proficient. One such transposable derivative, Tn903 delta 1, has been selected for further study. We have determined the sequence of the intact inverted repeat. The 18 base pairs at each end are identical and inverted relative to one another, a structure characteristic of insertion sequences. Additional experiments indicate that a single inverted repeat from Tn903 can, in fact, transpose; we propose that this element be called IS903. To correlate the DNA sequence with genetic activities, we have created mutations by inserting a 10-base-pair DNA fragment at several sites within the intact repeat of Tn903 delta 1, and we have examined the effect of such insertions on transposability. The results suggest that IS903 encodes a 307-amino-acid polypeptide (a "transposase") that is absolutely required for transposition of IS903 or Tn903. PMID:6261245
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.

PubMed

Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru

2015-01-01

The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

PubMed Central

Ni, Xiangyang; Westpheling, Janet

1997-01-01

The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
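A perfect direct repeat of the kind described here is simply the same k-mer occurring twice in the same orientation without overlapping itself. The Python sketch below shows one way to scan for such repeats; the 12-bp unit and flanking bases are invented for the example and are not the chi63 promoter sequence, and `direct_repeats` is a hypothetical helper name.

```python
def direct_repeats(seq, k=12):
    """Return (kmer, first_start, later_start) for k-mers that recur without overlap."""
    seq = seq.upper()
    first_seen = {}
    found = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        first = first_seen.setdefault(kmer, i)
        # Require the later copy to begin at or after the end of the first copy.
        if i - first >= k:
            found.append((kmer, first, i))
    return found

# Toy promoter fragment (made up): two copies of a 12-bp unit separated by 2 bp.
unit = "TGGTCCAGACCT"
fragment = "GGA" + unit + "CT" + unit + "TTC"
repeats = direct_repeats(fragment)
```

Point mutations such as those analysed in the study would break the perfect match, so a scan for near-identical copies would need a mismatch tolerance added.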
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

PubMed

Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
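The single linkage clustering step can be pictured as follows: any two sequences joined by a pairwise alignment go in the same family, and families merge whenever a chain of alignments connects them. The Python sketch below uses a union-find structure to do this. It is an illustration of the general technique, not the CARP implementation, and the interval identifiers are hypothetical.

```python
def single_linkage(pairs):
    """Group items into families connected by chains of pairwise hits."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for item in list(parent):
        families.setdefault(find(item), set()).add(item)
    # Sort members and families so the output is deterministic.
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])

# Hypothetical pairwise alignments between genomic intervals s1..s5.
alignments = [("s1", "s2"), ("s2", "s3"), ("s4", "s5")]
families = single_linkage(alignments)
```

Note that s1 and s3 end up in one family even though they were never aligned directly, which is the defining property (and the main weakness) of single linkage.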
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies

PubMed Central

Zeng, Lu; Kortschak, R. Daniel; Raison, Joy M.

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441
Origin of the CMS gene locus in rapeseed cybrid mitochondria: active and inactive recombination produces the complex CMS gene region in the mitochondrial genomes of Brassicaceae.

PubMed

Oshima, Masao; Kikuchi, Rie; Imamura, Jun; Handa, Hirokazu

2010-01-01

CMS (cytoplasmic male sterile) rapeseed is produced by asymmetrical somatic cell fusion between the Brassica napus cv. Westar and the Raphanus sativus Kosena CMS line (Kosena radish). The CMS rapeseed contains a CMS gene, orf125, which is derived from Kosena radish. Our sequence analyses revealed that the orf125 region in CMS rapeseed originated from recombination between the orf125/orfB region and the nad1C/ccmFN1 region by way of a 63 bp repeat. A precise sequence comparison among the related sequences in CMS rapeseed, Kosena radish and normal rapeseed showed that the orf125 region in CMS rapeseed consisted of the Kosena orf125/orfB region and the rapeseed nad1C/ccmFN1 region, even though Kosena radish had both the orf125/orfB region and the nad1C/ccmFN1 region in its mitochondrial genome. We also identified three tandem repeat sequences in the regions surrounding orf125, including a 63 bp repeat, which were involved in several recombination events. Interestingly, differences in the recombination activity for each repeat sequence were observed, even though these sequences were located adjacent to each other in the mitochondrial genome. We report results indicating that recombination events within the mitochondrial genomes are regulated at the level of specific repeat sequences depending on the cellular environment.
Analysis of Two Cosmid Clones from Chromosome 4 of Drosophila melanogaster Reveals Two New Genes Amid an Unusual Arrangement of Repeated Sequences

PubMed Central

Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross

1999-01-01

Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978
Experimental definition of a clustered regularly interspaced short palindromic duplicon in Escherichia coli.

PubMed

Goren, Moran G; Yosef, Ido; Auster, Oren; Qimron, Udi

2012-10-12

We analyzed sequences of newly inserted repeats in an Escherichia coli CRISPR (clustered regularly interspaced short palindromic repeats) array in vivo and showed that a base previously thought to belong to the repeat is actually derived from a protospacer. Based on further experimental results, we propose to use the term "duplicon" for a repeated sequence in a CRISPR array that serves as a template for a new duplicon. Our findings suggest the possibility of redrawing the borders between repeats, spacers, and protospacer adjacent motifs. Copyright © 2012 Elsevier Ltd. All rights reserved.
Phylogeny and strain typing of Escherichia coli, inferred from variation at mononucleotide repeat loci.

PubMed

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M; Kashi, Yechezkel

2004-04-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria.
Phylogeny and Strain Typing of Escherichia coli, Inferred from Variation at Mononucleotide Repeat Loci

PubMed Central

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M.; Kashi, Yechezkel

2004-01-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria. PMID:15066845
IGF-1: elixir for motor neuron diseases.

PubMed

Papanikolaou, Theodora; Ellerby, Lisa M

2009-08-13

Modulation of testosterone levels is a therapeutic approach for spinal and bulbar muscular atrophy (SBMA), a polyglutamine disorder that affects the motor neurons. The article by Palazzolo et al. in this issue of Neuron provides compelling evidence that the expression of insulin growth hormone is a potential therapeutic for SBMA.
Lack of huntingtin promotes neural stem cells differentiation into glial cells while neurons expressing huntingtin with expanded polyglutamine tracts undergo cell death.

PubMed

Conforti, Paola; Camnasio, Stefano; Mutti, Cesare; Valenza, Marta; Thompson, Morgan; Fossale, Elisa; Zeitlin, Scott; MacDonald, Marcy E; Zuccato, Chiara; Cattaneo, Elena

2013-02-01

Huntington's disease (HD) is a neurodegenerative disorder that affects muscle coordination and diminishes cognitive abilities. The genetic basis of the disease is an expansion of CAG repeats in the Huntingtin (Htt) gene. Here we aimed to generate a series of mouse neural stem (NS) cell lines that carried varying numbers of CAG repeats in the mouse Htt gene (Hdh CAG knock-in NS cells) or that had Hdh null alleles (Hdh knock-out NS cells). Towards this end, Hdh CAG knock-in mouse ES cell lines that carried an Htt gene with 20, 50, 111, or 140 CAG repeats or that were Htt null were neuralized and converted into self-renewing NS cells. The resulting NS cell lines were immunopositive for the neural stem cell markers NESTIN, SOX2, and BLBP and had similar proliferative rates and cell cycle distributions. After 14 days in vitro, wild-type NS cells gave rise to cultures composed of 70% MAP2(+) neurons and 30% GFAP(+) astrocytes. In contrast, NS cells with expanded CAG repeats underwent neuronal cell death, with only 38%±15% of the MAP2(+) cells remaining at the end of the differentiation period. Cell death was verified by increased caspase 3/7 activity on day 14 of the neuronal differentiation protocol. Interestingly, Hdh knock-out NS cells treated using the same neuronal differentiation protocol showed a dramatic increase in the number of GFAP(+) cells on day 14 (61%±20% versus 24%±10% in controls), and a massive decrease of MAP2(+) neurons (30%±11% versus 64%±17% in controls). Both Hdh CAG knock-in NS cells and Hdh knock-out NS cells showed reduced levels of Bdnf mRNA during neuronal differentiation, in agreement with data obtained previously in HD mouse models and in post-mortem brain samples from HD patients. We concluded that Hdh CAG knock-in and Hdh knock-out NS cells have potential as tools for investigating the roles of normal and mutant HTT in differentiated neurons and glial cells of the brain. Copyright © 2012 Elsevier Inc. All rights reserved.

Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

PubMed

Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

1992-02-01

The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 × 10^5 copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.
“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

PubMed Central

2014-01-01

Background Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results We have developed a perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. 
In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
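As an illustration of the kind of parsing described in this abstract, the following Python sketch reads the annotation rows of a RepeatMasker .out file and sums the annotated bases per class/family. It is not the authors' Perl tool: the function names and the toy input are hypothetical, overlapping hits are not merged, and fragments are not reassembled into copies, which is the harder problem the abstract addresses.

```python
from collections import defaultdict

def parse_rm_out(lines):
    """Yield one dict per annotation row of a RepeatMasker .out file."""
    for line in lines:
        fields = line.split()
        # Header and blank lines are skipped: data rows start with an integer score.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        yield {
            "score": int(fields[0]),
            "divergence": float(fields[1]),
            "query": fields[4],
            "start": int(fields[5]),
            "end": int(fields[6]),
            # RepeatMasker writes "C" for hits on the complementary strand.
            "strand": "-" if fields[8] == "C" else "+",
            "repeat": fields[9],
            "family": fields[10],
        }

def covered_bases_by_family(hits):
    """Sum hit lengths per class/family (overlaps are NOT merged in this sketch)."""
    totals = defaultdict(int)
    for hit in hits:
        totals[hit["family"]] += hit["end"] - hit["start"] + 1
    return dict(totals)

# Toy input made up for illustration; not real annotation data.
SAMPLE = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin  end (left)  ID

  463   1.3  0.6  1.7  chr1   101   250 (9750) +  DNAREP1   DNA/Helitron   1   150  (50)  1
  239  12.0  0.0  0.0  chr1   400   499 (9501) C  ROO_LTR   LTR/Pao     (0)   428   329  2
  250  10.0  0.0  0.0  chr1   600   649 (9351) C  ROO_I     LTR/Pao   (100)  5000  4951  3
"""

totals = covered_bases_by_family(parse_rm_out(SAMPLE.splitlines()))
print(totals)
```

Counting hits per family this way overestimates copy number whenever one element is split into several hits, which is why a dedicated post-processing step is needed.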
Effects of "D"-Amphetamine and Ethanol on Variable and Repetitive Key-Peck Sequences in Pigeons

ERIC Educational Resources Information Center

Ward, Ryan D.; Bailey, Ericka M.; Odum, Amy L.

2006-01-01

This experiment assessed the effects of "d"-Amphetamine and ethanol on reinforced variable and repetitive key-peck sequences in pigeons. Pigeons responded on two keys under a multiple schedule of Repeat and Vary components. In the Repeat component, completion of a target sequence of right, right, left, left resulted in food. In the Vary component,…
The mitochondrial genome of the legume Vigna radiata and the analysis of recombination across short mitochondrial repeats.

PubMed

Alverson, Andrew J; Zhuo, Shi; Rice, Danny W; Sloan, Daniel B; Palmer, Jeffrey D

2011-01-20

The mitochondrial genomes of seed plants are exceptionally fluid in size, structure, and sequence content, with the accumulation and activity of repetitive sequences underlying much of this variation. We report the first fully sequenced mitochondrial genome of a legume, Vigna radiata (mung bean), and show that despite its unexceptional size (401,262 nt), the genome is unusually depauperate in repetitive DNA and "promiscuous" sequences from the chloroplast and nuclear genomes. Although Vigna lacks the large, recombinationally active repeats typical of most other seed plants, a PCR survey of its modest repertoire of short (38-297 nt) repeats nevertheless revealed evidence for recombination across all of them. A set of novel control assays showed, however, that these results could instead reflect, in part or entirely, artifacts of PCR-mediated recombination. Consequently, we recommend that other methods, especially high-depth genome sequencing, be used instead of PCR to infer patterns of plant mitochondrial recombination. The average-sized but repeat- and feature-poor mitochondrial genome of Vigna makes it ever more difficult to generalize about the factors shaping the size and sequence content of plant mitochondrial genomes.
Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
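The "divergence from the calculated consensus" mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the monomers are already aligned and of equal length, uses a simple majority-rule consensus, and runs on made-up 10-nt sequences; the abstract does not state how the authors computed their consensus, so this is only one plausible reading.

```python
from collections import Counter

def consensus(monomers):
    """Majority-rule consensus of equal-length, pre-aligned monomers."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*monomers))

def mean_divergence(monomers):
    """Fraction of positions, over all monomers, that differ from the consensus."""
    cons = consensus(monomers)
    mismatches = sum(a != b for monomer in monomers for a, b in zip(monomer, cons))
    return mismatches / (len(monomers) * len(cons))

# Hypothetical toy monomers, not data from the study.
toy_monomers = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",  # one variant position
    "ACGTACGAAC",  # one variant position
]
print(consensus(toy_monomers), mean_divergence(toy_monomers))
```

Here two mismatches over 40 compared positions give a mean divergence of 5%; the study reports about 2% for the real 142-nt monomers.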
Functional characterization of rare FOXP2 variants in neurodevelopmental disorder.

PubMed

Estruch, Sara B; Graham, Sarah A; Chinnappa, Swathi M; Deriziotis, Pelagia; Fisher, Simon E

2016-01-01

Heterozygous disruption of FOXP2 causes a rare form of speech and language impairment. Screens of the FOXP2 sequence in individuals with speech/language-related disorders have identified several rare protein-altering variants, but their phenotypic relevance is often unclear. FOXP2 encodes a transcription factor with a forkhead box DNA-binding domain, but little is known about the functions of protein regions outside this domain. We performed detailed functional analyses of seven rare FOXP2 variants found in affected cases, including three which have not been previously characterized, testing intracellular localization, transcriptional regulation, dimerization, and interaction with other proteins. To shed further light on molecular functions of FOXP2, we characterized the interaction between this transcription factor and co-repressor proteins of the C-terminal binding protein (CTBP) family. Finally, we analysed the functional significance of the polyglutamine tracts in FOXP2, since tract length variations have been reported in cases of neurodevelopmental disorder. We confirmed etiological roles of multiple FOXP2 variants. Of three variants that have been suggested to cause speech/language disorder, but never before been characterized, only one showed functional effects. For the other two, we found no effects on protein function in any assays, suggesting that they are incidental to the phenotype. We identified a CTBP-binding region within the N-terminal portion of FOXP2. This region includes two amino acid substitutions that occurred on the human lineage following the split from chimpanzees. However, we did not observe any effects of these amino acid changes on CTBP binding or other core aspects of FOXP2 function. 
Finally, we found that FOXP2 variants with reduced polyglutamine tracts did not exhibit altered behaviour in cellular assays, indicating that such tracts are non-essential for core aspects of FOXP2 function, and that tract variation is unlikely to be a highly penetrant cause of speech/language disorder. Our findings highlight the importance of functional characterization of novel rare variants in FOXP2 in assessing the contribution of such variants to speech/language disorder and provide further insights into the molecular function of the FOXP2 protein.
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

PubMed

Militello, Kevin T; Lazatin, Justine C

2017-05-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
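The spacer-extraction step described in this abstract (taking the regions between consecutive repeats) can be sketched in a few lines of Python. The sketch assumes exact, non-overlapping copies of a known repeat; real CRISPR repeats often carry small variations, so an actual analysis would need approximate matching. The repeat and locus below are invented toy strings, not E. coli sequences.

```python
def extract_spacers(locus, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    positions = []
    start = locus.find(repeat)
    while start != -1:
        positions.append(start)
        # Resume the search after the current copy so copies never overlap.
        start = locus.find(repeat, start + len(repeat))
    return [
        locus[left + len(repeat):right]
        for left, right in zip(positions, positions[1:])
    ]

# Hypothetical toy data for illustration only.
TOY_REPEAT = "GTTCCCCGCG"
toy_locus = ("AAAA" + TOY_REPEAT + "ACGTACGTAC" + TOY_REPEAT
             + "TTGACCTGAA" + TOY_REPEAT + "GGG")
print(extract_spacers(toy_locus, TOY_REPEAT))
```

Each returned spacer could then be submitted to a BLASTN search, as the students did in the exercise.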
Sequence investigation of 34 forensic autosomal STRs with massively parallel sequencing.

PubMed

Zhang, Suhua; Niu, Yong; Bian, Yingnan; Dong, Rixia; Liu, Xiling; Bao, Yun; Jin, Chao; Zheng, Hancheng; Li, Chengtao

2018-05-01

STRs vary not only in the length of the repeat units and the number of repeats but also in the region with which they conform to an incremental repeat pattern. Massively parallel sequencing (MPS) offers new possibilities in the analysis of STRs since it can simultaneously sequence multiple targets in a single reaction and capture potential internal sequence variations. Here, we sequenced 34 STRs applied in the forensic community of China with a custom-designed panel. MPS performance was evaluated by sequencing-read analysis, a concordance study and sensitivity testing. High-coverage sequencing data were obtained to determine the constitute ratios and heterozygous balance. No actual inconsistent genotypes were observed between capillary electrophoresis (CE) and MPS, demonstrating the reliability of the panel and the MPS technology. With the sequencing data from the 200 investigated individuals, 346 and 418 alleles were obtained via CE and MPS technologies, respectively, at the 34 STRs, indicating that MPS technology provides higher discrimination than CE detection. The whole study demonstrated that STR genotyping with the custom panel and MPS technology has the potential not only to reveal length and sequence variations but also to satisfy the demands of high throughput and high multiplexing with acceptable sensitivity.
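Why sequencing can yield more alleles than fragment-length analysis (418 versus 346 in this study) can be shown with a toy Python example: two alleles of identical length are indistinguishable by CE but differ once an internal sequence variant is read. The function name and the three sequences are hypothetical and serve only to illustrate the counting.

```python
def count_alleles(observed):
    """Return (alleles distinguishable by length, alleles distinguishable by sequence)."""
    by_length = {len(seq) for seq in observed}   # what a length-based assay resolves
    by_sequence = set(observed)                  # what sequencing resolves
    return len(by_length), len(by_sequence)

# Hypothetical repeat regions at one tetranucleotide locus.
toy_alleles = [
    "TCTA" * 10,                      # 10 perfect repeats
    "TCTA" * 4 + "TCTG" + "TCTA" * 5, # same length, one internal variant unit
    "TCTA" * 11,                      # 11 perfect repeats
]
print(count_alleles(toy_alleles))
```

The first two alleles are both 40 nt long, so a length-based assay sees two alleles where sequencing sees three.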
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a Java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including PubMed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence

PubMed Central

2017-01-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Repeating aftershocks of the great 2004 Sumatra and 2005 Nias earthquakes

NASA Astrophysics Data System (ADS)

Yu, Wen-che; Song, Teh-Ru Alex; Silver, Paul G.

2013-05-01

We investigate repeating aftershocks associated with the great 2004 Sumatra-Andaman (Mw 9.2) and 2005 Nias-Simeulue (Mw 8.6) earthquakes by cross-correlating waveforms recorded by the regional seismographic station PSI and teleseismic stations. We identify 10 and 18 correlated aftershock sequences associated with the great 2004 Sumatra and 2005 Nias earthquakes, respectively. The majority of the correlated aftershock sequences are located near the down-dip end of a large afterslip patch. We determine the precise relative locations of event pairs among these sequences and estimate the source rupture areas. The correlated event pairs identified are appropriately referred to as repeating aftershocks, in that the source rupture areas are comparable and significantly overlap within a sequence. We use the repeating aftershocks to estimate afterslip based on the slip-seismic moment scaling relationship and to infer the temporal decay rate of the recurrence interval. The estimated afterslip resembles that measured from the near-field geodetic data to the first order. The decay rate of repeating aftershocks as a function of lapse time t follows a power-law decay 1/t^p with the exponent p in the range 0.8-1.1. Both types of observations indicate that repeating aftershocks are governed by post-seismic afterslip.
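Waveform cross-correlation, the technique this abstract uses to find candidate repeating events, can be sketched in plain Python as the maximum normalized cross-correlation over all lags. This is a generic textbook formulation, not the authors' processing chain; the synthetic wavelets, function names, and any threshold one might apply to the result are assumptions made for illustration.

```python
import math

def normalize(trace):
    """Remove the mean and scale the trace to unit energy."""
    mean = sum(trace) / len(trace)
    centered = [v - mean for v in trace]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]

def max_cross_correlation(a, b):
    """Largest normalized cross-correlation of two equal-length traces over all lags."""
    a, b = normalize(a), normalize(b)
    n = len(a)
    best = -1.0
    for lag in range(-(n - 1), n):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += a[i] * b[j]
        best = max(best, total)
    return best

def wavelet(shift, period, n=200):
    """Synthetic Gaussian-windowed sine burst (hypothetical stand-in for a seismogram)."""
    return [
        math.exp(-((i - 100 - shift) / 10) ** 2)
        * math.sin(2 * math.pi * (i - 100 - shift) / period)
        for i in range(n)
    ]

event_a = wavelet(shift=0, period=10)
event_b = wavelet(shift=15, period=10)   # same waveform, arriving 15 samples later
other = wavelet(shift=0, period=23)      # different waveform shape

print(max_cross_correlation(event_a, event_b))
print(max_cross_correlation(event_a, other))
```

A pair from the same source scores close to 1 regardless of the arrival-time offset, while the dissimilar waveform scores markedly lower.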
Genome-Wide Stochastic Adaptive DNA Amplification at Direct and Inverted DNA Repeats in the Parasite Leishmania

PubMed Central

Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

2014-01-01

Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment. PMID:24844805
Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population.

PubMed

Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao

2018-05-01

Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We have analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to either the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we have also selected 218 STR loci with four-basepair repeat units and 53 loci with five-basepair repeat units that are suitable for both short-read sequencing and PCR-based technologies and would be candidates for forensic DNA typing in the Japanese population.
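To make "STR loci with two to six basepair repeat units" concrete, here is a small Python sketch that scans a sequence for perfect tandem repeats with a regular expression. It is not the bioinformatics tool used in the study (which the abstract does not name); the function, its default thresholds, and the toy sequence are assumptions, and imperfect or interrupted repeats are ignored.

```python
import re

def find_strs(seq, min_unit=2, max_unit=6, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat found in seq."""
    # The lazy quantifier prefers the shortest unit that explains a run.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs, e.g. AAAAAA reported as unit "AA"
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Hypothetical toy sequence: a (CAG)5 tract and an (AT)6 tract.
toy_sequence = "TTGC" + "CAG" * 5 + "TTGA" + "AT" * 6 + "GG"
print(find_strs(toy_sequence))
```

Genotyping a locus from short reads additionally requires reads that span the whole repeat plus flanking sequence, which is one reason some loci in the study had low call rates.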
Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh

PubMed Central

Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas

2009-01-01

Background and Aims Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes. 
Large intercalary telomere-like arrays suggest chromosomal rearrangements in D. aphyllum while the number and localization of rRNA genes as well as the species-specific distribution pattern of an abundant microsatellite reflect the genomic diversity of the three Dendrobium species. PMID:19635741
Impaired ERAD and ER stress are early and specific events in polyglutamine toxicity

PubMed Central

Duennwald, Martin L.; Lindquist, Susan

2008-01-01

Protein misfolding, whether caused by aging, environmental factors, or genetic mutations, is a common basis for neurodegenerative diseases. The misfolding of proteins with abnormally long polyglutamine (polyQ) expansions causes several neurodegenerative disorders, such as Huntington’s disease (HD). Although many cellular pathways have been documented to be impaired in HD, the primary triggers of polyQ toxicity remain elusive. We report that yeast cells and neuron-like PC12 cells expressing polyQ-expanded huntingtin (htt) fragments display a surprisingly specific, immediate, and drastic defect in endoplasmic reticulum (ER)-associated degradation (ERAD). We further decipher the mechanistic basis for this defect in ERAD: the entrapment of the essential ERAD proteins Npl4, Ufd1, and p97 by polyQ-expanded htt fragments. In both yeast and mammalian neuron-like cells, overexpression of Npl4 and Ufd1 ameliorates polyQ toxicity. Our results establish that impaired ER protein homeostasis is a broad and highly conserved contributor to polyQ toxicity in yeast, in PC12 cells, and, importantly, in striatal cells expressing full-length polyQ-expanded huntingtin. PMID:19015277
Tadpole-like Conformations of Huntingtin Exon 1 Are Characterized by Conformational Heterogeneity that Persists regardless of Polyglutamine Length.

PubMed

Newcombe, Estella A; Ruff, Kiersten M; Sethi, Ashish; Ormsby, Angelique R; Ramdzan, Yasmin M; Fox, Archa; Purcell, Anthony W; Gooley, Paul R; Pappu, Rohit V; Hatters, Danny M

2018-05-11

Soluble huntingtin exon 1 (Httex1) with expanded polyglutamine (polyQ) engenders neurotoxicity in Huntington's disease. To uncover the physical basis of this toxicity, we performed structural studies of soluble Httex1 for wild-type and mutant polyQ lengths. Nuclear magnetic resonance experiments show evidence for conformational rigidity across the polyQ region. In contrast, hydrogen-deuterium exchange shows absence of backbone amide protection, suggesting negligible persistence of hydrogen bonds. The seemingly conflicting results are explained by all-atom simulations, which show that Httex1 adopts tadpole-like structures with a globular head encompassing the N-terminal amphipathic and polyQ regions and the tail encompassing the C-terminal proline-rich region. The surface area of the globular domain increases monotonically with polyQ length. This stimulates sharp increases in gain-of-function interactions in cells for expanded polyQ, and one of these interactions is with the stress-granule protein Fus. Our results highlight plausible connections between Httex1 structure and routes to neurotoxicity. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Interaction with Polyglutamine-expanded Huntingtin Alters Cellular Distribution and RNA Processing of Huntingtin Yeast Two-hybrid Protein A (HYPA)

PubMed Central

Jiang, Ya-Jun; Che, Mei-Xia; Yuan, Jin-Qiao; Xie, Yuan-Yuan; Yan, Xian-Zhong; Hu, Hong-Yu

2011-01-01

Huntington disease (HD) is an autosomal inherited disorder that causes the deterioration of brain cells. The polyglutamine (polyQ) expansion of huntingtin (Htt) is implicated in the pathogenesis of HD via interaction with an RNA splicing factor, Htt yeast two-hybrid protein A/formin-binding protein 11 (HYPA/FBP11). Besides the pathogenic polyQ expansion, Htt also contains a proline-rich region (PRR) located immediately C-terminal to the polyQ tract. However, how the polyQ expansion influences the PRR-mediated protein interaction and how this abnormal interaction leads to the biological consequence remain elusive. Our NMR structural analysis indicates that the PRR motif of Htt cooperatively interacts with the tandem WW domains of HYPA through a domain-chaperoning effect of WW1 on WW2. The polyQ-expanded Htt sequesters HYPA to the cytosolic location and then significantly reduces the efficiency of pre-mRNA splicing. We propose that the toxic gain-of-function of the polyQ-expanded Htt that causes dysfunction of cellular RNA processing contributes to the pathogenesis of HD. PMID:21566141
Interaction with polyglutamine-expanded huntingtin alters cellular distribution and RNA processing of huntingtin yeast two-hybrid protein A (HYPA).

PubMed

Jiang, Ya-Jun; Che, Mei-Xia; Yuan, Jin-Qiao; Xie, Yuan-Yuan; Yan, Xian-Zhong; Hu, Hong-Yu

2011-07-15

Huntington disease (HD) is an autosomal inherited disorder that causes the deterioration of brain cells. The polyglutamine (polyQ) expansion of huntingtin (Htt) is implicated in the pathogenesis of HD via interaction with an RNA splicing factor, Htt yeast two-hybrid protein A/formin-binding protein 11 (HYPA/FBP11). Besides the pathogenic polyQ expansion, Htt also contains a proline-rich region (PRR) located immediately C-terminal to the polyQ tract. However, how the polyQ expansion influences the PRR-mediated protein interaction and how this abnormal interaction leads to the biological consequence remain elusive. Our NMR structural analysis indicates that the PRR motif of Htt cooperatively interacts with the tandem WW domains of HYPA through a domain-chaperoning effect of WW1 on WW2. The polyQ-expanded Htt sequesters HYPA to the cytosolic location and then significantly reduces the efficiency of pre-mRNA splicing. We propose that the toxic gain-of-function of the polyQ-expanded Htt that causes dysfunction of cellular RNA processing contributes to the pathogenesis of HD.

Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae.

PubMed

Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R

2006-12-01

Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.
Correlation between fibroin amino acid sequence and physical silk properties.

PubMed

Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

2003-09-12

The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of the GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite their different repeat structure, the silks of G. mellonella and E. kuehniella exhibit tensile strength similar to that of the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.
Evidence for Long-Timescale Patterns of Synaptic Inputs in CA1 of Awake Behaving Mice.

PubMed

Kolb, Ilya; Talei Franzesi, Giovanni; Wang, Michael; Kodandaramaiah, Suhasa B; Forest, Craig R; Boyden, Edward S; Singer, Annabelle C

2018-02-14

Repeated sequences of neural activity are a pervasive feature of neural networks in vivo and in vitro. In the hippocampus, sequential firing of many neurons over periods of 100-300 ms reoccurs during behavior and during periods of quiescence. However, it is not known whether the hippocampus produces longer sequences of activity or whether such sequences are restricted to specific network states. Furthermore, whether long repeated patterns of activity are transmitted to single cells downstream is unclear. To answer these questions, we recorded intracellularly from hippocampal CA1 of awake, behaving male mice to examine both subthreshold activity and spiking output in single neurons. In eight of nine recordings, we discovered long (900 ms) reoccurring subthreshold fluctuations or "repeats." Repeats generally were high-amplitude, nonoscillatory events reoccurring with 10 ms precision. Using statistical controls, we determined that repeats occurred more often than would be expected from unstructured network activity (e.g., by chance). Most spikes occurred during a repeat, and when a repeat contained a spike, the spike reoccurred with precision on the order of ≤20 ms, showing that long repeated patterns of subthreshold activity are strongly connected to spike output. Unexpectedly, we found that repeats occurred independently of classic hippocampal network states like theta oscillations or sharp-wave ripples. Together, these results reveal surprisingly long patterns of repeated activity in the hippocampal network that occur nonstochastically, are transmitted to single downstream neurons, and strongly shape their output. This suggests that the timescale of information transmission in the hippocampal network is much longer than previously thought. SIGNIFICANCE STATEMENT We found long (≥900 ms), repeated, subthreshold patterns of activity in CA1 of awake, behaving mice. These repeated patterns ("repeats") occurred more often than expected by chance and with 10 ms precision. Most spikes occurred within repeats and reoccurred with a precision on the order of 20 ms. Surprisingly, there was no correlation between repeat occurrence and classical network states such as theta oscillations and sharp-wave ripples. These results provide strong evidence that long patterns of activity are repeated and transmitted to downstream neurons, suggesting that the hippocampus can generate longer sequences of repeated activity than previously thought. Copyright © 2018 the authors.
CRF: detection of CRISPR arrays using random forest.

PubMed

Wang, Kai; Liang, Chun

2017-01-01

CRISPRs (clustered regularly interspaced short palindromic repeats) are particular repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we developed a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, which enhances detection accuracy. In particular, triplet elements that combine both sequence content and structure information were extracted from CRISPR repeats for classifier training. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for robust data visualization that is not available among other CRISPR detection tools. After detection, the query sequence, CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be displayed for visual examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
Expanded complexity of unstable repeat diseases

PubMed Central

Polak, Urszula; McIvor, Elizabeth; Dent, Sharon Y.R.; Wells, Robert D.; Napierala, Marek

2015-01-01

Unstable Repeat Diseases (URDs) share a common mutational phenomenon of changes in the copy number of short, tandemly repeated DNA sequences. More than 20 human neurological diseases are caused by instability, predominantly expansion, of microsatellite sequences. Changes in the repeat size initiate a cascade of pathological processes, frequently characteristic of a unique disease or a small subgroup of the URDs. Understanding of both the mechanism of repeat instability and molecular consequences of the repeat expansions is critical to developing successful therapies for these diseases. Recent technological breakthroughs in whole genome, transcriptome and proteome analyses will almost certainly lead to new discoveries regarding the mechanisms of repeat instability, the pathogenesis of URDs, and will facilitate development of novel therapeutic approaches. The aim of this review is to give a general overview of unstable repeats diseases, highlight the complexities of these diseases, and feature the emerging discoveries in the field. PMID:23233240
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Changes in the folding landscape of the WW domain provide a molecular mechanism for an inherited genetic syndrome

PubMed Central

Pucheta-Martinez, Encarna; D’Amelio, Nicola; Lelli, Moreno; Martinez-Torrecuadrada, Jorge L.; Sudol, Marius; Saladino, Giorgio; Gervasio, Francesco Luigi

2016-01-01

WW domains are small domains present in many human proteins with a wide array of functions and acting through the recognition of proline-rich sequences. The WW domain belonging to polyglutamine tract-binding protein 1 (PQBP1) is of particular interest due to its direct involvement in several X chromosome-linked intellectual disabilities, including Golabi-Ito-Hall (GIH) syndrome, where a single point mutation (Y65C) correlates with the development of the disease. The mutant cannot bind to its natural ligand WBP11, which regulates mRNA processing. In this work we use high-field high-resolution NMR and enhanced sampling molecular dynamics simulations to gain insight into the molecular causes of the disease. We find that the wild-type protein is partially unfolded, exchanging among multiple beta-strand-like conformations in solution. The Y65C mutation further destabilizes the residual fold and primes the protein for the formation of a disulphide bridge, which could be at the origin of the loss of function. PMID:27456546
Characterization of species-specific repeated DNA sequences from B. nigra.

PubMed

Gupta, V; Lakshmisita, G; Shaila, M S; Jagannathan, V; Lakshmikumaran, M S

1992-07-01

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.
[Convergent origin of repeats in genes coding for globular proteins. An analysis of the factors determining the presence of inverted and symmetrical repeats].

PubMed

Solov'ev, V V; Kel', A E; Kolchanov, N A

1989-01-01

The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. The analysis of symmetrical repeats revealed an interesting property of the genetic code: pairs of symmetrical codons correspond to pairs of amino acids with mostly similar physical-chemical parameters. This property may explain why symmetrical repeats and palindromes are present only in genes coding for beta-structural proteins-polypeptides, where amino acids with similar physical-chemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences was used to analyse inverted repeats. The modelling demonstrated that a constraint on the sequences alone (uneven frequencies of codon usage) is enough for nonrandom inverted repeats to arise in genes.
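The codon-pairing observation in this abstract can be explored with a small sketch. It assumes that a "symmetrical codon" means the same three bases read in reverse order on the same strand (an interpretation, not something the abstract states), and it uses the standard genetic code. The names `CODON_TABLE` and `mirror_pairs` are hypothetical, and the sketch only lists the amino acid pairs; it does not score their physical-chemical similarity.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def mirror_pairs():
    """List (codon, reversed codon, amino acid, amino acid of reversed codon).

    Codons that read the same in both directions (first base == third base)
    are skipped, and each pair is reported once.
    """
    pairs = []
    for codon, aa in CODON_TABLE.items():
        mirrored = codon[::-1]
        if codon < mirrored:  # keep one orientation of each pair
            pairs.append((codon, mirrored, aa, CODON_TABLE[mirrored]))
    return sorted(pairs)


if __name__ == "__main__":
    for codon, mirrored, aa, mirrored_aa in mirror_pairs():
        print(f"{codon} ({aa}) <-> {mirrored} ({mirrored_aa})")
```

Comparing the two amino acid columns against a hydrophobicity or volume scale would be the natural next step, but the choice of scale is left open here because the abstract does not name one.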
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

PubMed Central

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-01-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163
ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

PubMed Central

Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

2004-01-01

We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
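The abstract does not describe the ACMES hash function itself, so the following is only a generic illustration of hash-based repeat searching: every substring of length `k` is used as a key in a Python dictionary (a hash table), and keys seen more than once are reported with their locations. The function name `find_repeats` and the toy sequences are hypothetical.

```python
from collections import defaultdict


def find_repeats(sequences, k):
    """Return every k-mer that occurs more than once across the sequences.

    `sequences` maps a sequence name (for example a species) to a DNA string.
    The result maps each repeated k-mer to a list of (name, position) hits,
    with 0-based positions.
    """
    index = defaultdict(list)
    for name, seq in sequences.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return {kmer: hits for kmer, hits in index.items() if len(hits) > 1}


if __name__ == "__main__":
    genomes = {"a": "ACGTACGT", "b": "TTACGA"}
    for kmer, hits in find_repeats(genomes, 4).items():
        print(kmer, hits)
```

Hits that carry different sequence names indicate a k-mer shared between inputs, which is the simplest form of the cross-species comparison the abstract mentions. Storing every k-mer costs memory proportional to the total sequence length, so a production tool would need a more compact index.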
Genetic characterization of UCS region of Pneumocystis jirovecii and construction of allelic profiles of Indian isolates based on sequence typing at three regions.

PubMed

Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Kumar, Lalit; Luthra, Kalpana; Agarwal, Sanjay Kumar; Sreenivas, Vishnubhatla

2013-01-01

Pneumocystis jirovecii is an opportunistic pathogen that causes severe pneumonia in immunocompromised patients. To study the genetic diversity of P. jirovecii in India the upstream conserved sequence (UCS) region of Pneumocystis genome was amplified, sequenced and genotyped from a set of respiratory specimens obtained from 50 patients with a positive result for nested mitochondrial large subunit ribosomal RNA (mtLSU rRNA) PCR during the years 2005-2008. Of these 50 cases, 45 showed a positive PCR for UCS region. Variations in the tandem repeats in UCS region were characterized by sequencing all the positive cases. Of the 45 cases, one case showed five repeats, 11 cases showed four repeats, 29 cases showed three repeats and four cases showed two repeats. By running amplified DNA from all these cases on a high-resolution gel, mixed infection was observed in 12 cases (26.7%, 12/45). Forty three of 45 cases included in this study had previously been typed at mtLSU rRNA and internal transcribed spacer (ITS) region by our group. In the present study, the genotypes at those two regions were combined with UCS repeat patterns to construct allelic profiles of 43 cases. A total of 36 allelic profiles were observed in 43 isolates indicating high genetic variability. A statistically significant association was observed between mtLSU rRNA genotype 1, ITS type Ea and UCS repeat pattern 4. Copyright © 2012 Elsevier B.V. All rights reserved.
Evolution and selection of Rhg1, a copy-number variant nematode-resistance locus

PubMed Central

Lee, Tong Geon; Kumar, Indrajit; Diers, Brian W; Hudson, Matthew E

2015-01-01

The soybean cyst nematode (SCN) resistance locus Rhg1 is a tandem repeat of a 31.2 kb unit of the soybean genome. Each 31.2-kb unit contains four genes. One allele of Rhg1, Rhg1-b, is responsible for protecting most US soybean production from SCN. Whole-genome sequencing was performed, and PCR assays were developed to investigate allelic variation in sequence and copy number of the Rhg1 locus across a population of soybean germplasm accessions. Four distinct sequences of the 31.2-kb repeat unit were identified, and some Rhg1 alleles carry up to three different types of repeat unit. The total number of copies of the repeat varies from 1 to 10 per haploid genome. Both copy number and sequence of the repeat correlate with the resistance phenotype, and the Rhg1 locus shows strong signatures of selection. Significant linkage disequilibrium in the genome outside the boundaries of the repeat allowed the Rhg1 genotype to be inferred using high-density single nucleotide polymorphism genotyping of 15 996 accessions. Over 860 germplasm accessions were found likely to possess Rhg1 alleles. The regions surrounding the repeat show indications of non-neutral evolution and high genetic variability in populations from different geographic locations, but without evidence of fixation of the resistant genotype. A compelling explanation of these results is that balancing selection is in operation at Rhg1. PMID:25735447
SSR allelic variation in almond (Prunus dulcis Mill.).

PubMed

Xie, Hua; Sui, Yi; Chang, Feng-Qi; Xu, Yong; Ma, Rong-Cai

2006-01-01

Sixteen SSR markers including eight EST-SSR and eight genomic SSRs were used for genetic diversity analysis of 23 Chinese and 15 international almond cultivars. EST- and genomic SSR markers previously reported in species of Prunus, mainly peach, proved to be useful for almond genetic analysis. DNA sequences of 117 alleles of six of the 16 SSR loci were analysed to reveal sequence variation among the 38 almond accessions. For the four SSR loci with AG/CT repeats, no insertions or deletions were observed in the flanking regions of the 98 alleles sequenced. Allelic size variation of these loci resulted exclusively from differences in the structures of repeat motifs, which involved interruptions or occurrences of new motif repeats in addition to varying number of AG/CT repeats. Some alleles had a high number of uninterrupted repeat motifs, indicating that SSR mutational patterns differ among alleles at a given SSR locus within the almond species. Allelic homoplasy was observed in the SSR loci because of base substitutions, interruptions or compound repeat motifs. Substitutions in the repeat regions were found at two SSR loci, suggesting that point mutations operate on SSRs and hinder the further SSR expansion by introducing repeat interruptions to stabilize SSR loci. Furthermore, it was shown that some potential point mutations in the flanking regions are linked with new SSR repeat motif variation in almond and peach.
The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor

PubMed Central

Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.

2000-01-01

In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C. fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C. fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C. fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC).

PubMed

Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M

2012-02-01

A novel clustered regularly interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] was identified in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, a leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by non-repetitive unique spacer regions of similar length (26-31 bp), and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by non-repetitive unique spacer regions occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
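The repeat-spacer layout described in this abstract (fixed-length repeats separated by unique spacers) can be illustrated with a short sketch that measures the spacers between consecutive copies of a known repeat. It is not the authors' method: it assumes the repeat consensus is already known and matches it exactly, whereas the repeats reported here are only 92-100% similar, so real data would need mismatch-tolerant matching. The function name `spacer_lengths` and the toy sequence are hypothetical.

```python
def spacer_lengths(sequence, repeat):
    """Return the lengths of the spacers between consecutive exact copies
    of `repeat` in `sequence`, scanning left to right without overlaps."""
    starts = []
    pos = sequence.find(repeat)
    while pos != -1:
        starts.append(pos)
        # Resume the search just past the repeat that was found.
        pos = sequence.find(repeat, pos + len(repeat))
    return [b - (a + len(repeat)) for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    toy_repeat = "GTTTC"  # stand-in for a 36 bp consensus repeat
    toy_array = "AA" + toy_repeat + "ACGTAC" + toy_repeat + "TTGA" + toy_repeat + "CC"
    print(spacer_lengths(toy_array, toy_repeat))
```

For an array like the one reported for CF89-12, such a scan would be expected to return eleven spacer lengths for twelve repeats, each in the 26-31 bp range given in the abstract.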
Comparative Chloroplast Genomics of Gossypium Species: Insights Into Repeat Sequence Variations and Phylogeny

PubMed Central

Wu, Ying; Liu, Fang; Yang, Dai-Gang; Li, Wei; Zhou, Xiao-Jian; Pei, Xiao-Yu; Liu, Yan-Gai; He, Kun-Lun; Zhang, Wen-Sheng; Ren, Zhong-Ying; Zhou, Ke-Hai; Ma, Xiong-Feng; Li, Zhong-Hu

2018-01-01

Cotton is one of the most economically important fiber crop plants worldwide. The genus Gossypium contains a single allotetraploid group (AD) and eight diploid genome groups (A–G and K). However, the evolution of repeat sequences in the chloroplast genomes and the phylogenetic relationships of Gossypium species are unclear. Thus, we determined the variations in the repeat sequences and the evolutionary relationships of 40 cotton chloroplast genomes, which represented the most diverse in the genus, including five newly sequenced diploid species, i.e., G. nandewarense (C1-n), G. armourianum (D2-1), G. lobatum (D7), G. trilobum (D8), and G. schwendimanii (D11), and an important semi-wild race of upland cotton, G. hirsutum race latifolium (AD1). The genome structure, gene order, and GC content of cotton species were similar to those of other higher plant plastid genomes. In total, 2860 long sequence repeats (>10 bp in length) were identified, where the F-genome species had the largest number of repeats (G. longicalyx F1: 108) and E-genome species had the lowest (G. stocksii E1: 53). Large-scale repeat sequences possibly enrich the genetic information and maintain genome stability in cotton species. We also identified 10 divergence hotspot regions, i.e., rpl33-rps18, psbZ-trnG (GCC), rps4-trnT (UGU), trnL (UAG)-rpl32, trnE (UUC)-trnT (GGU), atpE, ndhI, rps2, ycf1, and ndhF, which could be useful molecular genetic markers for future population genetics and phylogenetic studies. Site-specific selection analysis showed that some of the coding sites of 10 chloroplast genes (atpB, atpE, rps2, rps3, petB, petD, ccsA, cemA, ycf1, and rbcL) were under protein sequence evolution. Phylogenetic analysis based on the whole plastomes suggested that the Gossypium species grouped into six previously identified genetic clades. Interestingly, all 13 D-genome species clustered into a strong monophyletic clade. 
Unexpectedly, the cotton species with C, G, and K-genomes were admixed and nested in a large clade, which could have been due to their recent radiation, incomplete lineage sorting, and introgression hybridization among different cotton lineages. In conclusion, the results of this study provide new insights into the evolution of repeat sequences in chloroplast genomes and interspecific relationships in the genus Gossypium. PMID:29619041
[Molecular cloning and characterization of a novel Clonorchis sinensis antigenic protein containing tandem repeat sequences].

PubMed

Liu, Qian; Xu, Xue-Nian; Zhou, Yan; Cheng, Na; Dong, Yu-Ting; Zheng, Hua-Jun; Zhu, Yong-Qiang; Zhu, Yong-Qiang

2013-08-01

The aim was to find and clone new antigen genes from the lambda-ZAP cDNA expression library of adult Clonorchis sinensis and to determine the immunological characteristics of the recombinant proteins. The cDNA expression library of adult C. sinensis was screened with pooled sera of clonorchiasis patients. The sequences of the positive phage clones were compared with the sequences in the EST database, and the full-length sequence of the gene (Cs22 gene) was obtained by RT-PCR. cDNA fragments containing two and three copies of the tandem repeat sequence were generated by jumping PCR. The sequence encoding the mature peptide or the tandem repeat sequence was cloned into the prokaryotic expression vector pET28a (+) and then transformed into E. coli Rosetta DE3 cells for expression. The recombinant proteins (rCs22-2r, rCs22-3r, rCs22M-2r, and rCs22M-3r) were purified by His-bind-resin (Ni-NTA) affinity chromatography. The immunogenicity of rCs22-2r and rCs22-3r was identified by ELISA. To evaluate the immunological diagnostic value of rCs22-2r and rCs22-3r, serum samples from 35 clonorchiasis patients, 31 healthy individuals, 15 schistosomiasis patients, 15 paragonimiasis westermani patients and 13 cysticercosis patients were examined by ELISA. To locate antigenic determinants, the pooled sera of clonorchiasis patients and healthy persons were analyzed for specific antibodies by ELISA with the recombinant proteins rCs22M-2r and rCs22M-3r containing the tandem repeat sequences. The full-length sequence of the Cs22 antigen gene of C. sinensis was obtained. It contained 13 tandem repeats of the sequence EQQDGDEEGMGGDGGRGKEKGKVEGEDGAGEQKEQA. Bioinformatics analysis indicated that the protein (Cs22) belonged to the GPI-anchored protein family. The recombinant proteins rCs22-2r and rCs22-3r showed a certain level of immunogenicity. In ELISA with purified PrCs22-2r or PrCs22-3r as the coating antigen, the positive rate with both proteins was 45.7% (16/35) for sera of clonorchiasis patients and 3.2% (1/31) for sera of healthy persons. There was no cross reaction with sera of schistosomiasis and cysticercosis patients. Cross reaction occurred with 1 of 15 sera of paragonimiasis westermani patients. The recombinant proteins rCs22M-2r and rCs22M-3r, which contained only tandem repeats, were specifically recognized by pooled sera of clonorchiasis patients. The Cs22 antigen gene of Clonorchis sinensis was obtained, and the recombinant proteins have certain diagnostic value. The antigenic determinant is located in the tandem repeat sequences.
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.

PubMed

Maheshwari, Shamoni; Ishii, Takayoshi; Brown, C Titus; Houben, Andreas; Comai, Luca

2017-03-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.
Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

PubMed

Lakshmikumaran, M; Negi, M S

1994-03-01

Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.

The yeast DNA ligase gene CDC9 is controlled by six orientation specific upstream activating sequences that respond to cellular proliferation but which alone cannot mediate cell cycle regulation.

PubMed Central

White, J H; Johnson, A L; Lowndes, N F; Johnston, L H

1991-01-01

By fusing the CDC9 structural gene to the PGK upstream sequences and the CDC9 upstream to lacZ, we showed that the cell cycle expression of CDC9 is largely due to transcriptional regulation. To investigate the role of six ATGATT upstream repeats in CDC9 regulation, synthetic copies of the sequence were attached to a heterologous gene. The repeats stimulated transcription strongly and additively, but, unlike conventional yeast UAS elements, only when present in one orientation. Transcription driven by the repeats declines in cells held at START of the cell cycle or in stationary phase, as occurs with CDC9. However, the repeats by themselves cannot impart cell cycle regulation to a heterologous gene. CDC9 may therefore be controlled by an activating system operating through the repeats that is sensitive to cellular proliferation and a separate mechanism that governs the periodic expression in the cell cycle. PMID:1901644
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools.

PubMed

Guizard, Sébastien; Piégu, Benoît; Arensburger, Peter; Guillou, Florian; Bigot, Yves

2016-08-19

The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12 %) than the sequenced genomes of many vertebrate species (30-55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. Such de novo methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.
Rational design of alpha-helical tandem repeat proteins with closed architectures

PubMed Central

Doyle, Lindsey; Hallinan, Jazmine; Bolduc, Jill; Parmeggiani, Fabio; Baker, David; Stoddard, Barry L.; Bradley, Philip

2015-01-01

Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials [1,2]. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks [3,4]. The overall architecture of tandem repeat protein structures – which is dictated by the internal geometry and local packing of the repeat building blocks – is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners [5–9], to closed and compact conformations with internal cavities suitable for small molecule binding and catalysis [10]. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. We have applied these methods to design a series of closed alpha-solenoid [11] repeat structures (alpha-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the N- and C-termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering [12–20], our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed alpha-solenoid repeats with a left-handed helical architecture that – to our knowledge – is not yet present in the protein structure database [21]. PMID:26675735
The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

PubMed

Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

2017-01-01

The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
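The reported gene flow (Nm) and population differentiation (GST) values can be cross-checked against each other. The sketch below assumes one commonly used estimator, Nm = 0.5(1 − GST)/GST; the abstract does not say which estimator the authors applied, so both the formula choice and the function name `gene_flow_from_gst` are assumptions made for illustration.

```python
def gene_flow_from_gst(gst: float) -> float:
    """Estimate gene flow (Nm) from GST.

    Assumed estimator: Nm = 0.5 * (1 - GST) / GST.
    The abstract does not state which formula was used.
    """
    if not 0.0 < gst < 1.0:
        raise ValueError("GST must lie strictly between 0 and 1")
    return 0.5 * (1.0 - gst) / gst


if __name__ == "__main__":
    # GST value as reported in the abstract
    print(gene_flow_from_gst(0.2500))
```

Under this assumed estimator, GST = 0.2500 gives Nm = 1.5, which agrees with the reported Nm = 1.4996 to within rounding of GST.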
Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

PubMed Central

Huang, Yongjie; Mrázek, Jan

2014-01-01

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
Simple sequence repeat markers that identify Claviceps species and strains

USDA-ARS's Scientific Manuscript database

Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop simple sequence repeat (SSR) markers for rapid identification of C. purpurea. SSRs were desi...
Biological sequence compression algorithms.

PubMed

Matsumoto, T; Sadakane, K; Imai, H

2000-01-01

Today, more and more DNA sequences are becoming available. The information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored or communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known. One is palindromes, or reverse complements, and the other is approximate repeats. Several specific algorithms for DNA sequences that use these structures can compress them to less than two bits per symbol. In this paper, we improve CTW so that it can exploit the characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat and a palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
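The preprocessing step described above (find an earlier repeat or reverse-complement palindrome and encode it as a length and a distance) can be illustrated with a minimal sketch. It is not the authors' implementation: it handles exact matches only, whereas the paper searches for approximate repeats with dynamic programming, and the function name `best_match`, the seed length `k`, and the return format are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

_COMP = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(_COMP[b] for b in reversed(seq))


def best_match(seq: str, pos: int, k: int = 4) -> Optional[Tuple[str, int, int]]:
    """Find the longest exact repeat or reverse-complement match starting at pos.

    Only text before pos may serve as the source of a match. Returns
    (kind, length, distance) or None. Exact matching only: a simplification
    of the approximate search described in the abstract.
    """
    if pos + k > len(seq):
        return None
    # Hash table of all k-mers that lie entirely before pos.
    index: Dict[str, List[int]] = {}
    for s in range(0, pos - k + 1):
        index.setdefault(seq[s:s + k], []).append(s)

    seed = seq[pos:pos + k]
    best: Optional[Tuple[str, int, int]] = None

    # Forward repeat: seq[pos + t] == seq[s + t]
    for s in index.get(seed, []):
        length = k
        while pos + length < len(seq) and seq[s + length] == seq[pos + length]:
            length += 1
        if best is None or length > best[1]:
            best = ("repeat", length, pos - s)

    # Palindrome: seq[pos + t] == complement(seq[j - t]), reading backwards from j
    for s in index.get(reverse_complement(seed), []):
        j = s + k - 1  # last base of the earlier k-mer
        length = k
        while (pos + length < len(seq) and j - length >= 0
               and seq[pos + length] == _COMP[seq[j - length]]):
            length += 1
        if best is None or length > best[1]:
            best = ("palindrome", length, pos - j)

    return best
```

An encoder built on this idea would emit the (kind, length, distance) triple in place of the matched bases whenever the match is long enough to pay for itself, and fall back to symbol-by-symbol coding otherwise.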
Alu repeats: A source for the genesis of primate microsatellites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arcot, S.S.; Batzer, M.A.; Wang, Zhenyuan

1995-09-01

As a result of their abundance, relatively uniform distribution, and high degree of polymorphism, microsatellites and minisatellites have become valuable tools in genetic mapping, forensic identity testing, and population studies. In recent years, a number of microsatellite repeats have been found to be associated with Alu interspersed repeated DNA elements. The association of an Alu element with a microsatellite repeat could result from the integration of an Alu element within a preexisting microsatellite repeat. Alternatively, Alu elements could have a direct role in the origin of microsatellite repeats. Errors introduced during reverse transcription of the primary transcript derived from an Alu "master" gene or the accumulation of random mutations in the middle A-rich regions and oligo(dA)-rich tails of Alu elements after insertion and subsequent expansion and contraction of these sequences could result in the genesis of a microsatellite repeat. We have tested these hypotheses by a direct evolutionary comparison of the sequences of some recent Alu elements that are found only in humans and are absent from nonhuman primates, as well as some older Alu elements that are present at orthologous positions in a number of nonhuman primates. The origin of "young" Alu insertions, absence of sequences that resemble microsatellite repeats at the orthologous loci in chimpanzees, and the gradual expansion of microsatellite repeats in some old Alu repeats at orthologous positions within the genomes of a number of nonhuman primates suggest that Alu elements are a source for the genesis of primate microsatellite repeats. 48 refs., 5 figs., 3 tabs.
Short-Sequence DNA Repeats in Prokaryotic Genomes

PubMed Central

van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri

1998-01-01

Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10⁻⁴ per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442
HD iPSC-derived neural progenitors accumulate in culture and are susceptible to BDNF withdrawal due to glutamate toxicity

PubMed Central

Mattis, Virginia B.; Tom, Colton; Akimov, Sergey; Saeedian, Jasmine; Østergaard, Michael E.; Southwell, Amber L.; Doty, Crystal N.; Ornelas, Loren; Sahabian, Anais; Lenaeus, Lindsay; Mandefro, Berhan; Sareen, Dhruv; Arjomand, Jamshid; Hayden, Michael R.; Ross, Christopher A.; Svendsen, Clive N.

2015-01-01

Huntington's disease (HD) is a fatal neurodegenerative disease, caused by expansion of polyglutamine repeats in the Huntingtin gene, with longer expansions leading to earlier ages of onset. The HD iPSC Consortium has recently reported a new in vitro model of HD based on the generation of induced pluripotent stem cells (iPSCs) from HD patients and controls. The current study has furthered the disease in a dish model of HD by generating new non-integrating HD and control iPSC lines. Both HD and control iPSC lines can be efficiently differentiated into neurons/glia; however, the HD-derived cells maintained a significantly greater number of nestin-expressing neural progenitor cells compared with control cells. This cell population showed enhanced vulnerability to brain-derived neurotrophic factor (BDNF) withdrawal in the juvenile-onset HD (JHD) lines, which appeared to be CAG repeat-dependent and mediated by the loss of signaling from the TrkB receptor. It was postulated that this increased death following BDNF withdrawal may be due to glutamate toxicity, as the N-methyl-d-aspartate (NMDA) receptor subunit NR2B was up-regulated in the cultures. Indeed, blocking glutamate signaling, not just through the NMDA but also mGlu and AMPA/Kainate receptors, completely reversed the cell death phenotype. This study suggests that the pathogenesis of JHD may involve in part a population of ‘persistent’ neural progenitors that are selectively vulnerable to BDNF withdrawal. Similar results were seen in adult hippocampal-derived neural progenitors isolated from the BACHD model mouse. Together, these results provide important insight into HD mechanisms at early developmental time points, which may suggest novel approaches to HD therapeutics. PMID:25740845
The role of testosterone in coordinating male life history strategies: The moderating effects of the androgen receptor CAG repeat polymorphism.

PubMed

Gettler, Lee T; Ryan, Calen P; Eisenberg, Dan T A; Rzhetskaya, Margarita; Hayes, M Geoffrey; Feranil, Alan B; Bechayda, Sonny Agustin; Kuzawa, Christopher W

2017-01-01

Partnered fathers often have lower testosterone than single non-parents, which is theorized to relate to elevated testosterone (T) facilitating competitive behaviors and lower T contributing to nurturing. Cultural and individual factors moderate the expression of such psychobiological profiles. Less is known about genetic variation's role in individual psychobiological responses to partnering and fathering, particularly as related to T. We examined the exon 1 CAG (polyglutamine) repeat (CAGn) within the androgen receptor (AR) gene. AR CAGn shapes T's effects after it binds to AR by affecting AR transcriptional activity. Thus, this polymorphism is a strong candidate to influence individual-level profiles of "androgenicity." While males with a highly androgenic profile are expected to engage in a more competitive-oriented life history strategy, low androgenic men are at increased risk of depression, which could lead to similar outcomes for certain familial dynamics, such as marriage stability and parenting. Here, in a large longitudinal study of Filipino men (n=683), we found that men who had high androgenicity (elevated T and shorter CAGn) or low androgenicity (lower T and longer CAGn) showed elevated likelihood of relationship instability over the 4.5-year study period and were also more likely to be relatively uninvolved with childcare as fathers. We did not find that CAGn moderated men's T responses to the fatherhood transition. In total, our results provide evidence for invested fathering and relationship stability at intermediate levels of androgenicity and help inform our understanding of variation in male reproductive strategies and the individual hormonal and genetic differences that underlie it. Copyright © 2016 Elsevier Inc. All rights reserved.
GATA simple sequence repeats function as enhancer blocker boundaries.

PubMed

Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K

2013-01-01

Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate their role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in transgenic as well as native contexts of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
CRISPRDetect: A flexible algorithm to define CRISPR arrays.

PubMed

Biswas, Ambarish; Staals, Raymond H J; Morales, Sergio E; Fineran, Peter C; Brown, Chris M

2016-05-17

CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near-identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as providing further support for small, medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.
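The array structure described above, identical repeats separated by spacers of bounded length, can be shown with a deliberately naive scan. This is not the CRISPRDetect algorithm: it finds exact repeats of a fixed length only and does none of the variant-repeat extension, direction correction or boundary refinement the abstract lists. The function name `find_crispr_like_arrays` and all default parameter values are hypothetical, chosen small so the toy example is easy to follow.

```python
from typing import List, Tuple

Array = Tuple[str, List[int], List[str]]  # (repeat, repeat start positions, spacers)


def find_crispr_like_arrays(seq: str, repeat_len: int = 8, min_spacer: int = 4,
                            max_spacer: int = 12, min_repeats: int = 3) -> List[Array]:
    """Report runs of an exact repeat separated by spacers of bounded length."""
    arrays: List[Array] = []
    i = 0
    while i + repeat_len <= len(seq):
        repeat = seq[i:i + repeat_len]
        starts = [i]
        cursor = i + repeat_len  # first base after the most recent repeat copy
        while True:
            lo = cursor + min_spacer
            hi = cursor + max_spacer
            # str.find only reports matches lying entirely inside seq[lo:hi + repeat_len]
            nxt = seq.find(repeat, lo, hi + repeat_len)
            if nxt == -1:
                break
            starts.append(nxt)
            cursor = nxt + repeat_len
        if len(starts) >= min_repeats:
            spacers = [seq[a + repeat_len:b] for a, b in zip(starts, starts[1:])]
            arrays.append((repeat, starts, spacers))
            i = cursor  # continue scanning after the array
        else:
            i += 1
    return arrays


if __name__ == "__main__":
    r = "GTTTCAGA"  # toy repeat, not a real CRISPR repeat
    toy = "TT" + r + "AAAAC" + r + "CCTTGG" + r + "TGCATGC" + r + "AA"
    for repeat, starts, spacers in find_crispr_like_arrays(toy):
        print(repeat, starts, spacers)
```

Real CRISPR repeats are typically a few tens of nucleotides long and may carry point mutations, which is why production tools score near-identical repeats instead of requiring exact copies.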
Divergence in centromere structure distinguishes related genomes in Coix lacryma-jobi and its wild relative.

PubMed

Han, Yonghua; Wang, Guixiang; Liu, Zhao; Liu, Jinhua; Yue, Wei; Song, Rentao; Zhang, Xueyong; Jin, Weiwei

2010-02-01

Knowledge about the composition and structure of centromeres is critical for understanding how centromeres perform their functional roles. Here, we report the sequences of one centromere-associated bacterial artificial chromosome clone from a Coix lacryma-jobi library. Two Ty3/gypsy-class retrotransposons, centromeric retrotransposon of C. lacryma-jobi (CRC) and peri-centromeric retrotransposon of C. lacryma-jobi, and a (peri)centromere-specific tandem repeat with a unit length of 153 bp were identified. The CRC is highly homologous to centromere-specific retrotransposons reported in grass species. An 80-bp DNA region in the 153-bp satellite repeat was found to be conserved in centromeric satellite repeats from maize, rice, and pearl millet. Fluorescence in situ hybridization showed that the three repetitive sequences were located in (peri-)centromeric regions of both C. lacryma-jobi and Coix aquatica. However, the 153-bp satellite repeat was only detected on 20 out of the 30 chromosomes in C. aquatica. Immunostaining with an antibody against rice CENH3 indicates that the 153-bp satellite repeat and CRC might both be major components of functional centromeres, but not all the 153-bp satellite repeats or CRC sequences are associated with CENH3. The evolution of centromeric repeats of C. lacryma-jobi during polyploidization is discussed.
CRISPR Detection From Short Reads Using Partial Overlap Graphs.

PubMed

Ben-Bassat, Ilan; Chor, Benny

2016-06-01

Clustered regularly interspaced short palindromic repeats (CRISPR) are structured regions in bacterial and archaeal genomes, which are part of an adaptive immune system against phages. CRISPRs are important for many microbial studies and are playing an essential role in current gene editing techniques. As such, they attract substantial research interest. The exponential growth in the amount of bacterial sequence data in recent years enables the exploration of CRISPR loci in more and more species. Most of the automated tools that detect CRISPR loci rely on fully assembled genomes. However, many assemblers do not handle repetitive regions successfully. The first tool to work directly on raw sequence data is Crass, which requires reads that are long enough to contain two copies of the same repeat. We present a method to identify CRISPR repeats from raw sequence data of short reads. The algorithm is based on an observation differentiating CRISPR repeats from other types of repeats, and it involves a series of partial constructions of the overlap graph. This enables us to avoid many of the difficulties that assemblers face, as we merely aim to identify the repeats that belong to CRISPR loci. A preliminary implementation of the algorithm shows good results and detects CRISPR repeats in cases where other existing tools fail to do so.
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants

PubMed Central

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
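The definition at the start of this abstract (motifs of 1 to 6 nucleotides repeated in tandem) translates directly into a regular-expression scan. The sketch below finds perfect SSRs only; imperfect and compound SSRs, which the database also stores, are not handled. The abstract does not give the mining tool or its thresholds, so the function names and the minimum repeat counts in `DEFAULT_MIN_REPEATS` are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS: Dict[int, int] = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_perfect_ssrs(seq: str,
                      min_repeats: Optional[Dict[int, int]] = None
                      ) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copy_number) for each perfect SSR found in seq."""
    thresholds = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits: List[Tuple[int, str, int]] = []
    for size, min_rep in sorted(thresholds.items()):
        # One motif copy followed by at least (min_rep - 1) further identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. poly-A reported as an 'AA' repeat
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    toy = "CCG" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
    for start, motif, copies in find_perfect_ssrs(toy):
        print(start, motif, copies)
```

Counting hits per motif length over a whole genome would give the kind of per-class tallies the abstract reports, although the actual numbers depend heavily on the thresholds chosen.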
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants.

PubMed

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1-6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ © The Author(s) 2014. Published by Oxford University Press.
Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

PubMed

Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

2012-01-01

The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry, accounting for 35% of all SSRs, and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.
Target Site Recognition by a Diversity-Generating Retroelement

PubMed Central

Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

2011-01-01

Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site, of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH (initiation of mutagenic homing) and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
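The adenine-specific diversification described above can be mimicked with a toy simulation: copy the template repeat (TR) and let each adenine become a random nucleotide in the resulting variable repeat (VR). This is an illustration of the sequence-level outcome only, not a model of the retrotransposition mechanism; the function name `mutagenic_homing` and the `mutation_rate` parameter are hypothetical.

```python
import random
from typing import Optional


def mutagenic_homing(template_repeat: str, rng: Optional[random.Random] = None,
                     mutation_rate: float = 1.0) -> str:
    """Return a simulated variable repeat (VR) derived from a template repeat (TR).

    Each adenine in TR is replaced, with probability mutation_rate, by a
    nucleotide drawn uniformly from A, C, G, T. All other bases are copied
    unchanged. mutation_rate is a hypothetical knob, not a measured value.
    """
    rng = rng or random.Random()
    out = []
    for base in template_repeat.upper():
        if base == "A" and rng.random() < mutation_rate:
            out.append(rng.choice("ACGT"))  # may by chance stay 'A'
        else:
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    tr = "GACATTAGCAACGT"  # toy template, not the BPP-1 sequence
    for _ in range(3):
        print(mutagenic_homing(tr))
```

Because only positions that are adenine in TR can change, the set of possible VR sequences is fixed by where the adenines sit in the template, which is what makes the diversity targeted.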

Centromere and telomere sequence alterations reflect the rapid genome evolution within the carnivorous plant genus Genlisea.

PubMed

Tran, Trung D; Cao, Hieu X; Jovtchev, Gabriele; Neumann, Pavel; Novák, Petr; Fojtová, Miloslava; Vu, Giang T H; Macas, Jiří; Fajkus, Jiří; Schubert, Ingo; Fuchs, Joerg

2015-12-01

Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

PubMed Central

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

PubMed

Oggioni, M R; Claverys, J P

1999-10-01

A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.
Analysis of SINE and LINE repeat content of Y chromosomes in the platypus, Ornithorhynchus anatinus.

PubMed

Kortschak, R Daniel; Tsend-Ayush, Enkhjargal; Grützner, Frank

2009-01-01

Monotremes feature an extraordinary sex-chromosome system that consists of five X and five Y chromosomes in males. These sex chromosomes share homology with bird sex chromosomes but no homology with the therian X. The genome of a female platypus was recently completed, providing unique insights into sequence and gene content of autosomes and X chromosomes, but no Y-specific sequence has so far been analysed. Here we report the isolation, sequencing and analysis of approximately 700 kb of sequence of the non-recombining regions of Y2, Y3 and Y5, which revealed differences in base composition and repeat content between autosomes and sex chromosomes, and within the sex chromosomes themselves. This provides the first insights into repeat content of Y chromosomes in platypus, which overall show similar patterns of repeat composition to Y chromosomes in other species. Interestingly, we also observed differences between the various Y chromosomes, and in combination with timing and activity patterns we provide an approach that can be used to examine the evolutionary history of the platypus sex-chromosome chain.
Efficient production of artificially designed gelatins with a Bacillus brevis system.

PubMed

Kajino, T; Takahashi, H; Hirai, M; Yamada, Y

2000-01-01

Artificially designed gelatins comprising tandemly repeated 30-amino-acid peptide units derived from human alphaI collagen were successfully produced with a Bacillus brevis system. The DNA encoding the peptide unit was synthesized by taking into consideration the codon usage of the host cells, but no clones having a tandemly repeated gene were obtained through the above-mentioned strategy. Minirepeat genes could be selected in vivo from a mixture of every possible sequence encoding an artificial gelatin by randomly ligating the mixed sequence unit and transforming it into Escherichia coli. Larger repeat genes constructed by connecting minirepeat genes obtained by in vivo selection were also stable in the expression host cells. Gelatins derived from the eight-unit and six-unit repeat genes were extracellularly produced at the level of 0.5 g/liter and easily purified by ammonium sulfate fractionation and anion-exchange chromatography. The purified artificial gelatins had the predicted N-terminal sequences and amino acid compositions and a sol-gel property similar to that of the native gelatin. These results suggest that the selection of a repeat unit sequence stable in an expression host is a shortcut for the efficient production of repetitive proteins and that it can conveniently be achieved by the in vivo selection method. This study revealed the possible industrial application of artificially designed repetitive proteins.
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

PubMed

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-06-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization of genetic sequence variation of 58 STR loci in four major population groups.

PubMed

Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

2016-11-01

Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
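
The difference between length-based and sequence-based allele calls described in this record can be illustrated with a minimal Python sketch. It is not the STRait Razor workflow: the function names, the tetranucleotide motif length, and the example repeat regions are all hypothetical, and the nominal allele is simplified to a whole number of repeat units (partial repeats are ignored).

```python
from collections import Counter

def length_allele(repeat_region, motif_len=4):
    """Nominal (length-based) allele: whole repeat units in the region.

    Simplification: partial repeats are ignored.
    """
    return len(repeat_region) // motif_len

def count_alleles(repeat_regions, motif_len=4):
    """Tally alleles two ways: by nominal length and by exact sequence."""
    by_length = Counter(length_allele(r, motif_len) for r in repeat_regions)
    by_sequence = Counter(repeat_regions)
    return by_length, by_sequence

# Hypothetical repeat regions from three individuals (not real data).
regions = [
    "TCTA" * 10,                      # 10 uninterrupted repeats
    "TCTA" * 4 + "TCTG" + "TCTA" * 5, # same length, one internal variant
    "TCTA" * 11,                      # 11 repeats
]
by_length, by_sequence = count_alleles(regions)
print("alleles by length:  ", len(by_length))
print("alleles by sequence:", len(by_sequence))
```

In this toy input the first two regions are indistinguishable by length but differ by sequence, which is the kind of added resolution the abstract reports for MPS over capillary electrophoresis.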
Construction of a small Mus musculus repetitive DNA library: identification of a new satellite sequence in Mus musculus.

PubMed Central

Pietras, D F; Bennett, K L; Siracusa, L D; Woodworth-Gutai, M; Chapman, V M; Gross, K W; Kane-Haas, C; Hastie, N D

1983-01-01

We report the construction of a small library of recombinant plasmids containing Mus musculus repetitive DNA inserts. The repetitive cloned fraction was derived from denatured genomic DNA by reassociation to a Cot value at which repetitive, but not unique, sequences have reannealed followed by exhaustive S1 nuclease treatment to degrade single stranded DNA. Initial characterizations of this library by colony filter hybridizations have led to the identification of a previously undetected M. musculus minor satellite as well as to clones containing M. musculus major satellite sequences. This new satellite is repeated 10-20 times less than the major satellite in the M. musculus genome. It has a repeat length of 130 nucleotides compared with the M. musculus major satellite with a repeat length of 234 nucleotides. Sequence analysis of the minor satellite has shown that it has a 29 base pair region with extensive homology to one of the major satellite repeating subunits. We also show by in situ hybridization that this minor satellite sequence is located at the centromeres and possibly the arms of at least half the M. musculus chromosomes. Sequences related to the minor satellite have been found in the DNA of a related Mus species, Mus spretus, and may represent the major satellite of that species. PMID:6314268
Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

PubMed Central

Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

2015-01-01

We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.

PubMed

Šatović, Eva; Plohl, Miroslav

2017-10-01

Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity, but lack any nucleotide sequence similarity to each other. In a search for related elements in databases, blast search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
Short intronic repeat sequences facilitate circular RNA production

PubMed Central

Liang, Dongming

2014-01-01

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217
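
The base-pairing requirement between the flanking intronic repeats can be sketched in Python as a search for the longest stretch of one intron that is perfectly complementary to the other. This is only an illustration under stated assumptions: the function names and example sequences are hypothetical, only Watson-Crick pairs are counted (no G-U wobble, no mismatches), and, as the abstract itself notes, pairing potential alone does not determine whether an exon circularizes.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]

def longest_complementary_stretch(upstream, downstream):
    """Length of the longest contiguous perfect duplex the two introns could form.

    Implemented as a longest-common-substring search between `upstream`
    and the reverse complement of `downstream`.
    """
    target = reverse_complement(downstream)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in upstream:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

# Hypothetical 30-nt repeat placed in inverted orientation in the two introns.
repeat = "GCAUCGGAUUACGCUAGGCAUCCGAUAGGC"
upstream_intron = "AAAA" + repeat + "CC"
downstream_intron = "UU" + reverse_complement(repeat) + "GG"
print(longest_complementary_stretch(upstream_intron, downstream_intron))
```

A threshold in the ~30- to 40-nucleotide range mentioned in the abstract could be applied to the returned length, but any such cutoff would be a screening heuristic rather than a predictor of circular RNA production.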
Genes implicated in the pathogenesis of spinocerebellar ataxias.

PubMed

Wüllner, Ullrich

2003-12-01

The degenerative ataxias comprise a number of heterogeneous diseases, many of which are genetically determined. Loss of cerebellar Purkinje and brainstem neurons as well as degeneration of spinal pathways are the major morphological findings of most ataxias, but neuronal loss may also affect the basal ganglia and the retina. While the degenerative ataxias initially were classified on a neuropathological basis, more recent classifications focused on clinical hallmarks and the mode of inheritance, separating inherited, sporadic and symptomatic ataxias. Genetic linkage analysis and molecular genetic studies identified various genotypes and revealed genetic heterogeneity of the autosomal dominant ataxias (ADCA), which on the basis of the genotypes are now classified as spinocerebellar ataxias (SCA1-22). Based on pathogenesis these disorders fall into three discrete groups: the polyglutamine disorders, SCA1-3, 7 and 17; the channelopathies, SCA6 and episodic ataxia types 1 and 2 (EA1-2); and SCA8, 10 and 12, which result from repeat expansions outside the coding regions and reduce gene expression. The etiologies of SCAs 4, 5, 9, 11, 13-16, 19, 21 and 22 remain unknown as of today. The recent advances in the identification of the underlying gene defects of most of the inherited ataxias have opened new avenues to a better understanding of the molecular mechanisms leading to cellular dysfunction and cell death.
Ex vivo delivery of GDNF maintains motor function and prevents neuronal loss in a transgenic mouse model of Huntington's disease.

PubMed

Ebert, Allison D; Barber, Amelia E; Heins, Brittany M; Svendsen, Clive N

2010-07-01

Huntington's disease (HD) is an autosomal dominant disorder caused by expansion of polyglutamine repeats in the huntingtin gene leading to loss of striatal and cortical neurons followed by deficits in cognition and choreic movements. Growth factor delivery to the brain has shown promise in various models of neurodegenerative diseases, including HD, by reducing neuronal death and thus limiting motor impairment. Here we used mouse neural progenitor cells (mNPCs) as growth factor delivery vehicles in the N171-82Q transgenic mouse model of HD. mNPCs derived from the developing mouse striatum were isolated and infected with lentivirus expressing either glial cell line-derived neurotrophic factor (GDNF) or green fluorescent protein (GFP). Next, mNPCs(GDNF) or mNPCs(GFP) were transplanted bilaterally into the striatum of pre-symptomatic N171-82Q mice. We found that mNPCs(GDNF), but not mNPCs(GFP), maintained rotarod function and increased striatal neuron survival out to 3 months post-transplantation. Importantly, histological analysis showed GDNF expression through the duration of the experiment. Our data show that mNPCs(GDNF) can survive transplantation, secrete GDNF for several weeks and are able to maintain motor function in this model of HD. Copyright 2010 Elsevier Inc. All rights reserved.
Neonatal iron supplementation potentiates oxidative stress, energetic dysfunction and neurodegeneration in the R6/2 mouse model of Huntington's disease

PubMed Central

Berggren, Kiersten L.; Chen, Jianfang; Fox, Julia; Miller, Jonathan; Dodds, Lindsay; Dugas, Bryan; Vargas, Liset; Lothian, Amber; McAllum, Erin; Volitakis, Irene; Roberts, Blaine; Bush, Ashley I.; Fox, Jonathan H.

2015-01-01

Huntington’s disease (HD) is a progressive neurodegenerative disorder caused by a CAG repeat expansion that encodes a polyglutamine tract in huntingtin (htt) protein. Dysregulation of brain iron homeostasis, oxidative stress and neurodegeneration are consistent features of the HD phenotype. Therefore, environmental factors that exacerbate oxidative stress and iron dysregulation may potentiate HD. Iron supplementation in the human population is common during infant and adult-life stages. In this study, iron supplementation in neonatal HD mice resulted in deterioration of spontaneous motor running activity, elevated levels of brain lactate and oxidized glutathione consistent with increased energetic dysfunction and oxidative stress, and increased striatal and motor cortical neuronal atrophy, collectively demonstrating potentiation of the disease phenotype. Oxidative stress, energetic, and anatomic markers of degeneration were not affected in wild-type littermate iron-supplemented mice. Further, there was no effect of elevated iron intake on disease outcomes in adult HD mice. We have demonstrated an interaction between the mutant huntingtin gene and iron supplementation in neonatal HD mice. Findings indicate that elevated neonatal iron intake potentiates mouse HD and promotes oxidative stress and energetic dysfunction in brain. Neonatal-infant dietary iron intake level may be an environmental modifier of human HD. PMID:25703232
Basis of altered RNA-binding specificity by PUF proteins revealed by crystal structures of yeast Puf4p

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, Matthew T.; Higgin, Joshua J.; Hall, Traci M.Tanaka

2008-06-06

Pumilio/FBF (PUF) family proteins are found in eukaryotic organisms and regulate gene expression post-transcriptionally by binding to sequences in the 3' untranslated region of target transcripts. PUF proteins contain an RNA binding domain that typically comprises eight α-helical repeats, each of which recognizes one RNA base. Some PUF proteins, including yeast Puf4p, have altered RNA binding specificity and use their eight repeats to bind to RNA sequences with nine or ten bases. Here we report the crystal structures of Puf4p alone and in complex with a 9-nucleotide (nt) target RNA sequence, revealing that Puf4p accommodates an 'extra' nucleotide by modest adaptations allowing one base to be turned away from the RNA binding surface. Using structural information and sequence comparisons, we created a mutant Puf4p protein that preferentially binds to an 8-nt target RNA sequence over a 9-nt sequence and restores binding of each protein repeat to one RNA base.
Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhattacharya, Monolekha; Das, Amit Kumar, E-mail: amitk@hijli.iitkgp.ernet.in

Highlights: The regulatory sequences recognized by TcrX have been identified. The regulatory region comprises inverted repeats separated by a 30 bp region. The mode of binding of TcrX with the regulatory sequence is unique. The in silico TcrX-DNA docked model binds one of the inverted repeats. Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are comprised of inverted repeats separated by ~30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.
Structural features of the rice chromosome 4 centromere.

PubMed

Zhang, Yu; Huang, Yuchen; Zhang, Lei; Li, Ying; Lu, Tingting; Lu, Yiqi; Feng, Qi; Zhao, Qiang; Cheng, Zhukuan; Xue, Yongbiao; Wing, Rod A; Han, Bin

2004-01-01

A complete sequence of a chromosome centromere is necessary for fully understanding centromere function. We reported the sequence structures of the first complete rice chromosome centromere through sequencing a large insert bacterial artificial chromosome clone-based contig, which covered the rice chromosome 4 centromere. Complete sequencing of the 124-kb rice chromosome 4 centromere revealed that it consisted of 18 tracts of 379 tandemly arrayed repeats known as CentO and a total of 19 centromeric retroelements (CRs) but no unique sequences were detected. Four tracts, composed of 65 CentO repeats, were located in the opposite orientation, and 18 CentO tracts were flanked by 19 retroelements. The CRs were classified into four types, and the type I retroelements appeared to be more specific to rice centromeres. The preferential insert of the CRs among CentO repeats indicated that the centromere-specific retroelements may contribute to centromere expansion during evolution. The presence of three intact retrotransposons in the centromere suggests that they may be responsible for functional centromere initiation through a transcription-mediated mechanism.
Chromosome rearrangements via template switching between diverged repeated sequences

PubMed Central

Anand, Ranjith P.; Tsaponina, Olga; Greenwell, Patricia W.; Lee, Cheng-Sheng; Du, Wei; Petes, Thomas D.

2014-01-01

Recent high-resolution genome analyses of cancer and other diseases have revealed the occurrence of microhomology-mediated chromosome rearrangements and copy number changes. Although some of these rearrangements appear to involve nonhomologous end-joining, many must have involved mechanisms requiring new DNA synthesis. Models such as microhomology-mediated break-induced replication (MM-BIR) have been invoked to explain these rearrangements. We examined BIR and template switching between highly diverged sequences in Saccharomyces cerevisiae, induced during repair of a site-specific double-strand break (DSB). Our data show that such template switches are robust mechanisms that give rise to complex rearrangements. Template switches between highly divergent sequences appear to be mechanistically distinct from the initial strand invasions that establish BIR. In particular, such jumps are less constrained by sequence divergence and exhibit a different pattern of microhomology junctions. BIR traversing repeated DNA sequences frequently results in complex translocations analogous to those seen in mammalian cells. These results suggest that template switching among repeated genes is a potent driver of genome instability and evolution. PMID:25367035
Simple sequence repeat marker loci discovery using SSR primer.

PubMed

Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David

2004-06-12

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR repeat finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are parsed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
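
The screening step of such a pipeline (FASTA input, then SSR detection per sequence) can be sketched in Python as follows. This is not SPUTNIK's algorithm and not the SSR Primer script, which the abstract says is written in PERL: the regex approach, the `MIN_COPIES` thresholds, and all function names are assumptions chosen for illustration, and only perfect di-, tri- and tetranucleotide repeats are reported.

```python
import re

# Hypothetical thresholds: minimum copy number required per motif length.
MIN_COPIES = {2: 6, 3: 5, 4: 4}

def parse_fasta(text):
    """Parse FASTA-formatted text into {record_id: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {k: "".join(v) for k, v in records.items()}

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AGAG')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))

def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    hits = []
    for k, min_copies in MIN_COPIES.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Hypothetical input record (not real data).
fasta = ">contig1 example\nTTC" + "AG" * 7 + "CCT\n" + "AAG" * 5 + "GT\n"
for name, seq in parse_fasta(fasta).items():
    for start, motif, copies in find_ssrs(seq):
        print(name, start, motif, copies)
```

In a full pipeline, the coordinates of each hit would be used to extract flanking sequence and hand it to a primer design program such as Primer3; that step is omitted here because its input format is not described in the abstract.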
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

The central domain of bovine submaxillary mucin consists of over 50 tandem repeats of 329 amino acids. Chromosomal localization of the BSM1 gene and relations to ovine and porcine counterparts.

PubMed

Jiang, W; Gupta, D; Gallagher, D; Davis, S; Bhavanandan, V P

2000-04-01

We previously elucidated five distinct protein domains (I-V) for bovine submaxillary mucin, which is encoded by two genes, BSM1 and BSM2. Using Southern blot analysis, genomic cloning and sequencing of the BSM1 gene, we now show that the central domain (V) consists of approximately 55 tandem repeats of 329 amino acids and that domains III-V are encoded by a 58.4-kb exon, the largest exon known for all genes to date. The BSM1 gene was mapped by fluorescence in situ hybridization to the proximal half of chromosome 5 at bands q2.2-q2.3. The amino-acid sequences of six tandem repeats (two full and four partial) were found to have only 92-94% identities. We propose that the variability in the amino-acid sequences of the mucin tandem repeat is important for generating the combinatorial library of saccharides that are necessary for the protective function of mucins. The deduced peptide sequences of the central domain match those determined from the purified bovine submaxillary mucin and also show 68-94% identity to published peptide sequences of ovine submaxillary mucin. This indicates that the core protein of ovine submaxillary mucin is closely related to that of bovine submaxillary mucin and contains similar tandem repeats in the central domain. In contrast, the central domain of porcine submaxillary mucin is reported to consist of 81-amino-acid tandem repeats. However, both bovine submaxillary mucin and porcine submaxillary mucin contain similar N-terminal and C-terminal domains and the corresponding genes are in the conserved linkage regions of the respective genomes.
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo.

PubMed

Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar

2002-02-01

The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5 to 4 × 10^5, and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.
Repeat-aware modeling and correction of short read errors.

PubMed

Yang, Xiao; Aluru, Srinivas; Dorman, Karin S

2011-02-15

High-throughput short read sequencing is revolutionizing genomics and systems biology research by enabling cost-effective deep coverage sequencing of genomes and transcriptomes. Error detection and correction are crucial to many short read sequencing applications including de novo genome sequencing, genome resequencing, and digital gene expression analysis. Short read error detection is typically carried out by counting the observed frequencies of kmers in reads and validating those with frequencies exceeding a threshold. In case of genomes with high repeat content, an erroneous kmer may be frequently observed if it has few nucleotide differences with valid kmers with multiple occurrences in the genome. Error detection and correction were mostly applied to genomes with low repeat content and this remains a challenging problem for genomes with high repeat content. We develop a statistical model and a computational method for error detection and correction in the presence of genomic repeats. We propose a method to infer genomic frequencies of kmers from their observed frequencies by analyzing the misread relationships among observed kmers. We also propose a method to estimate the threshold useful for validating kmers whose estimated genomic frequency exceeds the threshold. We demonstrate that superior error detection is achieved using these methods. Furthermore, we break away from the common assumption of uniformly distributed errors within a read, and provide a framework to model position-dependent error occurrence frequencies common to many short read platforms. Lastly, we achieve better error correction in genomes with high repeat content. The software is implemented in C++ and is freely available under GNU GPL3 license and Boost Software V1.0 license at "http://aluru-sun.ece.iastate.edu/doku.php?id=redeem".
We introduce a statistical framework to model sequencing errors in next-generation reads, which led to promising results in detecting and correcting errors for genomes with high repeat content.
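The baseline approach that the abstract above calls typical (count kmers, then validate those whose observed frequency exceeds a threshold) can be sketched as follows. This is only an illustration of that baseline, not the authors' repeat-aware model; the function names, the toy reads, and the threshold value are all assumptions made for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (kmer) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def split_by_threshold(counts, threshold):
    """Split kmers into (validated, suspect): validated kmers occur more than `threshold` times."""
    validated = {kmer for kmer, n in counts.items() if n > threshold}
    suspect = set(counts) - validated
    return validated, suspect


# Toy data (hypothetical): three identical reads and one read with a single substitution.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
counts = count_kmers(reads, k=3)
validated, suspect = split_by_threshold(counts, threshold=1)
```

In a repeat-rich genome an erroneous kmer can pass such a fixed cutoff simply because it sits one substitution away from a high-copy valid kmer, which is the failure mode the abstract sets out to address.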
Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls.

PubMed

Mangericao, Tatiana C; Peng, Zhanhao; Zhang, Xuegong

2016-01-11

CRISPR has become a hot topic as a powerful technique for genome editing in humans and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied for studying the gut microbiomes. Most efforts in metagenome studies have focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of microbiomes themselves, especially their CRISPR composition. We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set from Chinese type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made for the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel ones. The computational analysis of CRISPR composition based on contigs of metagenome sequencing data is feasible.
It provides an efficient approach for finding potential novel CRISPR arrays and for analysing the ecosystem and history of human microbiomes.
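As a rough illustration of the repeat/spacer bookkeeping described above, the sketch below pulls spacers out of a cassette when the repeat sequence is already known, and separates repeats with no database match from matched ones. It is not the PILER-CR algorithm and does not query CRISPRdb; the helper names and the toy sequences are hypothetical, and exact string matching stands in for the tolerant matching a real tool would need.

```python
def extract_spacers(cassette, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    parts = cassette.split(repeat)
    # parts[0] is the leader before the first repeat, parts[-1] the trailer after the last.
    return [p for p in parts[1:-1] if p]


def novel_repeats(found, known):
    """Repeats that have no exact match in a set of known repeats."""
    return sorted(set(found) - set(known))


# Toy cassette (hypothetical): leader + repeat + spacer + repeat + spacer + repeat + trailer.
repeat = "GTTTTAGAGC"
cassette = "AAAC" + repeat + "ACGTACGTAA" + repeat + "TTGACCAGTC" + repeat + "GG"
spacers = extract_spacers(cassette, repeat)

found = ["GTTTTAGAGC", "CCGGAATTCC"]
known = ["GTTTTAGAGC"]
unmatched = novel_repeats(found, known)
```

The counts reported in the abstract follow the same set logic: 1302 unique repeats minus 784 with database matches leaves 518 candidates.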
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention had been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes B. rapa ssp. chinensis. PMID:27507974
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

PubMed

Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention had been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes B. rapa ssp. chinensis.
Molecular characterization of long direct repeat (LDR) sequences expressing a stable mRNA encoding for a 35-amino-acid cell-killing peptide and a cis-encoded small antisense RNA in Escherichia coli.

PubMed

Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada

2002-07-01

Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.
Evolutional dynamics of 45S and 5S ribosomal DNA in ancient allohexaploid Atropa belladonna.

PubMed

Volkov, Roman A; Panchuk, Irina I; Borisjuk, Nikolai V; Hosiawa-Baranska, Marta; Maluszynska, Jolanta; Hemleben, Vera

2017-01-23

Polyploid hybrids represent a rich natural resource to study molecular evolution of plant genes and genomes. Here, we applied a combination of karyological and molecular methods to investigate chromosomal structure, molecular organization and evolution of ribosomal DNA (rDNA) in nightshade, Atropa belladonna (fam. Solanaceae), one of the oldest known allohexaploids among flowering plants. Because of their abundance and specific molecular organization (evolutionarily conserved coding regions linked to variable intergenic spacers, IGS), 45S and 5S rDNA are widely used in plant taxonomic and evolutionary studies. Molecular cloning and nucleotide sequencing of A. belladonna 45S rDNA repeats revealed a general structure characteristic of other Solanaceae species, and a very high sequence similarity of two length variants, with the only difference in number of short IGS subrepeats. These results combined with the detection of three pairs of 45S rDNA loci on separate chromosomes, presumably inherited from both tetraploid and diploid ancestor species, exemplify intensive sequence homogenization that led to substitution/elimination of rDNA repeats of one parent. Chromosome silver-staining revealed that only four out of six 45S rDNA sites are frequently transcriptionally active, demonstrating nucleolar dominance. For 5S rDNA, three size variants of repeats were detected, with the major class represented by repeats containing all functional IGS elements required for transcription, the intermediate size repeats containing partially deleted IGS sequences, and the short 5S repeats containing severe defects both in the IGS and coding sequences. While shorter variants demonstrate increased rate of base substitution, probably in their transition into pseudogenes, the functional 5S rDNA variants are nearly identical at the sequence level, pointing to their origin from a single parental species.
Localization of the 5S rDNA genes on two chromosome pairs further supports uniparental inheritance from the tetraploid progenitor. The obtained molecular, cytogenetic and phylogenetic data demonstrate complex evolutionary dynamics of rDNA loci in allohexaploid species of Atropa belladonna. The high level of sequence unification revealed in 45S and 5S rDNA loci of this ancient hybrid species have been seemingly achieved by different molecular mechanisms.
Identification of presumed ancestral DNA sequences of phaseolin in Phaseolus vulgaris.

PubMed Central

Kami, J; Velásquez, V B; Debouck, D G; Gepts, P

1995-01-01

Common bean (Phaseolus vulgaris) consists of two major geographic gene pools, one distributed in Mexico, Central America, and Colombia and the other in the southern Andes (southern Peru, Bolivia, and Argentina). Amplification and sequencing of members of the multigene family coding for phaseolin, the major seed storage protein of the common bean, provide evidence for accumulation of tandem direct repeats in both introns and exons during evolution of the multigene family in this species. The presumed ancestral phaseolin sequences, without tandem repeats, were found in recently discovered but nearly extinct wild common bean populations of Ecuador and northern Peru that are intermediate between the two major gene pools of the species based on geographical and molecular arguments. Our results illustrate the usefulness of tandem direct repeats in establishing the polarity of DNA sequence divergence and therefore in proposing phylogenies. PMID:7862642
Functional centromeres in Astragalus sinicus include a compact centromere-specific histone H3 and a 20-bp tandem repeat.

PubMed

Tek, Ahmet L; Kashihara, Kazunari; Murata, Minoru; Nagaki, Kiyotaka

2011-11-01

The centromere plays an essential role for proper chromosome segregation during cell division and usually harbors long arrays of tandem repeated satellite DNA sequences. Although this function is conserved among eukaryotes, the sequences of centromeric DNA repeats are variable. Most of our understanding of functional centromeres, which are defined by localization of a centromere-specific histone H3 (CENH3) protein, comes from model organisms. The components of the functional centromere in legumes are poorly known. The genus Astragalus is a member of the legumes and bears the largest numbers of species among angiosperms. Therefore, we studied the components of centromeres in Astragalus sinicus. We identified the CenH3 homolog of A. sinicus, AsCenH3 that is the most compact in size among higher eukaryotes. A CENH3-based assay revealed the functional centromeric DNA sequences from A. sinicus, called CentAs. The CentAs repeat is localized in A. sinicus centromeres, and comprises an AT-rich tandem repeat with a monomer size of 20 nucleotides.
Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

1993-04-01

A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods as well as by screening of enriched marker-selected libraries led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
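The PIC (polymorphism information content) values quoted above are computed from allele frequencies. The abstract does not say which formula was used; the sketch below assumes the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the allele frequencies in the example are invented for illustration.

```python
def pic(freqs):
    """Polymorphism information content from a list of allele frequencies summing to 1."""
    homozygosity = sum(p * p for p in freqs)
    # Correction term: pairs of distinct alleles (i < j), each contributing 2 * p_i^2 * p_j^2.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross


# Hypothetical markers: two equally frequent alleles versus four equally frequent alleles.
pic_two = pic([0.5, 0.5])
pic_four = pic([0.25, 0.25, 0.25, 0.25])
```

Under this definition a marker with two equally frequent alleles cannot exceed 0.375, so a PIC above 0.70 implies at least four reasonably balanced alleles, which fits the reported link between longer (CA)n repeats and higher PIC.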
Horseradish peroxidase-labeled oligonucleotides and fluorescent tyramides for rapid detection of chromosome-specific repeat sequences.

PubMed

van Gijlswijk, R P; Wiegant, J; Vervenne, R; Lasan, R; Tanke, H J; Raap, A K

1996-01-01

We present a sensitive and rapid fluorescence in situ hybridization (FISH) strategy for detecting chromosome-specific repeat sequences. It uses horseradish peroxidase (HRP)-labeled oligonucleotide sequences in combination with fluorescent tyramide-based detection. After in situ hybridization, the HRP conjugated to the oligonucleotide probe is used to deposit fluorescently labeled tyramide molecules at the site of hybridization. The method features full chemical synthesis of probes, strong FISH signals, and short processing periods, as well as multicolor capabilities.
Cultivar identification, pedigree verification, and diversity analysis among Peach (Prunus persica L. Batsch) Cultivars based on Simple Sequence Repeat markers

USDA-ARS's Scientific Manuscript database

The genetic relationships and pedigree inferences among peach (Prunus persica (L.) Batsch) accessions and breeding lines used in genetic improvement were evaluated using 15 simple sequence repeat (SSR) markers. A total of 80 alleles were detected among the 37 peach accessions with an average of 5.53...
THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

EPA Science Inventory

We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...
Cross-species transferability and mapping of genomic and cDNA SSRs in pines

Treesearch

D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion

2004-01-01

Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri-, and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, from which only 12.8% overlapped between the two sets. The positions of the SSRs within the coding sequence were predicted...
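Screening sequences for di-, tri-, and tetranucleotide repeats, as described above, can be approximated with back-referencing regular expressions. The sketch below is not the SSRIT script; the function name and the minimum repeat counts (6, 5, and 4 units) are assumptions chosen for the example, and only perfect repeats on the given strand are reported.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, unit_count) for perfect di-, tri-, and tetranucleotide repeats."""
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}  # assumed thresholds: unit length -> minimum copies
    hits = []
    for unit_len, min_n in sorted(min_repeats.items()):
        # Group 2 captures one unit; \2{n,} requires at least n further copies of it.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. AA, ATAT).
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((m.start(), motif, len(m.group(1)) // unit_len))
    return hits


# Toy sequence (hypothetical): an (AG)7 tract and an (AAG)5 tract with short flanks.
seq = "GGC" + "AG" * 7 + "TTC" + "AAG" * 5 + "C"
ssrs = find_ssrs(seq)
```

Checking each motif against its own doubled string keeps a dinucleotide tract such as (AT)8 from also being reported as an (ATAT)4 tetranucleotide repeat.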
An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

USDA-ARS's Scientific Manuscript database

Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
A Repeat Look at Repeating Patterns

ERIC Educational Resources Information Center

Markworth, Kimberly A.

2016-01-01

A "repeating pattern" is a cyclical repetition of an identifiable core. Children in the primary grades usually begin pattern work with fairly simple patterns, such as AB, ABC, or ABB patterns. The unique letters represent unique elements, whereas the sequence of letters represents the core that is repeated. Based on color, shape,…
Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.

PubMed

Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang

2016-01-01

Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.
PaASK1, a mitogen-activated protein kinase kinase kinase that controls cell degeneration and cell differentiation in Podospora anserina.

PubMed Central

Kicka, Sébastien; Silar, Philippe

2004-01-01

MAPKKK are kinases involved in cell signaling. In fungi, these kinases are known to regulate development, pathogenicity, and the sensing of external conditions. We show here that Podospora anserina strains mutated in PaASK1, a MAPKKK of the MEK family, are impaired in the development of crippled growth, a cell degeneration process caused by C, a nonconventional infectious element. They also display defects in mycelium pigmentation, differentiation of aerial hyphae, and making of fruiting bodies, three hallmarks of cell differentiation during stationary phase in P. anserina. Overexpression of PaASK1 results in exacerbation of crippled growth. PaASK1 is a large protein of 1832 amino acids with several domains, including a region rich in proline and a 60-amino-acid-long polyglutamine stretch. Deletion analysis reveals that the polyglutamine stretch is dispensable for PaASK1 activity, whereas the region that contains the prolines is essential but insufficient to promote full activity. We discuss a model based on the hysteresis of a signal transduction cascade to account for the role of PaASK1 in both cell degeneration and stationary-phase cell differentiation. PMID:15082544

PaASK1, a mitogen-activated protein kinase kinase kinase that controls cell degeneration and cell differentiation in Podospora anserina.

PubMed

Kicka, Sébastien; Silar, Philippe

2004-03-01

MAPKKK are kinases involved in cell signaling. In fungi, these kinases are known to regulate development, pathogenicity, and the sensing of external conditions. We show here that Podospora anserina strains mutated in PaASK1, a MAPKKK of the MEK family, are impaired in the development of crippled growth, a cell degeneration process caused by C, a nonconventional infectious element. They also display defects in mycelium pigmentation, differentiation of aerial hyphae, and making of fruiting bodies, three hallmarks of cell differentiation during stationary phase in P. anserina. Overexpression of PaASK1 results in exacerbation of crippled growth. PaASK1 is a large protein of 1832 amino acids with several domains, including a region rich in proline and a 60-amino-acid-long polyglutamine stretch. Deletion analysis reveals that the polyglutamine stretch is dispensable for PaASK1 activity, whereas the region that contains the prolines is essential but insufficient to promote full activity. We discuss a model based on the hysteresis of a signal transduction cascade to account for the role of PaASK1 in both cell degeneration and stationary-phase cell differentiation.
Characterization of C-terminal adaptors, UFD-2 and UFD-3, of CDC-48 on the polyglutamine aggregation in C. elegans.

PubMed

Murayama, Yuki; Ogura, Teru; Yamanaka, Kunitoshi

2015-03-27

CDC-48 (also called VCP or p97 in mammals and Cdc48p in yeast) is a AAA (ATPases associated with diverse cellular activities) chaperone and participates in a wide range of cellular activities including modulation of protein complexes and protein aggregates. UFD-2 and UFD-3, C-terminal adaptors for CDC-48, reportedly bind to CDC-48 in a mutually exclusive manner and they may modulate the fate of substrates for CDC-48. However, their cellular functions have not yet been elucidated. In this study, we found that CDC-48 preferentially interacts with UFD-3 in Caenorhabditis elegans. We also found that the number of polyglutamine (polyQ) aggregates was reduced in the ufd-3 deletion mutant but not in the ufd-2 deletion mutant. Furthermore, the lifespan and motility of the ufd-3 deletion mutant, where polyQ40::GFP was expressed, were greatly decreased. Taken together, we propose that UFD-3 may promote the formation of polyQ aggregates to reduce the polyQ toxicity in C. elegans.
A genetic modifier suggests that endurance exercise exacerbates Huntington's disease

PubMed Central

Corrochano, Silvia; Blanco, Gonzalo; Williams, Debbie; Wettstein, Jessica; Simon, Michelle; Kumar, Saumya; Moir, Lee; Agnew, Thomas; Stewart, Michelle; Landman, Allison; Kotiadis, Vassilios N; Duchen, Michael R; Wackerhage, Henning; Rubinsztein, David C; Brown, Steve D M

2018-01-01

Abstract Polyglutamine expansions in the huntingtin gene cause Huntington’s disease (HD). Huntingtin is ubiquitously expressed, leading to pathological alterations also in peripheral organs. Variations in the length of the polyglutamine tract explain up to 70% of the age-at-onset variance, with the rest of the variance attributed to genetic and environmental modifiers. To identify novel disease modifiers, we performed an unbiased mutagenesis screen on an HD mouse model, identifying a mutation in the skeletal muscle voltage-gated sodium channel (Scn4a, termed ‘draggen’ mutation) as a novel disease enhancer. Double mutant mice (HD; Scn4aDgn/+) had decreased survival, weight loss and muscle atrophy. Expression patterns show that the main tissue affected is skeletal muscle. Intriguingly, muscles from HD; Scn4aDgn/+ mice showed adaptive changes similar to those found in endurance exercise, including AMPK activation, fibre type switching and upregulation of mitochondrial biogenesis. Therefore, we evaluated the effects of endurance training on HD mice. Crucially, this training regime also led to detrimental effects on HD mice. Overall, these results reveal a novel role for skeletal muscle in modulating systemic HD pathogenesis, suggesting that some forms of physical exercise could be deleterious in neurodegeneration. PMID:29509900
Aggregation of polyglutamine-expanded ataxin-3 sequesters its specific interacting partners into inclusions: Implication in a loss-of-function pathology

PubMed Central

Yang, Hui; Li, Jing-Jing; Liu, Shuai; Zhao, Jian; Jiang, Ya-Jun; Song, Ai-Xin; Hu, Hong-Yu

2014-01-01

Expansion of the polyglutamine (polyQ) tract may cause protein misfolding and aggregation that lead to cytotoxicity and neurodegeneration, but the underlying mechanism remains to be elucidated. We applied ataxin-3 (Atx3), a polyQ tract-containing protein, as a model to study sequestration of normal cellular proteins. We found that the aggregates formed by polyQ-expanded Atx3 sequester its interacting partners, such as P97/VCP and ubiquitin conjugates, into the protein inclusions through specific interactions both in vitro and in cells. Moreover, this specific sequestration impairs the normal cellular function of P97 in down-regulating neddylation. However, expansion of the polyQ tract in Atx3 does not alter the conformation of its surrounding regions or the interaction affinities with the interacting partners, although it does facilitate misfolding and aggregation of the Atx3 protein. Thus, we propose a loss-of-function pathology for polyQ diseases in which sequestration of essential cellular proteins into inclusions by the polyQ aggregates, via specific interactions, causes dysfunction of the corresponding proteins and consequently leads to neurodegeneration. PMID:25231079
Huntingtin-interacting protein 1 influences worm and mouse presynaptic function and protects Caenorhabditis elegans neurons against mutant polyglutamine toxicity.

PubMed

Parker, J Alex; Metzler, Martina; Georgiou, John; Mage, Marilyne; Roder, John C; Rose, Ann M; Hayden, Michael R; Néri, Christian

2007-10-10

Huntingtin-interacting protein 1 (HIP1) was identified through its interaction with htt (huntingtin), the Huntington's disease (HD) protein. HIP1 is an endocytic protein that influences transport and function of AMPA and NMDA receptors in the brain. However, little is known about its contribution to neuronal dysfunction in HD. We report that the Caenorhabditis elegans HIP1 homolog hipr-1 modulates presynaptic activity and the abundance of synaptobrevin, a protein involved in synaptic vesicle fusion. Presynaptic function was also altered in hippocampal brain slices of HIP1-/- mice demonstrating delayed recovery from synaptic depression and a reduction in paired-pulse facilitation, a form of presynaptic plasticity. Interestingly, neuronal dysfunction in transgenic nematodes expressing mutant N-terminal huntingtin was specifically enhanced by hipr-1 loss of function. A similar effect was observed with several other mutant proteins that are expressed at the synapse and involved in endocytosis, such as unc-11/AP180, unc-26/synaptojanin, and unc-57/endophilin. Thus, HIP1 is involved in presynaptic nerve terminal activity and modulation of mutant polyglutamine-induced neuronal dysfunction. Moreover, synaptic proteins involved in endocytosis may protect neurons against amino acid homopolymer expansion.
[Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].

PubMed

Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou

2002-01-01

To select polymorphic short tandem repeat markers within X-linked nuclear protein (XNP) gene, genomic clones which contain XNP gene were recognized by homologous analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from non-exonic sequences by using BCM search Launcher. Polymorphisms of the short tandem repeats in Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified from XNP gene, two of which were polymorphic. Four and 11 alleles were observed in Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 localized within 3' end and intron 10, respectively. Two polymorphic short tandem repeats have been identified within XNP gene and will be useful for linkage analysis and gene diagnosis of XNP gene.
Use of the LUS in sequence allele designations to facilitate probabilistic genotyping of NGS-based STR typing results.

PubMed

Just, Rebecca S; Irwin, Jodi A

2018-05-01

Some of the expected advantages of next generation sequencing (NGS) for short tandem repeat (STR) typing include enhanced mixture detection and genotype resolution via sequence variation among non-homologous alleles of the same length. However, at the same time that NGS methods for forensic DNA typing have advanced in recent years, many caseworking laboratories have implemented or are transitioning to probabilistic genotyping to assist the interpretation of complex autosomal STR typing results. Current probabilistic software programs are designed for length-based data, and were not intended to accommodate sequence strings as the product input. Yet to leverage the benefits of NGS for enhanced genotyping and mixture deconvolution, the sequence variation among same-length products must be utilized in some form. Here, we propose use of the longest uninterrupted stretch (LUS) in allele designations as a simple method to represent sequence variation within the STR repeat regions and facilitate, in the near term, probabilistic interpretation of NGS-based typing results. An examination of published population data indicated that a reference LUS region is straightforward to define for most autosomal STR loci, and that using repeat unit plus LUS length as the allele designator can represent greater than 80% of the alleles detected by sequencing. A proof of concept study performed using freely available probabilistic software demonstrated that the LUS length can be used in allele designations when a program does not require alleles to be integers, and that utilizing sequence information improves interpretation of both single-source and mixed contributor STR typing results as compared to using repeat unit information alone. 
The LUS concept for allele designation maintains the repeat-based allele nomenclature that will permit backward compatibility to extant STR databases, and the LUS lengths themselves will be concordant regardless of the NGS assay or analysis tools employed. Further, these biologically based, easy-to-derive designations uphold clear relationships between parent alleles and their stutter products, enabling analysis in fully continuous probabilistic programs that model stutter while avoiding the algorithmic complexities that come with string based searches. Though using repeat unit plus LUS length as the allele designator does not capture variation that occurs outside of the core repeat regions, this straightforward approach would permit the large majority of known STR sequence variation to be used for mixture deconvolution and, in turn, result in more informative mixture statistics in the near term. Ultimately, the method could bridge the gap from current length-based probabilistic systems to facilitate broader adoption of NGS by forensic DNA testing laboratories. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
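
The LUS idea described above can be illustrated with a minimal Python sketch. It assumes a perfect repeat motif and whole repeat units only; the function names and the "units_LUS" string format are hypothetical choices made for illustration and are not taken from the paper or from any genotyping software.

```python
import re


def lus_length(repeat_region: str, motif: str) -> int:
    """Count motif copies in the longest uninterrupted run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", repeat_region.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def designation(repeat_region: str, motif: str) -> str:
    """Combine a length-based allele number with the LUS length.

    Simplification (assumption): the length-based allele is the number of
    whole motif-sized units in the region; partial repeats are ignored.
    """
    units = len(repeat_region) // len(motif)
    return f"{units}_{lus_length(repeat_region, motif)}"


# Two synthetic alleles of identical length (10 units of 4 bases each).
uninterrupted = "TCTA" * 10
interrupted = "TCTA" * 4 + "TCTG" + "TCTA" * 5

print(designation(uninterrupted, "TCTA"))  # 10_10
print(designation(interrupted, "TCTA"))    # 10_5
```

The two synthetic alleles are indistinguishable by length alone but receive different designations once the LUS length is appended, which is the property the abstract relies on for mixture deconvolution.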
Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

PubMed Central

Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

2013-01-01

Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
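
A genome-wide SSR scan of the kind summarized above can be sketched with regular expressions. This is a minimal illustration, not the authors' pipeline: the minimum repeat counts in `MIN_REPEATS` are assumed thresholds (the abstract does not state the ones used), only perfect repeats are detected, and all names are hypothetical.

```python
import re

# Assumed minimum copy numbers per motif length (1 = mono ... 6 = hexa).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq: str):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are just a shorter motif repeated
            # (e.g. "AA" or "ATAT"), so each run is reported once.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs: int, sequence_bp: int) -> float:
    """SSR count normalized to one megabase of sequence."""
    return n_ssrs / (sequence_bp / 1_000_000)


toy = "GGC" + "A" * 12 + "CGC" + "AT" * 7 + "GCC" + "CAG" * 6 + "TT"
print(find_ssrs(toy))  # [(3, 'A', 12), (18, 'AT', 7), (35, 'CAG', 6)]
```

As a consistency check on the figures in the abstract, 188,149 SSRs at 794 SSR/Mb corresponds to roughly 237 Mb of scanned sequence (188,149 / 794 ≈ 237).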
Sequence of retrovirus provirus resembles that of bacterial transposable elements

NASA Astrophysics Data System (ADS)

Shimotohno, Kunitada; Mizutani, Satoshi; Temin, Howard M.

1980-06-01

The nucleotide sequences of the terminal regions of an infectious integrated retrovirus cloned in the modified λ phage cloning vector Charon 4A have been elucidated. There is a 569-base pair direct repeat at both ends of the viral DNA. The cell-virus junctions at each end consist of a 5-base pair direct repeat of cell DNA next to a 3-base pair inverted repeat of viral DNA. This structure resembles that of a transposable element and is consistent with the protovirus hypothesis that retroviruses evolved from the cell genome.
Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus.

PubMed

Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji

2012-12-01

In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Human telomeres that contain (CTAGGG)n repeats show replication dependent instability in somatic cells and the male germline

PubMed Central

Mendez-Bermudez, Aaron; Hills, Mark; Pickett, Hilda A.; Phan, Anh Tuân; Mergny, Jean-Louis; Riou, Jean-François; Royle, Nicola J.

2009-01-01

A number of different processes that impact on telomere length dynamics have been identified but factors that affect the turnover of repeats located proximally within the telomeric DNA are poorly defined. We have identified a particular repeat type (CTAGGG) that is associated with an extraordinarily high mutation rate (20% per gamete) in the male germline. The mutation rate is affected by the length and sequence homogeneity of the (CTAGGG)n array. This level of instability was not seen with other sequence-variant repeats, including the TCAGGG repeat type that has the same composition. Telomeres carrying a (CTAGGG)n array are also highly unstable in somatic cells with the mutation process resulting in small gains or losses of repeats that also occasionally result in the deletion of the whole (CTAGGG)n array. These sequences are prone to quadruplex formation in vitro but adopt a different topology from (TTAGGG)n (see accompanying article). Interestingly, short (CTAGGG)2 oligonucleotides induce a DNA damage response (γH2AX foci) as efficiently as (TTAGGG)2 oligos in normal fibroblast cells, suggesting they recruit POT1 from the telomere. Moreover, in vitro assays show that (CTAGGG)n repeats bind POT1 more efficiently than (TTAGGG)n or (TCAGGG)n. We estimate that 7% of human telomeres contain (CTAGGG)n repeats and when present, they create additional problems that probably arise during telomere replication. PMID:19656953
TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

PubMed

Richard, François D; Kajava, Andrey V

2014-06-01

The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
Repeats of base oligomers as the primordial coding sequences of the primeval earth and their vestiges in modern genes.

PubMed

Ohno, S

1984-01-01

Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either alpha-helical or beta-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the down-stream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
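
The third property can be checked directly with a short Python sketch. It uses the standard genetic code; the repeat unit "CAGGT" is an arbitrary example chosen here (it is not from the paper) because it contains no stop codon in any frame. A unit whose length is a multiple of 3, "CAG", is included for contrast.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate `dna` starting at offset `frame`, ignoring a trailing partial codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def is_rotation(a: str, b: str) -> bool:
    """True if `a` is a cyclic shift of `b`."""
    return len(a) == len(b) and a in b + b


# Unit length 5 (not a multiple of 3): every frame yields the same
# periodic peptide, merely phase-shifted.
unit = "CAGGT"
periods = [translate(unit * 9, f)[:len(unit)] for f in range(3)]
print(periods)  # ['QVRSG', 'RSGQV', 'GQVRS']

# Unit length 3: each frame encodes a different homopolymer.
print([translate("CAG" * 9, f)[:3] for f in range(3)])  # ['QQQ', 'SSS', 'AAA']
```

Because the three frames of the 5-base repeat are rotations of one another, a frameshift changes only the phase of the downstream peptide, which is the robustness the abstract describes. The CAG case shows why this does not hold for trinucleotide repeats: the three frames give polyglutamine, polyserine and polyalanine.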
Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

PubMed Central

2011-01-01

Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii-the diploid source of the wheat D genome, and with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. 
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
REPPER—repeats and their periodicities in fibrous proteins

PubMed Central

Gruber, Markus; Söding, Johannes; Lupas, Andrei N.

2005-01-01

REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins. REPPER is available at . PMID:15980460
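
The Fourier-based idea behind FTwin can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: it uses the Kyte-Doolittle hydropathy scale as the numerical property (the scales actually offered by the server are not listed in the abstract), it analyzes the whole sequence rather than a sliding window, and the function names are hypothetical.

```python
import cmath

# Kyte-Doolittle hydropathy values (assumed choice of property scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def power_spectrum(seq: str) -> dict:
    """Map each period (in residues) to the spectral power of the hydropathy profile."""
    values = [KD[aa] for aa in seq.upper()]
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]  # remove the zero-frequency component
    n = len(centered)
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, v in enumerate(centered))
        spectrum[n / k] = abs(coeff) ** 2 / n
    return spectrum


def dominant_period(seq: str) -> float:
    """Period with the highest spectral power."""
    spectrum = power_spectrum(seq)
    return max(spectrum, key=spectrum.get)


# Synthetic sequence alternating two hydrophobic and two charged residues.
print(dominant_period("LLEE" * 10))  # 4.0
```

A windowed version would apply `power_spectrum` to successive subsequences so that regions with different periodicities in the same protein are reported separately, as the abstract describes for both FTwin and REPwin.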
An examination of the origin and evolution of additional tandem repeats in the mitochondrial DNA control region of Japanese sika deer (Cervus nippon).

PubMed

Ba, Hengxing; Wu, Lang; Liu, Zongyue; Li, Chunyi

2016-01-01

Tandem repeat units are only detected in the left domain of the mitochondrial DNA control region in sika deer. Previous studies showed that Japanese sika deer have more tandem repeat units than its cousins from the Asian continent and Taiwan, which often have only three repeat units. To determine the origin and evolution of these additional repeat units in Japanese sika deer, we obtained the sequence of repeat units from an expanded dataset of the control region from all sika deer lineages. The functional constraint is inferred to act on the first repeat unit because this repeat has the least sequence divergence in comparison to the other units. Based on slipped-strand mispairing mechanisms, the illegitimate elongation model could account for the addition or deletion of these additional repeat units in the Japanese sika deer population. We also report that these additional repeat units could be occurring in the internal positions of tandem repeat regions, possibly via coupling with a homogenization mechanism within and among these lineages. Moreover, the increased number of repeat units in the Japanese sika deer population could reflect a balance between mutation and selection, as well as genetic drift.
Myotonin protein-kinase [AGC]n trinucleotide repeat in seven nonhuman primates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Novelli, G.; Sineo, L.; Pontieri, E.

Myotonic dystrophy (DM) is due to a genomic instability of a trinucleotide [AGC]n motif, located in the 3′ UTR of a protein-kinase gene (myotonin protein kinase, MT-PK). The [AGC] repeat is meiotically and mitotically unstable, and it is directly related to the manifestations of the disorder. Although a gene dosage effect of the MT-PK has been demonstrated in DM muscle, the mechanism(s) by which the intragenic repeat expansion leads to disease is largely unknown. This non-standard mutational event could reflect an evolutionary mechanism widespread among animal genomes. We have isolated and sequenced the complete 3′ UTR of the MT-PK gene in seven primates (macaque, orangutan, gorilla, chimpanzee, gibbon, owl monkey, saimiri), and examined by comparative nucleotide sequence analysis the [AGC]n intragenic repeat and the surrounding nucleotides. The genomic organization, including the [AGC]n repeat structure, was conserved in all examined species, excluding the gibbon (Hylobates agilis), in which the [AGC]n upstream sequence (GGAA) is replaced by a GA dinucleotide. The number of [AGC]n repeats in the examined species ranged between 7 (gorilla) and 13 (owl monkeys), with a polymorphism information content (PIC) similar to that observed in humans. These results indicate that the 3′ UTR [AGC] repeat within the MT-PK gene is evolutionarily conserved, supporting the idea that this region has important regulatory functions.
The complete chloroplast genome sequence of Cephalotaxus oliveri (Cephalotaxaceae): evolutionary comparison of cephalotaxus chloroplast DNAs and insights into the loss of inverted repeat copies in gymnosperms.

PubMed

Yi, Xuan; Gao, Lei; Wang, Bo; Su, Ying-Juan; Wang, Ting

2013-01-01

We have determined the complete chloroplast (cp) genome sequence of Cephalotaxus oliveri. The genome is 134,337 bp in length, encodes 113 genes, and lacks inverted repeat (IR) regions. Genome-wide mutational dynamics have been investigated through comparative analysis of the cp genomes of C. oliveri and C. wilsoniana. Gene order transformation analyses indicate that when distinct isomers are considered as alternative structures for the ancestral cp genome of cupressophyte and Pinaceae lineages, it is not possible to distinguish between hypotheses favoring retention of the same IR region in cupressophyte and Pinaceae cp genomes from a hypothesis proposing independent loss of IRA and IRB. Furthermore, in cupressophyte cp genomes, the highly reduced IRs are replaced by short repeats that have the potential to mediate homologous recombination, analogous to the situation in Pinaceae. The importance of repeats in the mutational dynamics of cupressophyte cp genomes is also illustrated by the accD reading frame, which has undergone extreme length expansion in cupressophytes. This has been caused by a large insertion comprising multiple repeat sequences. Overall, we find that the distribution of repeats, indels, and substitutions is significantly correlated in Cephalotaxus cp genomes, consistent with a hypothesis that repeats play a role in inducing substitutions and indels in conifer cp genomes.
Identification, variation and transcription of pneumococcal repeat sequences

PubMed Central

2011-01-01

Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

PubMed

Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

2016-12-01

In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). The 5S rDNA repeats were initially considered to have evolved by concerted evolution. However, some recent reports, including reports on teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death model. To study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear species-specific clustering. A stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved by a birth-and-death process.

Drastic stability change of X-X mismatch in d(CXG) trinucleotide repeat disorders under molecular crowding condition.

PubMed

Teng, Ye; Pramanik, Smritimoy; Tateishi-Karimata, Hisae; Ohyama, Tatsuya; Sugimoto, Naoki

2018-02-05

The trinucleotide repeat d(CXG) (X = A, C, G or T) is the most common sequence causing repeat expansion disorders. The formation of non-canonical structures, such as hairpin structures with X-X mismatches, has been proposed to affect gene expression and regulation, which are important in pathological studies of these devastating neurological diseases. However, little information is available regarding the thermodynamics of the repeat sequence under crowded cellular conditions where many non-canonical structures such as G-quadruplexes are highly stabilized, while duplexes are destabilised. In this study, we investigated the different stabilities of X-X mismatches in the context of internal d(CXG) self-complementary sequences in an environment with a high concentration of cosolutes to mimic the crowding conditions in cells. The stabilities of full-matched duplexes and duplexes with A-A, G-G, and T-T mismatched base pairs under molecular crowding conditions were notably decreased compared to under dilute conditions. However, the stability of the DNA duplex with a C-C mismatch base pair was only slightly destabilised. Investigating different stabilities of X-X mismatches in d(CXG) sequences is important for improving our understanding of the formation and transition of multiple non-canonical structures in trinucleotide repeat diseases, and may provide insights for pathological studies and drug development. Copyright © 2018 Elsevier Inc. All rights reserved.
Effects of GABA(A) Modulators on the Repeated Acquisition of Response Sequences in Squirrel Monkeys

ERIC Educational Resources Information Center

Campbell, Una C.; Winsauer, Peter J.; Stevenson, Michael W.; Moerschbaecher, Joseph M.

2004-01-01

The present study investigated the effects of positive and negative GABA(A) modulators under three different baselines of repeated acquisition in squirrel monkeys in which the monkeys acquired a three-response sequence on three keys under a second-order fixed-ratio (FR) schedule of food reinforcement. In two of these baselines, the…
Short intronic repeat sequences facilitate circular RNA production.

PubMed

Liang, Dongming; Wilusz, Jeremy E

2014-10-15

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼ 30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. © 2014 Liang and Wilusz; Published by Cold Spring Harbor Laboratory Press.
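
The requirement that the flanking intronic repeats base-pair with one another can be illustrated by searching for the longest stretch in the upstream intron that is the reverse complement of a stretch in the downstream intron. The sketch below is a simplification under stated assumptions: it works on DNA letters, considers only perfect Watson-Crick complementarity (no G-U wobble pairs, no mismatches or bulges), and uses made-up sequences; the function names are hypothetical.

```python
from difflib import SequenceMatcher

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_inverted_repeat(upstream: str, downstream: str) -> str:
    """Longest upstream segment that can pair perfectly with the downstream intron.

    A segment pairs with the downstream intron when it also occurs in the
    reverse complement of that intron, so this reduces to a longest common
    substring search.
    """
    up = upstream.upper()
    rc_down = reverse_complement(downstream)
    matcher = SequenceMatcher(None, up, rc_down, autojunk=False)
    m = matcher.find_longest_match(0, len(up), 0, len(rc_down))
    return up[m.a:m.a + m.size]


# Made-up 30-nucleotide arm placed in opposite orientations around an exon.
arm = "GGCCGGGCGCGGTGGCTCACGCCTGTAATC"
upstream_intron = "TTTT" + arm + "AAAA"
downstream_intron = "CACA" + reverse_complement(arm) + "GAGA"

hit = longest_inverted_repeat(upstream_intron, downstream_intron)
print(len(hit), hit == arm)  # 30 True
```

As the abstract notes, complementarity alone does not predict circularization, so a search like this identifies candidate repeat pairs rather than exons that will actually circularize.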
Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

PubMed

Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

1993-02-01

An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls such as cattle, sheep and goats. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all 10 investigated species of the family Bovidae and from giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used, with which sequences differing in as little as a single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions also serve for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.
High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events

PubMed Central

Wolfgruber, Thomas K.; Nakashima, Megan M.; Schneider, Kevin L.; Sharma, Anupma; Xie, Zidian; Albert, Patrice S.; Xu, Ronghui; Bilinski, Paul; Dawe, R. Kelly; Ross-Ibarra, Jeffrey; Birchler, James A.; Presting, Gernot G.

2016-01-01

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres. PMID:27047500
Rapid and highly efficient construction of TALE-based transcriptional regulators and nucleases for genome modification.

PubMed

Li, Lixin; Piatek, Marek J; Atef, Ahmed; Piatek, Agnieszka; Wibowo, Anjar; Fang, Xiaoyun; Sabir, J S M; Zhu, Jian-Kang; Mahfouz, Magdy M

2012-03-01

Transcription activator-like effectors (TALEs) can be used as DNA-targeting modules by engineering their repeat domains to dictate user-selected sequence specificity. TALEs have been shown to function as site-specific transcriptional activators in a variety of cell types and organisms. TALE nucleases (TALENs), generated by fusing the FokI cleavage domain to TALE, have been used to create genomic double-strand breaks. The identity of the TALE repeat variable di-residues, their number, and their order dictate the DNA sequence specificity. Because TALE repeats are nearly identical, their assembly by cloning or even by synthesis is challenging and time consuming. Here, we report the development and use of a rapid and straightforward approach for the construction of designer TALE (dTALE) activators and nucleases with user-selected DNA target specificity. Using our plasmid set of 100 repeat modules, researchers can assemble repeat domains for any 14-nucleotide target sequence in one sequential restriction-ligation cloning step and in only 24 h. We generated several custom dTALEs and dTALENs with new target sequence specificities and validated their function by transient expression in tobacco leaves and in vitro DNA cleavage assays, respectively. Moreover, we developed a web tool, called idTALE, to facilitate the design of dTALENs and the identification of their genomic targets and potential off-targets in the genomes of several model species. Our dTALE repeat assembly approach along with the web tool idTALE will expedite genome-engineering applications in a variety of cell types and organisms including plants.
Inter-plate aseismic slip on the subducting plate boundaries estimated from repeating earthquakes

NASA Astrophysics Data System (ADS)

Igarashi, T.

2015-12-01

Sequences of repeating earthquakes are caused by repeating slips of small patches surrounded by aseismic slip areas at plate boundary zones. Recently, they have been detected in many regions. In this study, I detected repeating earthquakes which occurred in Japan and around the world by using seismograms observed in the Japanese seismic network, and investigated the space-time characteristics of inter-plate aseismic slip on the subducting plate boundaries. To extract repeating earthquakes, I calculated cross-correlation coefficients of band-pass-filtered seismograms at each station following Igarashi [2010]. I used two data sets, one based on the USGS catalog for about 25 years from May 1990 and one based on the JMA catalog for about 13 years from January 2002. As a result, I found many sequences of repeating earthquakes in the subducting plate boundaries of the Andaman-Sumatra-Java and Japan-Kuril-Kamchatka-Aleutian subduction zones. When the scaling relations among seismic moment, recurrence interval and slip proposed by Nadeau and Johnson [1998] are applied, these sequences indicate the space-time changes of inter-plate aseismic slips. The pair of repeating earthquakes with the longest time interval occurred in the Solomon Islands area, with a recurrence interval of about 18.5 years. The estimated slip rate is about 46 mm/year, which corresponds to about half of the relative plate motion in this area. Several sequences with fast slip rates correspond to the post-seismic slips after the 2004 Sumatra-Andaman earthquake (M9.0), the 2006 Kuril earthquake (M8.3), the 2007 southern Sumatra earthquake (M8.5), and the 2011 Tohoku-oki earthquake (M9.0). The database of global repeating earthquakes enables the comparison of the inter-plate aseismic slips of various plate boundary zones of the world. I believe that I am likely to detect more sequences by extending analysis periods in the areas where they were not found in this analysis.
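The waveform-similarity step mentioned in this abstract can be illustrated with a minimal sketch. It computes the peak normalized cross-correlation coefficient between two equal-length waveforms over all lags. The function names and the 0.95 threshold are assumptions made for illustration; the abstract does not state the threshold, filter band, or window length used in the study, and band-pass filtering is assumed to have been applied beforehand.

```python
import math

def max_cross_correlation(a, b):
    """Peak normalized cross-correlation of two equal-length waveforms over all lags."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("waveforms must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    a0 = [x - mean_a for x in a]  # remove the mean before correlating
    b0 = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in a0) * sum(y * y for y in b0))
    if denom == 0.0:
        return 0.0  # a flat trace carries no waveform information
    best = -1.0
    for lag in range(-(n - 1), n):
        s = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                s += a0[i] * b0[j]
        best = max(best, s / denom)
    return best

def is_repeating_pair(a, b, threshold=0.95):
    """Hypothetical criterion: treat two events as a repeating pair above the threshold."""
    return max_cross_correlation(a, b) >= threshold
```

The direct double loop is quadratic in the window length, which is acceptable for a sketch; an FFT-based correlation would be the usual choice for long records.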
Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

USGS Publications Warehouse

Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

1992-01-01

The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1.

PubMed

Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis

2003-11-01

The functionality of direct and inverted repeat sequences inside the cis acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9 bp long direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
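The consensus quoted in this abstract, CCGCACA(C/G)(C/G), maps directly onto a regular expression, so a short sketch can show how such evenly spaced direct repeats might be located in a sequence. The function names and the toy sequence are hypothetical; the sequence is not taken from pJV1, and only the given strand is searched.

```python
import re

# CCGCACA(C/G)(C/G) written as a regular expression
CONSENSUS = re.compile(r"CCGCACA[CG][CG]")

def find_direct_repeats(seq):
    """Return (start, matched 9-mer) for every consensus match on the given strand."""
    return [(m.start(), m.group()) for m in CONSENSUS.finditer(seq.upper())]

def spacings(hits):
    """Distances between successive repeat start positions."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]

# Toy sequence (invented for illustration) with three repeats 14 bp apart
toy = "AAT" + "CCGCACACG" + "TTTTT" + "CCGCACAGG" + "TTTTT" + "CCGCACACC" + "AA"
hits = find_direct_repeats(toy)
print(hits)            # start positions 3, 17 and 31
print(spacings(hits))  # [14, 14]
```

Equal values in the spacing list are one simple way to check the "evenly spaced" property described in the abstract.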
BAC end sequencing of Pacific white shrimp Litopenaeus vannamei: a glimpse into the genome of Penaeid shrimp

NASA Astrophysics Data System (ADS)

Zhao, Cui; Zhang, Xiaojun; Liu, Chengzhang; Huan, Pin; Li, Fuhua; Xiang, Jianhai; Huang, Chao

2012-05-01

Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected 7 812 BAC clones from the BAC library LvHE and sequenced both ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs), including 4 609 paired-end BESs, were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeat patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.
Transposon-like properties of the major, long repetitive sequence family in the genome of Physarum polycephalum

PubMed Central

Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman

1985-01-01

A family of long, highly-repetitive sequences, referred to previously as 'HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the 'HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small 'cruciforms' or 'Z-DNA' structures which might be recognised during the recombination process. PMID:16453652
Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca).

PubMed

Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun

2010-01-01

A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.

PubMed

Kippert, Fred; Gerloff, Dietlind L

2009-09-24

HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.
Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

PubMed Central

Kippert, Fred; Gerloff, Dietlind L.

2009-01-01

Background: HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings: Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. Significance: A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061
Genetic characterization of the UCS and Kex1 loci of Pneumocystis jirovecii.

PubMed

Esteves, F; Tavares, A; Costa, M C; Gaspar, J; Antunes, F; Matos, O

2009-02-01

Nucleotide variation in the Pneumocystis jirovecii upstream conserved sequence (UCS) and kexin-like serine protease (Kex1) loci was studied in pulmonary specimens from Portuguese HIV-positive patients. DNA was extracted and used for specific molecular sequence analysis. The number of UCS tandem repeats detected in 13 successfully sequenced isolates ranged from three (9 isolates, 69%) to four (4 isolates, 31%). A novel tandem repeat pattern and two novel polymorphisms were detected in the UCS region. For the Kex1 gene, the wild-type (24 isolates, 86%) was the most frequent sequence detected among the 28 sequenced isolates. Nevertheless, a nonsynonymous (1 isolate, 3%) and three synonymous (3 isolates, 11%) polymorphisms were detected and are described here for the first time.
APE1 incision activity at abasic sites in tandem repeat sequences.

PubMed

Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M

2014-05-29

Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.
The complete chloroplast genome of Cinnamomum camphora and its comparison with related Lauraceae species.

PubMed

Chen, Caihui; Zheng, Yongjie; Liu, Sian; Zhong, Yongda; Wu, Yanfang; Li, Jiang; Xu, Li-An; Xu, Meng

2017-01-01

Cinnamomum camphora, a member of the Lauraceae family, is a valuable aromatic and timber tree that is indigenous to the south of China and Japan. All parts of Cinnamomum camphora have secretory cells containing different volatile chemical compounds that are utilized as herbal medicines and essential oils. Here, we report the complete sequencing of the chloroplast genome of Cinnamomum camphora using Illumina technology. The chloroplast genome of Cinnamomum camphora is 152,570 bp in length and characterized by a relatively conserved quadripartite structure containing a large single copy region of 93,705 bp, a small single copy region of 19,093 bp and two inverted repeat (IR) regions of 19,886 bp each. Overall, the genome contained 123 coding regions, of which 15 were repeated in the IR regions. An analysis of chloroplast sequence divergence revealed that the small single copy region was highly variable among the different genera in the Lauraceae family. A total of 40 repeat structures and 83 simple sequence repeats were detected in both the coding and non-coding regions. A phylogenetic analysis indicated that Calycanthus is most closely related to Lauraceae, both being members of Laurales, which forms a sister group to Magnoliids. The complete sequence of the chloroplast of Cinnamomum camphora will aid in in-depth taxonomical studies of the Lauraceae family in the future. The genetic sequence information will also have valuable applications for chloroplast genetic engineering.
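The quadripartite layout described in this abstract implies a simple consistency check: the large single copy region, the small single copy region and the two inverted repeat copies should add up to the reported genome length. The sketch below performs that arithmetic on the figures quoted in the abstract; the function and variable names are illustrative only.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC and two copies of the IR."""
    return lsc + ssc + 2 * ir

# Region sizes (bp) as reported in the abstract
LSC_BP = 93_705
SSC_BP = 19_093
IR_BP = 19_886
REPORTED_TOTAL_BP = 152_570

total = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
print(total)                       # 152570
print(total == REPORTED_TOTAL_BP)  # True
```

The match confirms that the 19,886 bp figure refers to each IR copy rather than to both copies combined.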
[Detection of CRISPR and its relationship to drug resistance in Shigella].

PubMed

Wang, Linlin; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Guo, Xiangjiao; Wang, Pengfei; Xi, Yuanlin; Yang, Haiyan

2015-04-04

To detect clustered regularly interspaced short palindromic repeats (CRISPR) in Shigella, and to analyze their relationship to drug resistance. Four pairs of primers were used for the detection of convincing CRISPR structures CRISPR-S2 and CRISPR-S4, and questionable CRISPR structures CRISPR-S1 and CRISPR-S3, in 60 Shigella strains. All primers were designed using sequences in the CRISPR database. CRISPR Finder was used to analyze CRISPR, and susceptibilities of Shigella strains were tested by the agar diffusion method. Furthermore, we analyzed the relationship between drug resistance and CRISPR-S4. The positive rate of convincing CRISPR structures was 95%. The four CRISPR loci formed 12 spectral patterns (A-L), all of which contained convincing CRISPR structures except type K. We found one new repeat and 12 new spacers. The multi-drug resistance rate was 53.33%. We found no significant association between CRISPR-S4 and drug resistance. However, the repeat sequence of CRISPR-S4 in multi- or TE-resistant strains was mainly R4.1 with AC deletions at the 3' end, and the spacer sequences of CRISPR-S4 in multi-drug resistant strains were mainly Sp5.1, Sp6.1 and Sp7. CRISPR was common in Shigella. Variations of repeat sequences and diversity of spacer sequences might be related to drug resistance in Shigella.
The structure of TON1937 from archaeon Thermococcus onnurineus NA1 reveals a eukaryotic HEAT-like architecture.

PubMed

Jeong, Jae-Hee; Kim, Yi-Seul; Rojviriya, Catleya; Cha, Hyung Jin; Ha, Sung-Chul; Kim, Yeon-Gil

2013-10-01

The members of the ARM/HEAT repeat-containing protein superfamily in eukaryotes have been known to mediate protein-protein interactions by using their concave surface. However, little is known about the ARM/HEAT repeat proteins in prokaryotes. Here we report the crystal structure of TON1937, a hypothetical protein from the hyperthermophilic archaeon Thermococcus onnurineus NA1. The structure reveals a crescent-shaped molecule composed of a double layer of α-helices with seven anti-parallel α-helical repeats. A structure-based sequence alignment of the α-helical repeats identified a conserved pattern of hydrophobic or aliphatic residues reminiscent of the consensus sequence of eukaryotic HEAT repeats. The individual repeats of TON1937 also share high structural similarity with the canonical eukaryotic HEAT repeats. In addition, the concave surface of TON1937 is proposed to be its potential binding interface based on this structural comparison and its surface properties. These observations lead us to speculate that the archaeal HEAT-like repeats of TON1937 have evolved to engage in protein-protein interactions in the same manner as eukaryotic HEAT repeats. Copyright © 2013 Elsevier B.V. All rights reserved.
Glial S100B protein modulates mutant ataxin-1 aggregation and toxicity: TRTK12 peptide, a potential candidate for SCA1 therapy.

PubMed

Vig, Parminder J S; Hearst, Scoty; Shao, Qingmei; Lopez, Mariper E; Murphy, Henry A; Safaya, Eshan

2011-06-01

Non-cell autonomous involvement of glial cells in the pathogenesis of polyglutamine diseases is gaining recognition in the ataxia field. We previously demonstrated that Purkinje cells (PCs) in polyglutamine disease spinocerebellar ataxia-1 (SCA1) contain cytoplasmic vacuoles rich in Bergmann glial protein S100B. The vacuolar formation in SCA1 PCs is accompanied with an abnormal morphology of dendritic spines. In addition, S100B messenger RNA (mRNA) expression levels are significantly high in the cerebella of asymptomatic SCA1 transgenic (Tg) mice and increase further with age when compared with the age-matched wild-type animals. This higher S100B mRNA expression positively correlates with an increase in the number of vacuoles. To further characterize the function of S100B in SCA1 pathology, we explored the effects of S100B protein on GFP-ataxin-1 (ATXN1) with expanded polyglutamines [82Q] in HEK stable cell line. Externally added S100B protein to these cells induced S100B-positive vacuoles similar to those seen in SCA1 PCs in vivo. Further, we found that both externally added and internally expressed S100B significantly reduced GFP-ATXN1[82Q] inclusion body formation. In contrast, the addition of S100B inhibitory peptide TRTK12 reversed S100B-mediated effects. Interestingly, in SCA1 Tg mice, PCs containing S100B vacuoles also showed the lack of nuclear inclusions, whereas PCs without vacuoles contained nuclear inclusions. Additionally, TRTK12 treatment reduced abnormal dendritic growth and morphology of PCs in cerebellar slice cultures prepared from SCA1 Tg mice. Moreover, intranasal administration of TRTK12 to SCA1 Tg mice reduced cerebellar S100B levels in the particulate fractions, and these mice displayed a significant improvement in their performance deficit on the Rotarod test. Taken together, our results suggest that glial S100B may augment degenerative changes in SCA1 PCs by modulating mutant ataxin-1 toxicity/solubility through an unknown signaling pathway.

Glial S100B protein modulates mutant ataxin-1 aggregation and toxicity: TRTK12 peptide, a potential candidate for SCA1 therapy

PubMed Central

Vig, Parminder J.S.; Hearst, Scoty; Shao, Qingmei; Lopez, Maripar E; Murphy, Henry A; Safaya, Eshan

2011-01-01

Non-cell autonomous involvement of glial cells in the pathogenesis of polyglutamine diseases is gaining recognition in the ataxia field. We previously demonstrated that Purkinje cells (PCs) in polyglutamine disease spinocerebellar ataxia-1 (SCA1) contain cytoplasmic vacuoles rich in Bergmann glial (BG) protein S100B. The vacuolar formation in SCA1 PCs is accompanied with an abnormal morphology of dendritic spines. In addition, S100B mRNA expression levels are significantly high in the cerebella of asymptomatic SCA1 transgenic (Tg) mice and increase further with age when compared with the age-matched wildtype animals. This higher S100B mRNA expression positively correlates with an increase in the number of vacuoles. To further characterize the function of S100B in SCA1 pathology, we explored the effects of S100B protein on GFP-ataxin-1 (ATXN1) with expanded polyglutamines [82Q] in HEK stable cell line. Externally added S100B protein to these cells induced S100B positive vacuoles similar to those seen in SCA1 PCs in vivo. Further, we found that both externally added and internally expressed S100B significantly reduced GFP-ATXN1[82Q] inclusion body formation. In contrast, the addition of S100B inhibitory peptide TRTK12 reversed S100B mediated effects. Interestingly, in SCA1 Tg mice, PCs containing S100B vacuoles also showed the lack of nuclear inclusions, whereas, PCs without vacuoles contained nuclear inclusions. Additionally, TRTK12 treatment reduced abnormal dendritic growth and morphology of PCs in cerebellar slice cultures prepared from SCA1 Tg mice. Moreover, intranasal administration of TRTK12 to SCA1 Tg mice reduced cerebellar S100B levels in the particulate fractions and these mice displayed a significant improvement in their performance deficit on the Rotarod test. Taken together our results suggest that glial S100B may augment degenerative changes in SCA1 PCs by modulating mutant ataxin-1 toxicity/solubility through an unknown signaling pathway. 
PMID:21384195
Mutation at a distance caused by homopolymeric guanine repeats in Saccharomyces cerevisiae

PubMed Central

McDonald, Michael J.; Yu, Yen-Hsin; Guo, Jheng-Fen; Chong, Shin Yen; Kao, Cheng-Fu; Leu, Jun-Yi

2016-01-01

Mutation provides the raw material from which natural selection shapes adaptations. The rate at which new mutations arise is therefore a key factor that determines the tempo and mode of evolution. However, an accurate assessment of the mutation rate of a given organism is difficult because mutation rate varies on a fine scale within a genome. A central challenge of evolutionary genetics is to determine the underlying causes of this variation. In earlier work, we had shown that repeat sequences not only are prone to a high rate of expansion and contraction but also can cause an increase in mutation rate (on the order of kilobases) of the sequence surrounding the repeat. We perform experiments that show that simple guanine repeats 13 bp (base pairs) in length or longer (G13+) increase the substitution rate 4- to 18-fold in the downstream DNA sequence, and this correlates with DNA replication timing (R = 0.89). We show that G13+ mutagenicity results from the interplay of both error-prone translesion synthesis and homologous recombination repair pathways. The mutagenic repeats that we study have the potential to be exploited for the artificial elevation of mutation rate in systems biology and synthetic biology applications. PMID:27386516
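As a small illustration of the "G13+" definition used in this abstract (guanine homopolymer runs of 13 bp or longer), the sketch below scans a sequence for such runs. The function name and the toy sequence are hypothetical, and only the given strand is scanned; a run on the opposite strand would appear here as a run of C and is not reported.

```python
import re

def find_g_runs(seq, min_len=13):
    """Return (start, length) for every run of at least min_len consecutive G bases."""
    pattern = re.compile("G{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(seq.upper())]

# Toy sequence (invented): a 13-G run, a 12-G run that is too short, and a 15-G run
toy = "AT" + "G" * 13 + "CA" + "G" * 12 + "T" + "G" * 15
print(find_g_runs(toy))  # [(2, 13), (30, 15)]
```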
The molecular diversity of α-gliadin genes in the tribe Triticeae.

PubMed

Qi, Peng-Fei; Chen, Qing; Ouellet, Thérèse; Wang, Zhao; Le, Cheng-Xing; Wei, Yu-Ming; Lan, Xiu-Jin; Zheng, You-Liang

2013-09-01

Many of the unique properties of wheat flour are derived from seed storage proteins such as the α-gliadins. In this study these α-gliadin genes from diploid Triticeae species were systemically characterized, and divided into 3 classes according to the distinct organization of their protein domains. Our analyses indicated that these α-gliadins varied in the number of cysteine residues they contained. Most of the α-gliadin genes were grouped according to their genomic origins within the phylogenetic tree. As expected, sequence alignments suggested that the repetitive domain and the two polyglutamine regions were responsible for length variations of α-gliadins as were the insertion/deletion of structural domains within the three different classes (I, II, and III) of α-gliadins. A screening of celiac disease toxic epitopes indicated that the α-gliadins of the class II, derived from the Ns genome, contain no epitope, and that some other genomes contain much fewer epitopes than the A, S(B) and D genomes of wheat. Our results suggest that the observed genetic differences in α-gliadins of Triticeae might indicate their use as a fertile ground for the breeding of less CD-toxic wheat varieties.
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
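The screening criterion described above (exact 17-mers occurring more than three times) amounts to a k-mer position count. The sketch below is a minimal illustration under stated assumptions: the function name `repeated_kmers` is hypothetical, only the given strand is scanned (reverse complements are not merged), and the authors' actual pipeline is not described in the abstract.

```python
from collections import defaultdict

def repeated_kmers(seq, k=17, min_count=4):
    """Map each exact k-mer seen at least min_count times to its start positions.

    min_count=4 corresponds to "more than three times". Positions are 0-based.
    """
    positions = defaultdict(list)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}

unit = "ACGTTGCAAGGCTTAGC"  # an arbitrary 17-mer used only for illustration
genome = unit + "T" + unit + "G" + unit + "A" + unit
print(repeated_kmers(genome)[unit])
```

For a mitochondrial genome under 100 kb, the dictionary of all 17-mer positions fits comfortably in memory, so no specialised index is needed for a first pass.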
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip

PubMed Central

Nelson, Gregory M.; Huffman, Holly; Smith, David F.

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function. PMID:14627198
Selfish DNA in protein-coding genes of Rickettsia.

PubMed

Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M

2000-10-13

Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.
Complex structure of knob DNA on maize chromosome 9. Retrotransposon invasion into heterochromatin.

PubMed Central

Ananiev, E V; Phillips, R L; Rines, H W

1998-01-01

The recovery of maize (Zea mays L.) chromosome addition lines of oat (Avena sativa L.) from oat x maize crosses enables us to analyze the structure and composition of specific regions, such as knobs, of individual maize chromosomes. A DNA hybridization blot panel of eight individual maize chromosome addition lines revealed that 180-bp repeats found in knobs are present in each of these maize chromosomes, but the copy number varies from approximately 100 to 25,000. Cosmid clones with knob DNA segments were isolated from a genomic library of an oat-maize chromosome 9 addition line with the help of the 180-bp knob-associated repeated DNA sequence used as a probe. Cloned knob DNA segments revealed a complex organization in which blocks of tandemly arranged 180-bp repeating units are interrupted by insertions of other repeated DNA sequences, mostly represented by individual full size copies of retrotransposable elements. There is an obvious preference for the integration of retrotransposable elements into certain sites (hot spots) of the 180-bp repeat. Sequence microheterogeneity including point mutations and duplications was found in copies of 180-bp repeats. The 180-bp repeats within an array all had the same polarity. Restriction maps constructed for 23 cloned knob DNA fragments revealed the positions of polymorphic sites and sites of integration of insertion elements. Discovery of the interspersion of retrotransposable elements among blocks of tandem repeats in maize and some other organisms suggests that this pattern may be basic to heterochromatin organization for eukaryotes. PMID:9691055
Concerted evolution of the tandem array encoding primate U2 snRNA occurs in situ, without changing the cytological context of the RNU2 locus.

PubMed Central

Pavelitz, T; Rusché, L; Matera, A G; Scharf, J M; Weiner, A M

1995-01-01

In primates, the tandemly repeated genes encoding U2 small nuclear RNA evolve concertedly, i.e. the sequence of the U2 repeat unit is essentially homogeneous within each species but differs somewhat between species. Using chromosome painting and the NGFR gene as an outside marker, we show that the U2 tandem array (RNU2) has remained at the same chromosomal locus (equivalent to human 17q21) through multiple speciation events over >35 million years leading to the Old World monkey and hominoid lineages. The data suggest that the U2 tandem repeat, once established in the primate lineage, contained sequence elements favoring perpetuation and concerted evolution of the array in situ, despite a pericentric inversion in chimpanzee, a reciprocal translocation in gorilla and a paracentric inversion in orang utan. Comparison of the 11 kb U2 repeat unit found in baboon and other Old World monkeys with the 6 kb U2 repeat unit in humans and other hominids revealed that an ancestral U2 repeat unit was expanded by insertion of a 5 kb retrovirus bearing 1 kb long terminal repeats (LTRs). Subsequent excision of the provirus by homologous recombination between the LTRs generated a 6 kb U2 repeat unit containing a solo LTR. Remarkably, both junctions between the human U2 tandem array and flanking chromosomal DNA at 17q21 fall within the solo LTR sequence, suggesting a role for the LTR in the origin or maintenance of the primate U2 array. PMID:7828589
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-07-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
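The repeat-spacer layout described above (conserved direct repeats separated by similarly sized unique spacers) can be sketched as a naive exact-match scan. This is not CRISPRFinder's algorithm, which the abstract does not detail. The function name `candidate_crispr_arrays`, the fixed repeat length, and the spacer bounds are all assumptions chosen for illustration.

```python
from collections import defaultdict

def candidate_crispr_arrays(seq, dr_len=23, min_spacer=20, max_spacer=50, min_repeats=3):
    """Report exact dr_len-mers that recur with spacer-sized gaps between copies.

    Returns a list of (repeat, [start positions]) with 0-based starts. A true
    direct repeat longer than dr_len will be reported once per overlapping
    window, so the results need merging in any real use.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - dr_len + 1):
        positions[seq[i:i + dr_len]].append(i)

    arrays = []
    for repeat, pos in positions.items():
        run = [pos[0]]
        for p in pos[1:]:
            gap = p - run[-1] - dr_len  # spacer length between consecutive copies
            if min_spacer <= gap <= max_spacer:
                run.append(p)
            else:
                if len(run) >= min_repeats:
                    arrays.append((repeat, run))
                run = [p]
        if len(run) >= min_repeats:
            arrays.append((repeat, run))
    return arrays
```

Requiring exact matches is the main simplification here: real direct repeats tolerate some sequence variation, so a scan like this would miss degenerate copies.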
Tactile Ranschburg effects: facilitation and inhibitory repetition effects analogous to verbal memory.

PubMed

Roe, Daisy; Miles, Christopher; Johnson, Andrew J

2017-07-01

The present paper examines the effect of within-sequence item repetitions in tactile order memory. Employing an immediate serial recall procedure, participants reconstructed a six-item sequence tapped upon their fingers by moving those fingers in the order of original stimulation. In Experiment 1a, within-sequence repetition of an item separated by two intervening items resulted in a significant reduction in recall accuracy for that repeated item (i.e., the Ranschburg effect). In Experiment 1b, within-sequence repetition of an adjacent item resulted in significant recall facilitation for that repeated item. These effects mirror those reported for verbal stimuli (e.g., Henson, 1998a. Item repetition in short-term memory: Ranschburg repeated. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(5), 1162-1181. doi:10.1037/0278-7393.24.5.1162). These data are the first to demonstrate the Ranschburg effect with non-verbal stimuli and suggest further cross-modal similarities in order memory.
Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

PubMed

Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

2010-02-01

Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
Fast and Cost-Effective Mining of Microsatellite Markers Using NGS Technology: An Example of a Korean Water Deer Hydropotes inermis argyropus

PubMed Central

Yu, Jeong-Nam; Won, Changman; Jun, Jumin; Lim, YoungWoon; Kwak, Myounghai

2011-01-01

Background Microsatellites, a special class of repetitive DNA sequence, have become one of the most popular genetic markers for population/conservation genetic studies. However, its application to endangered species has been impeded by high development costs, a lack of available sequences, and technical difficulties. The water deer Hydropotes inermis is the sole existing endangered species of the subfamily Capreolinae. Although population genetics studies are urgently required for conservation management, no species-specific microsatellite marker has been reported. Methods We adopted next-generation sequencing (NGS) to elucidate the microsatellite markers of Korean water deer and overcome these impediments on marker developments. We performed genotyping to determine the efficiency of this method as applied to population genetics. Results We obtained 98 Mbp of nucleotide information from 260,467 sequence reads. A total of 20,101 di-/tri-nucleotide repeat motifs were identified; di-repeats were 5.9-fold more common than tri-repeats. [CA]n and [AAC]n/[AAT]n repeats were the most frequent di- and tri-repeats, respectively. Of the 17,206 di-repeats, 12,471 microsatellite primer pairs were derived. PCR amplification of 400 primer pairs yielded 106 amplicons and 79 polymorphic markers from 20 individual Korean water deer. Polymorphic rates of the 79 new microsatellites varied from 2 to 11 alleles per locus (He: 0.050–0.880; Ho: 0.000–1.000), while those of known microsatellite markers transferred from cattle to Chinese water deer ranged from 4 to 6 alleles per locus (He: 0.279–0.714; Ho: 0.300–0.400). Conclusions Polymorphic microsatellite markers from Korean water deer were successfully identified using NGS without any prior sequence information and deposited into the public database. Thus, the methods described herein represent a rapid and low-cost way to investigate the population genetics of endangered/non-model species. PMID:22069476
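Identifying di- and tri-nucleotide repeat motifs in sequence reads, as in the abstract above, can be sketched with a back-referencing regular expression. This is only an illustration: the function name `find_ssrs` and the threshold of six tandem copies are assumptions, since the abstract does not state the detection criteria or software used.

```python
import re

def find_ssrs(seq, min_repeats=6):
    """Find perfect di- and tri-nucleotide tandem repeats.

    Returns sorted (start, motif, copies) tuples with 0-based starts. Motifs
    made of one repeated base (e.g. 'AA') are skipped so that mononucleotide
    runs are not counted as di- or tri-repeats.
    """
    seq = seq.upper()
    hits = []
    for size in (2, 3):
        # ([ACGT]{2})\1{5,} means a 2-base motif followed by 5+ more copies of it
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

read = "GG" + "CA" * 6 + "TT" + "AAC" * 6 + "GG"
print(find_ssrs(read))
```

The counts reported in the abstract are internally consistent: 20,101 motifs minus 17,206 di-repeats leaves 2,895 tri-repeats, and 17,206 / 2,895 is about 5.9.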
Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development.

PubMed

Sun, Cheng; Wyngaard, Grace; Walton, D Brian; Wichman, Holly A; Mueller, Rachel Lockridge

2014-03-11

Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution--some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15 - 75 Gb, 12-74 Gb of which are lost from pre-somatic cell lineages at germline--soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms.
Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development

PubMed Central

2014-01-01

Background Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution: some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15–75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline–soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Results Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Conclusions Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms. PMID:24618421
Diversity and evolution of centromere repeats in the maize genome.

PubMed

Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

2015-03-01

Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
Medium-sized tandem repeats represent an abundant component of the Drosophila virilis genome.

PubMed

Abdurashitov, Murat A; Gonchar, Danila A; Chernukhin, Valery A; Tomilov, Victor N; Tomilova, Julia E; Schostak, Natalia G; Zatsepina, Olga G; Zelentsova, Elena S; Evgen'ev, Michael B; Degtyarev, Sergey K H

2013-11-09

Previously, we developed a simple method for carrying out a restriction enzyme analysis of eukaryotic DNA in silico, based on the known DNA sequences of the genomes. This method allows the user to calculate the lengths of all DNA fragments that are formed after a whole genome is digested at the theoretical recognition sites of a given restriction enzyme. A comparison of the observed peaks in distribution diagrams with the results from DNA cleavage using several restriction enzymes performed in vitro has shown good correspondence between the theoretical and experimental data in several cases. Here, we applied this approach to the annotated genome of Drosophila virilis, which is extremely rich in various repeats, and explored a combined approach to the restriction analysis of D. virilis DNA. This approach revealed three abundant medium-sized tandem repeats within the D. virilis genome. While the 225 bp repeats were revealed previously in intergenic non-transcribed spacers between ribosomal genes of D. virilis, two other families comprised of 154 bp and 172 bp repeats had not been described. A Tandem Repeats Finder search demonstrated that the 154 bp and 172 bp units are organized in multiple clusters in the genome of D. virilis. Characteristically, only the 154 bp repeats, derived from a Helitron transposon, are transcribed. Using in silico digestion in combination with conventional restriction analysis and sequencing of repeated DNA fragments enabled us to isolate and characterize three highly abundant families of medium-sized repeats present in the D. virilis genome. These repeats comprise a significant portion of the genome and may have important roles in genome function and structural integrity. Therefore, we demonstrated an approach that makes it possible to investigate in detail the gross arrangement and expression of medium-sized repeats based on sequencing data, even in the case of incompletely assembled and/or annotated genomes.
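The in silico digestion idea above (cut at every theoretical recognition site, then look at the distribution of fragment lengths) can be sketched as follows. The function name `digest_fragment_lengths` and the `cut_offset` parameter are hypothetical. The sketch assumes a linear sequence and a palindromic recognition site, so only one strand is searched, and it is not the authors' published tool.

```python
from collections import Counter

def digest_fragment_lengths(seq, site, cut_offset=0):
    """Lengths of the fragments produced by cutting a linear sequence at every site.

    cut_offset is the cut position relative to the start of the recognition
    site on the given strand (for example 1 for G^AATTC).
    """
    seq, site = seq.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# A toy tandem array: each 16 bp unit carries one GAATTC site.
array = ("GAATTC" + "A" * 10) * 5
lengths = digest_fragment_lengths(array, "GAATTC", cut_offset=1)
print(Counter(lengths).most_common(1))
```

A tandem repeat whose unit contains a single site yields many fragments of exactly the unit length, which is why such repeats stand out as peaks in the length distribution.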
Targeting of Repeated Sequences Unique to a Gene Results in Significant Increases in Antisense Oligonucleotide Potency

PubMed Central

Vickers, Timothy A.; Freier, Susan M.; Bui, Huynh-Hoa; Watt, Andrew; Crooke, Stanley T.

2014-01-01

A new strategy for identifying potent RNase H-dependent antisense oligonucleotides (ASOs) is presented. Our analysis of the human transcriptome revealed that a significant proportion of genes contain unique repeated sequences of 16 or more nucleotides in length. Activities of ASOs targeting these repeated sites in several representative genes were compared to those of ASOs targeting unique single sites in the same transcript. Antisense activity at repeated sites was also evaluated in a highly controlled minigene system. Targeting both native and minigene repeat sites resulted in significant increases in potency as compared to targeting of non-repeated sites. The increased potency at these sites is a result of increased frequency of ASO/RNA interactions which, in turn, increases the probability of a productive interaction between the ASO/RNA heteroduplex and human RNase H1 in the cell. These results suggest a new, highly efficient strategy for rapid identification of highly potent ASOs. PMID:25334092
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae).

PubMed

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-07-01

The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5·8S-25S rRNA genes which enable the unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.

A Method for WD40 Repeat Detection and Secondary Structure Prediction

PubMed Central

Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

2013-01-01

WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program, WDSP (WD40 repeat protein Structure Predictor), has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. A disease related protein, LRRK2, was used as a representative example to demonstrate the structure prediction. PMID:23776530
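The Q3 accuracy quoted above is the standard three-state measure: the percentage of residues whose predicted secondary-structure class (helix, strand, or coil) matches the observed one. A minimal sketch follows, assuming both structures are given as equal-length strings over the letters H, E and C. The function name `q3_accuracy` is hypothetical and is not part of WDSP.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("empty input")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

print(q3_accuracy("CEEEECCH", "CEEEECCC"))
```

In a jack-knife (leave-one-out) test, the value would be averaged over proteins that were each held out of training in turn.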
Memory for sequences of events impaired in typical aging.

PubMed

Allen, Timothy A; Morris, Andrea M; Stark, Shauna M; Fortin, Norbert J; Stark, Craig E L

2015-03-01

Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18-22 yr) and older adults (62-86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented "in sequence" or "out of sequence." Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence ("Repeats"; e.g., AB A: DEF), (ii) skipping ahead in the sequence ("Skips"; e.g., AB D: DEF), and (iii) inserting an item from a different sequence into the same ordinal position ("Ordinal Transfers"; e.g., AB 3: DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. 
Collectively, these findings provide unambiguous evidence that the capacity to remember sequences of events is fundamentally affected by typical aging. © 2015 Allen et al.; Published by Cold Spring Harbor Laboratory Press.
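The three out-of-sequence probe types described above can be illustrated with a minimal sketch. The function names and the use of single-character strings for items are assumptions made for illustration only; the study itself used colored fractal images, and this is not the authors' task code.

```python
def substitute(sequence: str, position: int, item: str) -> str:
    """Return a copy of `sequence` with the item at `position` replaced."""
    return sequence[:position] + item + sequence[position + 1:]


def repeat_probe(sequence: str, position: int, earlier: int) -> str:
    """'Repeat': an item from earlier in the sequence reappears at `position`."""
    if not 0 <= earlier < position < len(sequence):
        raise ValueError("need 0 <= earlier < position < len(sequence)")
    return substitute(sequence, position, sequence[earlier])


def skip_probe(sequence: str, position: int, later: int) -> str:
    """'Skip': an item from later in the sequence appears too early."""
    if not 0 <= position < later < len(sequence):
        raise ValueError("need 0 <= position < later < len(sequence)")
    return substitute(sequence, position, sequence[later])


def ordinal_transfer_probe(sequence: str, other: str, position: int) -> str:
    """'Ordinal Transfer': the item at the same position in another sequence."""
    if len(other) != len(sequence) or not 0 <= position < len(sequence):
        raise ValueError("sequences must match in length; position must be valid")
    return substitute(sequence, position, other[position])


if __name__ == "__main__":
    print(repeat_probe("ABCDEF", 2, 0))                   # ABADEF
    print(skip_probe("ABCDEF", 2, 3))                     # ABDDEF
    print(ordinal_transfer_probe("ABCDEF", "123456", 2))  # AB3DEF
```

Each call replaces the third item of the six-item sequence ABCDEF, mirroring the three examples given in the abstract.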
Construction of a self-cloning sake yeast that overexpresses alcohol acetyltransferase gene by a two-step gene replacement protocol.

PubMed

Hirosawa, I; Aritomi, K; Hoshida, H; Kashiwagi, S; Nishizawa, Y; Akada, R

2004-07-01

The commercial application of genetically modified industrial microorganisms has been problematic due to public concerns. We constructed a "self-cloning" sake yeast strain that overexpresses the ATF1 gene encoding alcohol acetyltransferase, to improve the flavor profile of Japanese sake. A constitutive yeast overexpression promoter, TDH3p, derived from the glyceraldehyde-3-phosphate dehydrogenase gene from sake yeast was fused to ATF1; and the 5' upstream non-coding sequence of ATF1 was further fused to TDH3p-ATF1. The fragment was placed on a binary vector, pGG119, containing a drug-resistance marker for transformation and a counter-selection marker for excision of unwanted DNA. The plasmid was integrated into the ATF1 locus of a sake yeast strain. This integration constructed tandem repeats of ATF1 and TDH3p-ATF1 sequences, between which the plasmid was inserted. Loss of the plasmid, which occurs through homologous recombination between either the TDH3p downstream ATF1 repeats or the TDH3p upstream repeat sequences, was selected by growing transformants on counter-selective medium. Recombination between the downstream repeats led to reversion to a wild type strain, but that between the upstream repeats resulted in a strain that possessed TDH3p-ATF1 without the extraneous DNA sequences. The self-cloning TDH3p-ATF1 yeast strain produced a higher amount of isoamyl acetate. This is the first expression-controlled self-cloning industrial yeast.
The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

PubMed Central

Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

2016-01-01

Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potential functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed because the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965
Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Haider, Nadia

2017-01-01

Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

PubMed

Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

2002-01-01

Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy was due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region is similar to that of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel, are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.
Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

PubMed

Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

2002-12-01

The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the non-transcribed spacer (NTS) revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of a microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.
Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

Treesearch

M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

2009-01-01

The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...
Non-RVD mutations that enhance the dynamics of the TAL repeat array along the superhelical axis improve TALEN genome editing efficacy

PubMed Central

Tochio, Naoya; Umehara, Kohei; Uewaki, Jun-ichi; Flechsig, Holger; Kondo, Masaharu; Dewa, Takehisa; Sakuma, Tetsushi; Yamamoto, Takashi; Saitoh, Takashi; Togashi, Yuichi; Tate, Shin-ichi

2016-01-01

Transcription activator-like effector (TALE) nuclease (TALEN) is widely used as a tool in genome editing. The DNA binding part of TALEN consists of a tandem array of TAL-repeats that form a right-handed superhelix. Each TAL-repeat recognises a specific base by the repeat variable diresidue (RVD) at positions 12 and 13. TALEN comprising the TAL-repeats with periodic mutations to residues at positions 4 and 32 (non-RVD sites) in each repeat (VT-TALE) exhibits increased efficacy in genome editing compared with a counterpart without the mutations (CT-TALE). The molecular basis for the elevated efficacy is unknown. In this report, comparison of the physicochemical properties between CT- and VT-TALEs revealed that VT-TALE has a larger amplitude motion along the superhelical axis (superhelical motion) compared with CT-TALE. The greater superhelical motion in VT-TALE enabled more TAL-repeats to engage in the target sequence recognition compared with CT-TALE. The extended sequence recognition by the TAL-repeats improves site specificity with limiting the spatial distribution of FokI domains to facilitate their dimerization at the desired site. Molecular dynamics simulations revealed that the non-RVD mutations alter inter-repeat hydrogen bonding to amplify the superhelical motion of VT-TALE. The TALEN activity is associated with the inter-repeat hydrogen bonding among the TAL repeats. PMID:27883072
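The one-repeat-one-base recognition described above can be sketched as a lookup from each repeat's RVD to a predicted base. The RVD table below is the commonly cited NI/HD/NG/NN code and is an assumption here, since the abstract does not list it; the function names and the placeholder repeat strings are hypothetical.

```python
# Commonly cited RVD-to-base preferences (assumed; not stated in the abstract).
# NN is often reported to tolerate A as well as G.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def rvd_of(repeat: str) -> str:
    """Return the residues at positions 12 and 13 (1-based) of one TAL repeat."""
    if len(repeat) < 13:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat[11:13]


def predicted_target(repeats: list[str]) -> str:
    """Predict the DNA target, one base per repeat, from an array of repeats."""
    return "".join(RVD_TO_BASE[rvd_of(r)] for r in repeats)


if __name__ == "__main__":
    # Placeholder repeats: 'X' stands in for the non-RVD residues.
    array = ["X" * 11 + rvd + "X" * 21 for rvd in ("HD", "NI", "NG", "NN")]
    print(predicted_target(array))  # CATG
```

The sketch covers only base recognition; the non-RVD positions 4 and 32 that the study mutates are represented by placeholders and have no effect on the lookup.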
Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

USDA-ARS's Scientific Manuscript database

Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh].

PubMed

Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K

2011-01-20

Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
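A minimal sketch of how perfect SSR loci of the kind counted above might be located in a contig, assuming a simple regular-expression scan. The abstract does not name the tool that was used, so the pattern, the function name and the default threshold are illustrative assumptions. The 18-bp minimum and the exclusion of homopolymeric repeats follow the criteria stated in the abstract.

```python
import re
from collections import Counter

# A motif of 2-6 bases (shortest motif tried first) followed by at least
# two more exact copies of itself.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{2,}")
MOTIF_CLASS = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def find_ssrs(seq: str, min_length: int = 18) -> list[tuple[int, str, int]]:
    """Return (start, motif, copies) for perfect SSRs of at least min_length bp."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:  # skip homopolymeric runs such as AAAA...
            continue
        if len(match.group(0)) < min_length:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def motif_class_counts(hits: list[tuple[int, str, int]]) -> Counter:
    """Tally hits by motif class (di-, tri-, tetra-, penta-, hexanucleotide)."""
    return Counter(MOTIF_CLASS[len(motif)] for _, motif, _ in hits)


if __name__ == "__main__":
    contig = "GGC" + "AT" * 10 + "CCGT" + "GAA" * 6 + "TTC"
    hits = find_ssrs(contig)
    print(hits)                      # [(3, 'AT', 10), (27, 'GAA', 6)]
    print(motif_class_counts(hits))
```

Because the scan is non-overlapping and greedy, it does not handle compound or imperfect repeats, which the study also excluded.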
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

PubMed Central

2011-01-01

Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
Conclusion We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
Neurodegenerative Models in Drosophila: Polyglutamine Disorders, Parkinson Disease, and Amyotrophic Lateral Sclerosis

PubMed Central

Ambegaokar, Surendra S.; Roy, Bidisha; Jackson, George R.

2010-01-01

Neurodegenerative diseases encompass a large group of neurological disorders. Clinical symptoms can include memory loss, cognitive impairment, loss of movement or loss of control of movement, and loss of sensation. Symptoms are typically adult onset (although severe cases can occur in adolescents) and are reflective of neuronal and glial cell loss in the central nervous system. Neurodegenerative diseases also are considered progressive, with increased severity of symptoms over time, also reflective of increased neuronal cell death. However, various neurodegenerative diseases differentially affect certain brain regions or neuronal or glial cell types. As an example, Alzheimer disease (AD) primarily affects the temporal lobe, whereas neuronal loss in Parkinson disease (PD) is largely (although not exclusively) confined to the nigrostriatal system. Neuronal loss is almost invariably accompanied by abnormal insoluble aggregates, either intra- or extracellular. Thus, neurodegenerative diseases are categorized by (a) the composite of clinical symptoms, (b) the brain regions or types of brain cells primarily affected, and (c) the types of protein aggregates found in the brain. Here we review the methods by which Drosophila melanogaster has been used to model aspects of polyglutamine diseases, Parkinson disease, and amyotrophic lateral sclerosis, and the key insights that have been gained from these models; Alzheimer disease and the tauopathies are covered elsewhere in this special issue. PMID:20561920
Heat shock factor 2 is required for maintaining proteostasis against febrile-range thermal stress and polyglutamine aggregation

PubMed Central

Shinkawa, Toyohide; Tan, Ke; Fujimoto, Mitsuaki; Hayashida, Naoki; Yamamoto, Kaoru; Takaki, Eiichi; Takii, Ryosuke; Prakasam, Ramachandran; Inouye, Sachiye; Mezger, Valerie; Nakai, Akira

2011-01-01

Heat shock response is characterized by the induction of heat shock proteins (HSPs), which facilitate protein folding, and non-HSP proteins with diverse functions, including protein degradation, and is regulated by heat shock factors (HSFs). HSF1 is a master regulator of HSP expression during heat shock in mammals, as is HSF3 in avians. HSF2 plays roles in development of the brain and reproductive organs. However, the fundamental roles of HSF2 in vertebrate cells have not been identified. Here we find that vertebrate HSF2 is activated during heat shock in the physiological range. HSF2 deficiency reduces threshold for chicken HSF3 or mouse HSF1 activation, resulting in increased HSP expression during mild heat shock. HSF2-null cells are more sensitive to sustained mild heat shock than wild-type cells, associated with the accumulation of ubiquitylated misfolded proteins. Furthermore, loss of HSF2 function increases the accumulation of aggregated polyglutamine protein and shortens the lifespan of R6/2 Huntington's disease mice, partly through αB-crystallin expression. These results identify HSF2 as a major regulator of proteostasis capacity against febrile-range thermal stress and suggest that HSF2 could be a promising therapeutic target for protein-misfolding diseases. PMID:21813737
Involvement of HDAC1 and HDAC3 in the Pathology of Polyglutamine Disorders: Therapeutic Implications for Selective HDAC1/HDAC3 Inhibitors

PubMed Central

Thomas, Elizabeth A.

2014-01-01

Histone deacetylase (HDAC) enzymes, which affect the acetylation status of histones and other important cellular proteins, have been recognized as potentially useful therapeutic targets for a broad range of human disorders. Emerging studies have demonstrated that different types of HDAC inhibitors show beneficial effects in various experimental models of neurological disorders. HDAC enzymes comprise a large family of proteins, with 18 HDAC enzymes currently identified in humans. Hence, an important question for HDAC inhibitor therapeutics is which HDAC enzyme(s) is/are important for the amelioration of disease phenotypes, as it has become clear that individual HDAC enzymes play different biological roles in the brain. This review will discuss evidence supporting the involvement of HDAC1 and HDAC3 in polyglutamine disorders, including Huntington’s disease, and the use of HDAC1- and HDAC3-selective HDAC inhibitors as a therapeutic intervention for these disorders. Further, while HDAC inhibitors are known to alter chromatin structure, resulting in changes in gene transcription, understanding the exact mechanisms responsible for the preclinical efficacy of these compounds remains a challenge. The potential chromatin-related and non-chromatin-related mechanisms of action of selective HDAC inhibitors will also be discussed. PMID:24865773
Proteins with Intrinsically Disordered Domains Are Preferentially Recruited to Polyglutamine Aggregates

PubMed Central

O’Meally, Robert; Sonnenberg, Jason L.; Cole, Robert N.; Shewmaker, Frank P.

2015-01-01

Intracellular protein aggregation is the hallmark of several neurodegenerative diseases. Aggregates formed by polyglutamine (polyQ)-expanded proteins, such as Huntingtin, adopt amyloid-like structures that are resistant to denaturation. We used a novel purification strategy to isolate aggregates formed by human Huntingtin N-terminal fragments with expanded polyQ tracts from both yeast and mammalian (PC-12) cells. Using mass spectrometry we identified the protein species that are trapped within these polyQ aggregates. We found that proteins with very long intrinsically-disordered (ID) domains (≥100 amino acids) and RNA-binding proteins were disproportionately recruited into aggregates. The removal of the ID domains from selected proteins was sufficient to eliminate their recruitment into polyQ aggregates. We also observed that several neurodegenerative disease-linked proteins were reproducibly trapped within the polyQ aggregates purified from mammalian cells. Many of these proteins have large ID domains and are found in neuronal inclusions in their respective diseases. Our study indicates that neurodegenerative disease-associated proteins are particularly vulnerable to recruitment into polyQ aggregates via their ID domains. Also, the high frequency of ID domains in RNA-binding proteins may explain why RNA-binding proteins are frequently found in pathological inclusions in various neurodegenerative diseases. PMID:26317359
Contrasting Patterns of rDNA Homogenization within the Zygosaccharomyces rouxii Species Complex

PubMed Central

Chand Dakal, Tikam; Giudici, Paolo; Solieri, Lisa

2016-01-01

Arrays of repetitive ribosomal DNA (rDNA) sequences are generally expected to evolve as a coherent family, where repeats within such a family are more similar to each other than to orthologs in related species. The continuous homogenization of repeats within individual genomes is a recombination process termed concerted evolution. Here, we investigated the extent and the direction of concerted evolution in 43 yeast strains of the Zygosaccharomyces rouxii species complex (Z. rouxii, Z. sapae, Z. mellis), by analyzing two portions of the 35S rDNA cistron, namely the D1/D2 domains at the 5’ end of the 26S rRNA gene and the segment including the internal transcribed spacers (ITS) 1 and 2 (ITS regions). We demonstrate that intra-genomic rDNA sequence variation is unusually frequent in this clade and that rDNA arrays in single genomes consist of an intermixing of Z. rouxii, Z. sapae and Z. mellis-like sequences, putatively evolved by reticulate evolutionary events that involved repeated hybridization between lineages. The levels and distribution of sequence polymorphisms vary across rDNA repeats in different individuals, reflecting four patterns of rDNA evolution: I) rDNA repeats that are homogeneous within a genome but are chimeras derived from two parental lineages via recombination: Z. rouxii in the ITS region and Z. sapae in the D1/D2 region; II) intra-genomic rDNA repeats that retain polymorphisms only in ITS regions; III) rDNA repeats that vary only in their D1/D2 domains; IV) heterogeneous rDNA arrays that have both polymorphic ITS and D1/D2 regions. We argue that an ongoing process of homogenization following allodiplodization or incomplete lineage sorting gave rise to divergent evolutionary trajectories in different strains, depending upon temporal, structural and functional constraints. We discuss the consequences of these findings for Zygosaccharomyces species delineation and, more in general, for yeast barcoding. PMID:27501051
The organization and evolution of the Responder satellite in species of the Drosophila melanogaster group: dynamic evolution of a target of meiotic drive.

PubMed

Larracuente, Amanda M

2014-11-25

Satellite DNA can make up a substantial fraction of eukaryotic genomes and has roles in genome structure and chromosome segregation. The rapid evolution of satellite DNA can contribute to genomic instability and genetic incompatibilities between species. Despite its ubiquity and its contribution to genome evolution, we currently know little about the dynamics of satellite DNA evolution. The Responder (Rsp) satellite DNA family is found in the pericentric heterochromatin of chromosome 2 of Drosophila melanogaster. Rsp is well-known for being the target of Segregation Distorter (SD), an autosomal meiotic drive system in D. melanogaster. I present an evolutionary genetic analysis of the Rsp family of repeats in D. melanogaster and its closely-related species in the melanogaster group (D. simulans, D. sechellia, D. mauritiana, D. erecta, and D. yakuba) using a combination of available BAC sequences, whole genome shotgun Sanger reads, Illumina short read deep sequencing, and fluorescence in situ hybridization. I show that Rsp repeats have euchromatic locations throughout the D. melanogaster genome, that Rsp arrays show evidence for concerted evolution, and that Rsp repeats exist outside of D. melanogaster, in the melanogaster group. The repeats in these species are considerably diverged at the sequence level compared to D. melanogaster, and have a strikingly different genomic distribution, even between closely-related sister taxa. The genomic organization of the Rsp repeat in the D. melanogaster genome is complex: it consists of large blocks of tandem repeats in the heterochromatin and small blocks of tandem repeats in the euchromatin. My discovery of heterochromatic Rsp-like sequences outside of D. melanogaster suggests that SD evolved after its target satellite and that the evolution of the Rsp satellite family is highly dynamic over a short evolutionary time scale (<240,000 years).
Sequence of contactin, a 130-kD glycoprotein concentrated in areas of interneuronal contact, defines a new member of the immunoglobulin supergene family in the nervous system

PubMed Central

1988-01-01

The primary amino acid sequence of contactin, a neuronal cell surface glycoprotein of 130 kD that is isolated in association with components of the cytoskeleton (Ranscht, B., D. J. Moss, and C. Thomas. 1984. J. Cell Biol. 99:1803-1813), was deduced from the nucleotide sequence of cDNA clones and is reported here. The cDNA sequence contains an open reading frame for a 1,071-amino acid transmembrane protein with 962 extracellular and 89 cytoplasmic amino acids. In its extracellular portion, the polypeptide features six type 1 and two type 2 repeats. The six amino-terminal type 1 repeats (I-VI) each consist of 81-99 amino acids and contain two cysteine residues that are in the right context to form globular domains as described for molecules with immunoglobulin structure. Within the proposed globular region, contactin shares 31% identical amino acids with the neural cell adhesion molecule NCAM. The two type 2 repeats (I-II) are each composed of 100 amino acids and lack cysteine residues. They are 20-31% identical to fibronectin type III repeats. Both the structural similarity of contactin to molecules of the immunoglobulin supergene family, in particular the amino acid sequence resemblance to NCAM, and its relationship to fibronectin indicate that contactin could be involved in some aspect of cellular adhesion. This suggestion is further strengthened by its localization in neuropil containing axon fascicles and synapses. PMID:3049624

The Peculiar Landscape of Repetitive Sequences in the Olive (Olea europaea L.) Genome

PubMed Central

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-01-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome. PMID:24671744
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

PubMed

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with a mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion.

PubMed

Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe

2016-02-15

Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. The genetic and molecular data about it is deficient. Here we report the complete chloroplast (cp) genome sequence of G. straminea, as the first sequenced member of the family Gentianaceae. The cp genome is 148,991bp in length, including a large single copy (LSC) region of 81,240bp, a small single copy (SSC) region of 17,085bp and a pair of inverted repeats (IRs) of 25,333bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
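The region lengths and gene counts quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study; the function and variable names are illustrative assumptions, and the figures are copied from the abstract. It assumes the usual quadripartite layout, in which the inverted repeat is counted twice.

```python
# Figures copied from the abstract above (all lengths in bp).
LSC = 81_240             # large single copy region
SSC = 17_085             # small single copy region
IR = 25_333              # one copy of the inverted repeat
REPORTED_TOTAL = 148_991

# Gene counts copied from the abstract.
PROTEIN_CODING, TRNA, RRNA = 78, 30, 4
REPORTED_UNIQUE_GENES = 112


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length assuming LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total_length = quadripartite_length(LSC, SSC, IR)
total_genes = PROTEIN_CODING + TRNA + RRNA

print(total_length, total_length == REPORTED_TOTAL)
print(total_genes, total_genes == REPORTED_UNIQUE_GENES)
```

Both sums agree with the totals reported in the abstract, which is consistent with the stated IR length referring to a single copy.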
Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jackson, P.J.; Walthers, E.A.; Richmond, K.L.

1997-04-01

PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
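The relationship between the 12-bp repeat unit, the QYQQ peptide, and the allele frequencies in this abstract can be illustrated with a short sketch. This is not the authors' analysis: the flanking sequence in the example is hypothetical, the function names are assumptions, the standard genetic code is used for translation, and the isolate counts are back-calculated from the rounded percentages rather than reported in the abstract.

```python
# Repeat unit and cohort figures copied from the abstract.
REPEAT_UNIT = "CAATATCAACAA"
TOTAL_ISOLATES = 198
FREQ_PERCENT = [6.1, 17.7, 59.6, 5.6, 11.1]  # shortest to longest allele

# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons in frame 0; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def max_tandem_copies(seq: str, unit: str) -> int:
    """Longest run of back-to-back exact copies of `unit` found in `seq`."""
    best = 0
    start = seq.find(unit)
    while start != -1:
        copies, pos = 0, start
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
        start = seq.find(unit, start + 1)
    return best


# Hypothetical amplicon: made-up flanks around four repeat copies.
amplicon = "GGG" + REPEAT_UNIT * 4 + "TTT"
print(translate(REPEAT_UNIT))                    # peptide encoded by one unit
print(max_tandem_copies(amplicon, REPEAT_UNIT))  # copy number in the amplicon

# Isolate counts implied by the rounded percentages (inferred, not reported).
implied_counts = [round(p * TOTAL_ISOLATES / 100) for p in FREQ_PERCENT]
print(implied_counts, sum(implied_counts))
```

Because the unit length is a multiple of three, adding or removing whole copies leaves the reading frame unchanged, which matches the abstract's statement that only the QYQQ copy number varies.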
Modular probes for enriching and detecting complex nucleic acid sequences

NASA Astrophysics Data System (ADS)

Wang, Juexiao Sherry; Yan, Yan Helen; Zhang, David Yu

2017-12-01

Complex DNA sequences are difficult to detect and profile, but are important contributors to human health and disease. Existing hybridization probes lack the capability to selectively bind and enrich hypervariable, long or repetitive sequences. Here, we present a generalized strategy for constructing modular hybridization probes (M-Probes) that overcomes these challenges. We demonstrate that M-Probes can tolerate sequence variations of up to 7 nt at prescribed positions while maintaining single nucleotide sensitivity at other positions. M-Probes are also shown to be capable of sequence-selectively binding a continuous DNA sequence of more than 500 nt. Furthermore, we show that M-Probes can detect genes with triplet repeats exceeding a programmed threshold. As a demonstration of this technology, we have developed a hybrid capture method to determine the exact triplet repeat expansion number in the Huntington's gene of genomic DNA using quantitative PCR.
Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Boehm, CR; Lienert, F

2013-12-28

In vitro recombination methods have enabled one-step construction of large DNA sequences from multiple parts. Although synthetic biological circuits can in principle be assembled in the same fashion, they typically contain repeated sequence elements such as standard promoters and terminators that interfere with homologous recombination. Here we use a computational approach to design synthetic, biologically inactive unique nucleotide sequences (UNSes) that facilitate accurate ordered assembly. Importantly, our designed UNSes make it possible to assemble parts with repeated terminator and insulator sequences, and thereby create insulated functional genetic circuits in bacteria and mammalian cells. Using UNS-guided assembly to construct repeating promoter-gene-terminator parts, we systematically varied gene expression to optimize production of a deoxychromoviridans biosynthetic pathway in Escherichia coli. We then used this system to construct complex eukaryotic AND-logic gates for genomic integration into embryonic stem cells. Construction was performed by using a standardized series of UNS-bearing BioBrick-compatible vectors, which enable modular assembly and facilitate reuse of individual parts. UNS-guided isothermal assembly is broadly applicable to the construction and optimization of genetic circuits and particularly those requiring tight insulation, such as complex biosynthetic pathways, sensors, counters and logic gates.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

PubMed

Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

2015-09-18

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

DOE PAGES

Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

2015-07-22

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Iterative dictionary construction for compression of large DNA data sets.

PubMed

Kuruppu, Shanika; Beresford-Smith, Bryan; Conway, Thomas; Zobel, Justin

2012-01-01

Genomic repositories increasingly include individual as well as reference sequences, which tend to share long identical and near-identical strings of nucleotides. However, the sequential processing used by most compression algorithms, and the volumes of data involved, mean that these long-range repetitions are not detected. An order-insensitive, disk-based dictionary construction method can detect this repeated content and use it to compress collections of sequences. We explore a dictionary construction method that improves repeat identification in large DNA data sets. Our adaptation, COMRAD, of an existing disk-based method identifies exact repeated content in collections of sequences with similarities within and across the set of input sequences. COMRAD compresses the data over multiple passes, which is an expensive process, but allows COMRAD to compress large data sets within reasonable time and space. COMRAD allows for random access to individual sequences and subsequences without decompressing the whole data set. COMRAD has no competitor in terms of the size of data sets that it can compress (extending to many hundreds of gigabytes) and, even for smaller data sets, the results are competitive compared to alternatives; as an example, 39 S. cerevisiae genomes compressed to 0.25 bits per base.
The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

PubMed Central

Ohno, S; Epplen, J T

1983-01-01

Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
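The link between a tetranucleotide repeat and a tetrapeptide-periodic product follows from the repeat length: a 4-nt unit and a 3-nt codon realign every 12 nt, that is, every four codons, in each reading frame. The sketch below illustrates this under loudly stated assumptions: the repeat unit `CTGG` is hypothetical (the abstract does not give the mouse sequence), the standard genetic code is used rather than the primitive mitochondria-like code discussed by the authors, and all function names are illustrative.

```python
# Hypothetical repeat unit chosen for illustration only; not from the paper.
UNIT = "CTGG"
COPIES = 6

# Standard genetic code built from the conventional TCAG ordering.
# (The abstract discusses a primitive mitochondria-like code instead.)
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_frame(dna: str, frame: int) -> str:
    """Translate complete codons starting at offset `frame` (0, 1 or 2)."""
    sub = dna[frame:]
    usable = len(sub) - len(sub) % 3
    return "".join(CODON_TABLE[sub[i:i + 3]] for i in range(0, usable, 3))


def has_period(protein: str, period: int) -> bool:
    """True if every residue equals the residue `period` positions later."""
    return all(protein[i] == protein[i + period]
               for i in range(len(protein) - period))


repeat_tract = UNIT * COPIES
for frame in range(3):
    product = translate_frame(repeat_tract, frame)
    print(frame, product, has_period(product, 4))
```

With this unit, the three frames give the same four-residue cycle starting at different positions, and none contains a stop codon, so each frame can in principle be read through the whole tract.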
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

USDA-ARS's Scientific Manuscript database

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres consist of megabase-scale arrays of tandem repeats. The true prevalence of centromere tandem repeats, and whether they exhibit conserved seque...
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
The CentO satellite confers translational and rotational phasing on cenH3 nucleosomes in rice centromeres.

PubMed

Zhang, Tao; Talbert, Paul B; Zhang, Wenli; Wu, Yufeng; Yang, Zujun; Henikoff, Jorja G; Henikoff, Steven; Jiang, Jiming

2013-12-10

Plant and animal centromeres comprise megabases of highly repeated satellite sequences, yet centromere function can be specified epigenetically on single-copy DNA by the presence of nucleosomes containing a centromere-specific variant of histone H3 (cenH3). We determined the positions of cenH3 nucleosomes in rice (Oryza sativa), which has centromeres composed of both the 155-bp CentO satellite repeat and single-copy non-CentO sequences. We find that cenH3 nucleosomes protect 90-100 bp of DNA from micrococcal nuclease digestion, sufficient for only a single wrap of DNA around the cenH3 nucleosome core. cenH3 nucleosomes are translationally phased with 155-bp periodicity on CentO repeats, but not on non-CentO sequences. CentO repeats have an ∼10-bp periodicity in WW dinucleotides and in micrococcal nuclease cleavage, providing evidence for rotational phasing of cenH3 nucleosomes on CentO and suggesting that satellites evolve for translational and rotational stabilization of centromeric nucleosomes.
Fingerprinting of Cyanobacteria Based on PCR with Primers Derived from Short and Long Tandemly Repeated Repetitive Sequences

PubMed Central

Rasmussen, Ulla; Svenning, Mette M.

1998-01-01

The presence of repeated DNA (short tandemly repeated repetitive [STRR] and long tandemly repeated repetitive [LTRR]) sequences in the genome of cyanobacteria was used to develop a fingerprinting method for symbiotic and free-living isolates. Primers corresponding to the STRR and LTRR sequences were used in the PCR, resulting in a method which generates specific fingerprints for individual isolates. The method was useful both with purified DNA and with intact cyanobacterial filaments or cells as templates for the PCR. Twenty-three Nostoc isolates from a total of 35 were symbiotic isolates from the angiosperm Gunnera species, including isolates from the same Gunnera species as well as from different species. The results show a genetic similarity among isolates from different Gunnera species as well as a genetic heterogeneity among isolates from the same Gunnera species. Isolates which have been postulated to be closely related or identical revealed similar results by the PCR method, indicating that the technique is useful for clustering of even closely related strains. The method was applied to nonheterocystous cyanobacteria from which a fingerprint pattern was obtained. PMID:16349487
Similarities in the chromosomal distribution of AG and AC repeats within and between Drosophila, human and barley chromosomes.

PubMed

Cuadrado, A; Jouve, N

2007-01-01

Two simple sequence repeats (SSRs), AG and AC, were mapped directly in the metaphase chromosomes of man and barley (Hordeum vulgare L.), and in the metaphase and polytene chromosomes of Drosophila melanogaster. To this end, synthetic oligonucleotides corresponding to (AG)(12) and (AC)(8) were labelled by the random primer technique and used as probes in fluorescent in situ hybridisation (FISH) under high stringency and strict washing conditions. The distribution and intensity of the signals for the repeat sequences were found to be characteristic of the chromosomes and genomes of the three species analysed. The AC repeat sites were uniformly dispersed along the euchromatic segments of all three genomes; in fact, they were largely excluded from the heterochromatin. The Drosophila genome showed a high density of AC sequences on the X chromosome in both mitotic and polytene nuclei. In contrast, the AG repeats were associated with the euchromatic regions of the polytene chromosomes (and in high density on the X chromosome), but were only seen in specific heterochromatic regions in the mitotic chromosomes of all three species. In Drosophila, the AG repeats were exclusively distributed on the tips of the Y chromosome and near the centromere on both arms of chromosome 2. In barley and man, AG repeats were associated with the centromeres (of all chromosomes) and nucleolar organizer regions, respectively. The conserved chromosome distribution of AC within and between these three phylogenetically distant species, and the association of AG in specific chromosome regions with structural or functional properties, suggests that long clusters of these repeats may have some, as yet unknown, role. Copyright (c) 2007 S. Karger AG, Basel.
Clock gene polymorphism and scheduling of migration: a geolocator study of the barn swallow Hirundo rustica

PubMed Central

Bazzi, Gaia; Ambrosini, Roberto; Caprioli, Manuela; Costanzo, Alessandra; Liechti, Felix; Gatti, Emanuele; Gianfranceschi, Luca; Podofillini, Stefano; Romano, Andrea; Romano, Maria; Scandolara, Chiara; Saino, Nicola; Rubolini, Diego

2015-01-01

Circannual rhythms often rely on endogenous seasonal photoperiodic timers involving ‘clock’ genes, and Clock gene polymorphism has been associated with variation in phenology in some bird species. In the long-distance migratory barn swallow Hirundo rustica, individuals bearing the rare Clock allele with the largest number of C-terminal polyglutamine repeats found in this species (Q8) show delayed reproduction and moult later. We explored the association between Clock polymorphism and migration scheduling, as gauged by light-level geolocators, in two barn swallow populations (Switzerland; Po Plain, Italy). Genetic polymorphism was low: 91% of the 64 individuals tracked year-round were Q7/Q7 homozygotes. We compared the phenology of the rare genotypes with the phenotypic distribution of Q7/Q7 homozygotes within each population. In Switzerland, compared to Q7/Q7, two Q6/Q7 males departed earlier from the wintering grounds and arrived earlier at their colony in spring, while a single Q7/Q8 female was delayed for both phenophases. On the other hand, in the Po Plain, three Q6/Q7 individuals had a similar phenology compared to Q7/Q7. The Swiss data are suggestive of a role of genetic polymorphism at a candidate phenological gene in shaping migration traits, and support the idea that Clock polymorphism underlies phenological variation in birds. PMID:26197782
AUTEN-67 (Autophagy Enhancer-67) Hampers the Progression of Neurodegenerative Symptoms in a Drosophila model of Huntington's Disease.

PubMed

Billes, Viktor; Kovács, Tibor; Hotzi, Bernadette; Manzéger, Anna; Tagscherer, Kinga; Komlós, Marcell; Tarnóci, Anna; Pádár, Zsolt; Erdős, Attila; Bjelik, Annamaria; Legradi, Adam; Gulya, Károly; Gulyás, Balázs; Vellai, Tibor

2016-05-07

Autophagy, a lysosome-mediated self-degradation process of eukaryotic cells, serves as a main route for the elimination of cellular damage [1-3]. Such damage includes aggregated, oxidized or misfolded proteins whose accumulation can cause various neurodegenerative pathologies, including Huntington's disease (HD). Here we examined whether enhanced autophagic activity can alleviate neuropathological features in a Drosophila model of HD (the transgenic animals express a human mutant Huntingtin protein with a long polyglutamine repeat, 128Q). We have recently identified an autophagy-enhancing small molecule, AUTEN-67 (autophagy enhancer 67), with potent neuroprotective effects [4]. AUTEN-67 was applied to induce autophagic activity in the HD model used in this study. We showed that AUTEN-67 treatment interferes with the progressive accumulation of ubiquitinated proteins in the brain of Drosophila transgenic for the pathological 128Q form of human Huntingtin protein. The compound significantly improved the climbing ability and moderately extended the mean life span of these flies. Furthermore, brain tissue samples from human patients diagnosed with HD displayed increased levels of the autophagy substrate SQSTM1/p62 protein, as compared with controls. These results imply that AUTEN-67 impedes the progression of neurodegenerative symptoms characterizing HD, and that autophagy is a promising therapeutic target for treating this pathology. In humans, AUTEN-67 may have the potential to delay the onset and decrease the severity of HD.
Glial S100B Positive Vacuoles In Purkinje Cells: Earliest Morphological Abnormality In SCA1 Transgenic Mice

PubMed Central

VIG, Parminder J.S.; LOPEZ, Maripar E.; WEI, Jinrong; D’SOUZA, David R.; SUBRAMONY, SH; HENEGAR, Jeffrey; FRATKIN, Jonathan D.

2007-01-01

Spinocerebellar ataxia-1 (SCA1) is caused by the expansion of a polyglutamine repeat within the disease protein, ataxin-1. The overexpression of mutant ataxin-1 in SCA1 transgenic mice results in the formation of cytoplasmic vacuoles in Purkinje neurons (PKN) of the cerebellum. PKN are closely associated with neighboring Bergmann glia. To elucidate the role of Bergmann glia in SCA1 pathogenesis, cerebellar tissue from 7-day- to 6-week-old SCA1 transgenic and wild-type mice was used. We observed that Bergmann glial S100B protein is localized to the cytoplasmic vacuoles in SCA1 PKN. These S100B-positive cytoplasmic vacuoles began appearing well before the onset of behavioral abnormalities, and were negative for other glial and PKN marker proteins. Electron micrographs revealed that vacuoles have a double membrane. In the vacuoles, S100B colocalized with receptors of advanced glycation end-products (RAGE), and S100B co-immunoprecipitated with cerebellar RAGE. In SCA1 PKN cultures, exogenous S100B protein interacted with the PKN membranes and was internalized. These data suggest that glial S100B, though extrinsic to PKN, is sequestered into cytoplasmic vacuoles in SCA1 mice at early postnatal ages. Further, S100B may be binding to RAGE on Purkinje cell membranes before these membranes are internalized. PMID:18176630
Novel VCP mutations in inclusion body myopathy associated with Paget disease of bone and frontotemporal dementia.

PubMed

Watts, G D J; Thomasova, D; Ramdeen, S K; Fulchiero, E C; Mehta, S G; Drachman, D A; Weihl, C C; Jamrozik, Z; Kwiecinski, H; Kaminska, A; Kimonis, V E

2007-11-01

Inclusion body myopathy associated with Paget disease of bone and frontotemporal dementia (IBMPFD, OMIM 167320) has recently been attributed to eight missense mutations in valosin-containing protein (VCP). We report novel VCP mutations N387H and L198W in six individuals from two families who presented with proximal muscle weakness at a mean age of diagnosis of 40 years, most losing the ability to walk within a few years of onset. Electromyographic studies in four individuals were suggestive of 'myopathic' changes, and a neuropathic pattern was identified in one individual in family 1. Muscle biopsy in four individuals showed myopathic changes characterized by variable fiber size, two individuals showing rimmed vacuoles and IBM-type cytoplasmic inclusions in muscle fibers, and electron microscopy in one individual revealing abundant intranuclear inclusions. Frontotemporal dementia associated with characteristic behavioral changes including short-term memory loss, language difficulty, and antisocial behavior was observed in three individuals at a mean age of 47 years. Detailed brain pathology in one individual showed cortical degenerative changes, most severe in the temporal lobe and hippocampus. Abundant ubiquitin-positive tau-, alpha-synuclein-, polyglutamine repeat-negative neuronal intranuclear inclusions and only rare intracytoplasmic VCP-positive inclusions were seen. These new mutations may cause structural changes in VCP and provide some insight into the functional effects of pathogenic mutations.
Short Tandem Repeat DNA Internet Database

National Institute of Standards and Technology Data Gateway

SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access) Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.
