trinucleotide repeat sequence: Topics by Science.gov

Sample records for trinucleotide repeat sequence

Unitary circular code motifs in genomes of eukaryotes.

PubMed

El Soufi, Karim; Michel, Christian J

A set X of 20 trinucleotides was identified in genes of bacteria, eukaryotes, plasmids and viruses, which has on average the highest occurrence in reading frame compared to its two shifted frames (Michel, 2015; Arquès and Michel, 1996). This set X has an interesting mathematical property as X is a circular code (Arquès and Michel, 1996). Thus, the motifs from this circular code X, called X motifs, have the property to always retrieve, synchronize and maintain the reading frame in genes. The origin of this circular code X in genes has been an open problem since its discovery in 1996. Here, we first show that the unitary circular codes (UCC), i.e. sets of one word, allow the generation of unitary circular code motifs (UCC motifs), i.e. a concatenation of the same motif (simple repeats) leading to low complexity DNA. Three classes of UCC motifs are studied here: repeated dinucleotides (D+ motifs), repeated trinucleotides (T+ motifs) and repeated tetranucleotides (T+ motifs). Thus, the D+, T+ and T+ motifs allow a frame modulo 2, modulo 3 and modulo 4, respectively, to be retrieved, synchronized and maintained, together with their shifted frames (1 modulo 2; 1 and 2 modulo 3; 1, 2 and 3 modulo 4 according to the C2, C3 and C4 properties, respectively) in the DNA sequences. The statistical distribution of the D+, T+ and T+ motifs is analyzed in the genomes of eukaryotes. A UCC motif and its complementary UCC motif have the same distribution in the eukaryotic genomes. Furthermore, a UCC motif and its complementary UCC motif have increasing occurrences contrary to their number of hydrogen bonds, very significant with the T+ motifs. The longest D+, T+ and T+ motifs in the studied eukaryotic genomes are also given. Surprisingly, a scarcity of repeated trinucleotides (T+ motifs) in the large eukaryotic genomes is observed compared to the D+ and T+ motifs. This result has been investigated and may be explained by two outcomes.
Repeated trinucleotides (T + motifs) are identified in the X motifs of low composition (cardinality less than 10) in the genomes of eukaryotes. Furthermore, identical trinucleotide pairs of the circular code X are preferentially used in the gene sequences of eukaryotes. These two results suggest that the unitary circular codes of trinucleotides may have been involved in the formation of the trinucleotide circular code X. Indeed, repeated trinucleotides in the X motifs in the genomes of eukaryotes may represent an intermediary evolution from repeated trinucleotides of cardinality 1 (T + motifs) in the genomes of eukaryotes up to the X motifs of cardinality 20 in the gene sequences of eukaryotes. Copyright © 2017 Elsevier B.V. All rights reserved.
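The simple repeats described in this abstract (a single word of length 2, 3 or 4 concatenated with itself) can be located with a short scan. The following is a minimal sketch and not the authors' method: the function names, the `min_copies` threshold and the example sequence are illustrative assumptions, and the sketch assumes that only primitive words (words that are not themselves a repetition of a shorter word, so `AAA` and `ACAC` are excluded) count as repeat units.

```python
import re


def is_primitive(word: str) -> bool:
    """Return True if word is not a repetition of a shorter word (e.g. 'ACAC' is not primitive)."""
    n = len(word)
    return not any(n % d == 0 and word[:d] * (n // d) == word for d in range(1, n))


def longest_repeats(seq: str, k: int, min_copies: int = 2) -> dict:
    """Map each primitive k-mer to the largest number of tandem copies found in seq.

    Hypothetical helper for illustration; k = 2, 3 or 4 corresponds to repeated
    di-, tri- and tetranucleotides.
    """
    best = {}
    # The lookahead (?=...) lets matches overlap, so every start position
    # (and therefore every shifted frame of a repeat) is examined.
    pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, min_copies - 1))
    for match in pattern.finditer(seq.upper()):
        run, unit = match.group(1), match.group(2)
        if is_primitive(unit):
            best[unit] = max(best.get(unit, 0), len(run) // k)
    return best


if __name__ == "__main__":
    example = "TTCAGCAGCAGCAGTTACACACAC"  # made-up sequence, not real data
    for k in (2, 3, 4):
        print(k, longest_repeats(example, k))
```

Because the scan starts at every position, a run of `CAG` copies is also reported from its shifted frames as shorter runs of `AGC` and `GCA`.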
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we characterize the distribution of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Drastic stability change of X-X mismatch in d(CXG) trinucleotide repeat disorders under molecular crowding condition.

PubMed

Teng, Ye; Pramanik, Smritimoy; Tateishi-Karimata, Hisae; Ohyama, Tatsuya; Sugimoto, Naoki

2018-02-05

The trinucleotide repeat d(CXG) (X = A, C, G or T) is the most common sequence causing repeat expansion disorders. The formation of non-canonical structures, such as hairpin structures with X-X mismatches, has been proposed to affect gene expression and regulation, which are important in pathological studies of these devastating neurological diseases. However, little information is available regarding the thermodynamics of the repeat sequence under crowded cellular conditions where many non-canonical structures such as G-quadruplexes are highly stabilized, while duplexes are destabilised. In this study, we investigated the different stabilities of X-X mismatches in the context of internal d(CXG) self-complementary sequences in an environment with a high concentration of cosolutes to mimic the crowding conditions in cells. The stabilities of full-matched duplexes and duplexes with A-A, G-G, and T-T mismatched base pairs under molecular crowding conditions were notably decreased compared to under dilute conditions. However, the DNA duplex with a C-C mismatched base pair was only slightly destabilised. Investigating different stabilities of X-X mismatches in d(CXG) sequences is important for improving our understanding of the formation and transition of multiple non-canonical structures in trinucleotide repeat diseases, and may provide insights for pathological studies and drug development. Copyright © 2018 Elsevier Inc. All rights reserved.
Non-radioactive detection of trinucleotide repeat size variability.

PubMed

Tomé, Stéphanie; Nicole, Annie; Gomes-Pereira, Mario; Gourdon, Genevieve

2014-03-06

Many human diseases are associated with the abnormal expansion of unstable trinucleotide repeat sequences. The mechanisms of trinucleotide repeat size mutation have not been fully dissected, and their understanding must be grounded on the detailed analysis of repeat size distributions in human tissues and animal models. Small-pool PCR (SP-PCR) is a robust, highly sensitive and efficient PCR-based approach to assess the levels of repeat size variation, providing both quantitative and qualitative data. The method relies on the amplification of a very low number of DNA molecules, through successive dilution of a stock genomic DNA solution. Radioactive Southern blot hybridization is sensitive enough to detect SP-PCR products derived from single template molecules, separated by agarose gel electrophoresis and transferred onto DNA membranes. We describe a variation of the detection method that uses digoxigenin-labelled locked nucleic acid probes. This protocol keeps the sensitivity of the original method, while eliminating the health risks associated with the manipulation of radiolabelled probes, and the burden associated with their regulation, manipulation and waste disposal.
Pms2 Suppresses Large Expansions of the (GAA·TTC)n Sequence in Neuronal Tissues

PubMed Central

Bourn, Rebecka L.; De Biase, Irene; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Al-Mahdawi, Sahar; Pook, Mark A.; Bidichandani, Sanjay I.

2012-01-01

Expanded trinucleotide repeat sequences are the cause of several inherited neurodegenerative diseases. Disease pathogenesis is correlated with several features of somatic instability of these sequences, including further large expansions in postmitotic tissues. The presence of somatic expansions in postmitotic tissues is consistent with DNA repair being a major determinant of somatic instability. Indeed, proteins in the mismatch repair (MMR) pathway are required for instability of the expanded (CAG·CTG)n sequence, likely via recognition of intrastrand hairpins by MutSβ. It is not clear if or how MMR would affect instability of disease-causing expanded trinucleotide repeat sequences that adopt secondary structures other than hairpins, such as the triplex/R-loop forming (GAA·TTC)n sequence that causes Friedreich ataxia. We analyzed somatic instability in transgenic mice that carry an expanded (GAA·TTC)n sequence in the context of the human FXN locus and lack the individual MMR proteins Msh2, Msh6 or Pms2. The absence of Msh2 or Msh6 resulted in a dramatic reduction in somatic mutations, indicating that mammalian MMR promotes instability of the (GAA·TTC)n sequence via MutSα. The absence of Pms2 resulted in increased accumulation of large expansions in the nervous system (cerebellum, cerebrum, and dorsal root ganglia) but not in non-neuronal tissues (heart and kidney), without affecting the prevalence of contractions. Pms2 suppressed large expansions specifically in tissues showing MutSα-dependent somatic instability, suggesting that they may act on the same lesion or structure associated with the expanded (GAA·TTC)n sequence. We conclude that Pms2 specifically suppresses large expansions of a pathogenic trinucleotide repeat sequence in neuronal tissues, possibly acting independently of the canonical MMR pathway. PMID:23071719
In situ optical sequencing and structure analysis of a trinucleotide repeat genome region by localization microscopy after specific COMBO-FISH nano-probing

NASA Astrophysics Data System (ADS)

Stuhlmüller, M.; Schwarz-Finsterle, J.; Fey, E.; Lux, J.; Bach, M.; Cremer, C.; Hinderhofer, K.; Hausmann, M.; Hildenbrand, G.

2015-10-01

Trinucleotide repeat expansions (such as (CGG)n) of chromatin in the genome of cell nuclei can cause neurological disorders such as the Fragile-X syndrome. Until now, it has not been clearly understood how these expansions develop during cell proliferation. Therefore in situ investigations of chromatin structures on the nanoscale are required to better understand supra-molecular mechanisms on the single cell level. By super-resolution localization microscopy (Spectral Position Determination Microscopy; SPDM) in combination with nano-probing using COMBO-FISH (COMBinatorial Oligonucleotide FISH), novel insights into the nano-architecture of the genome will become possible. The native spatial structure of trinucleotide repeat expansion genome regions was analysed and optical sequencing of repetitive units was performed within 3D-conserved nuclei using SPDM after COMBO-FISH. We analysed a (CGG)n-expansion region inside the 5' untranslated region of the FMR1 gene. The number of CGG repeats for a full mutation causing the Fragile-X syndrome was found and also verified by Southern blot. The FMR1 promoter region was condensed similarly to a centromeric region, whereas the arrangement of the probes labelling the expansion region seemed to indicate a loop-like nano-structure. These results for the first time demonstrate that in situ chromatin structure measurements on the nanoscale are feasible. With further methodological progress it will become possible to estimate the state of trinucleotide repeat mutations in detail and to determine the associated chromatin strand structural changes on the single cell level. In general, the application of the described approach to any genome region will lead to new insights into genome nano-architecture and open new avenues for understanding mechanisms and their relevance in the development of hereditary diseases.
Evidence for anticipation in schizophrenia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bassett, A.S.; Honer, W.G.

Anticipation, or increasing severity of a disorder across successive generations, is a genetic phenomenon with an identified molecular mechanism: expansion of unstable trinucleotide repeat sequences. This study examined anticipation in familial schizophrenia. Three generations of siblines from the affected side of families selected for unilineal, autosomal dominant-like inheritance of schizophrenia were studied (n = 186). Across generations more subjects were hospitalized with psychotic illness (P<.0001), at progressively earlier ages (P<.0001), and with increasing severity of illness (P<.0003). The results indicate that anticipation is present in familial schizophrenia. These findings support both an active search for unstable trinucleotide repeat sequences in schizophrenia and reconsideration of the genetic model used for linkage studies in this disorder. 32 refs., 2 figs., 1 tab.
Trinucleotide repeat length and progression of illness in Huntington's disease.

PubMed

Kieburtz, K; MacDonald, M; Shih, C; Feigin, A; Steinberg, K; Bordwell, K; Zimmerman, C; Srinidhi, J; Sotack, J; Gusella, J

1994-11-01

The genetic defect causing Huntington's disease (HD) has been identified as an unstable expansion of a trinucleotide (CAG) repeat sequence within the coding region of the IT15 gene on chromosome 4. In 50 patients with manifest HD who were evaluated prospectively and uniformly, we examined the relationship between the extent of the DNA expansion and the rate of illness progression. Although the length of CAG repeats showed a strong inverse correlation with the age at onset of HD, there was no such relationship between the number of CAG repeats and the rate of clinical decline. These findings suggest that the CAG repeat length may influence or trigger the onset of HD, but other genetic, neurobiological, or environmental factors contribute to the progression of illness and the underlying pace of neuronal degeneration.
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

PubMed Central

Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

2011-01-01

Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis

PubMed Central

Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting

2013-01-01

Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using selectively amplified microsatellite assay, 86 sequences were generated from pineapple genomic library. 91 (96.8%) of the 94 Simple Sequence Repeat (SSR) loci were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, respectively), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found to contain 1110 SSR loci (217 of them contained more than one SSR locus). The frequency of SSRs in pineapple EST sequences is 1 SSR per 3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant types of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed for 30 randomly selected sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen pairs of primers obtained by one or the other of the two methods above that showed polymorphism were selected to carry out germplasm genetic diversity analysis for 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and they can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced; the corresponding repeat loci were found, and locus mutations are mainly in copy number of repeats and base mutations in the flanking region. PMID:24024187
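Motif labels such as AC/GT, GA/TC and AGC/GCT pair a repeat unit with its reverse complement. The sketch below shows one common way to group such labels before computing shares; it is an illustration only, not the procedure used in the study. The function names are hypothetical, and the choice of the alphabetically smallest rotation as the class representative is an assumption. Only the counts 39 and 52 in the usage example come from the abstract.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif: str) -> str:
    """Return the reverse complement of a DNA motif, e.g. 'AGC' -> 'GCT'."""
    return motif.upper().translate(_COMPLEMENT)[::-1]


def canonical_motif(motif: str) -> str:
    """Pick one representative for a motif, its rotations and their reverse complements.

    Assumption for this sketch: the representative is the alphabetically
    smallest of all rotations of the motif and of its reverse complement.
    """
    motif = motif.upper()
    words = (motif, reverse_complement(motif))
    rotations = [w[i:] + w[:i] for w in words for i in range(len(w))]
    return min(rotations)


def motif_shares(motifs) -> dict:
    """Percentage of loci per canonical motif class."""
    counts = Counter(canonical_motif(m) for m in motifs)
    total = sum(counts.values())
    return {motif: 100.0 * count / total for motif, count in counts.items()}


if __name__ == "__main__":
    # 39 AC/GT and 52 GA/TC dinucleotide loci, as reported in the abstract
    shares = motif_shares(["AC"] * 39 + ["GA"] * 52)
    print({m: round(p, 1) for m, p in shares.items()})
```

Under this convention GA/TC and AG/CT fall into the same class, because a tandem repeat of GA read one base later is a repeat of AG.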
Trinucleotide repeat length and progression of illness in Huntington's disease.

PubMed Central

Kieburtz, K; MacDonald, M; Shih, C; Feigin, A; Steinberg, K; Bordwell, K; Zimmerman, C; Srinidhi, J; Sotack, J; Gusella, J

1994-01-01

The genetic defect causing Huntington's disease (HD) has been identified as an unstable expansion of a trinucleotide (CAG) repeat sequence within the coding region of the IT15 gene on chromosome 4. In 50 patients with manifest HD who were evaluated prospectively and uniformly, we examined the relationship between the extent of the DNA expansion and the rate of illness progression. Although the length of CAG repeats showed a strong inverse correlation with the age at onset of HD, there was no such relationship between the number of CAG repeats and the rate of clinical decline. These findings suggest that the CAG repeat length may influence or trigger the onset of HD, but other genetic, neurobiological, or environmental factors contribute to the progression of illness and the underlying pace of neuronal degeneration. PMID:7853373
ATXN2 trinucleotide repeat length correlates with risk of ALS.

PubMed

Sproviero, William; Shatunov, Aleksey; Stahl, Daniel; Shoai, Maryam; van Rheenen, Wouter; Jones, Ashley R; Al-Sarraj, Safa; Andersen, Peter M; Bonini, Nancy M; Conforti, Francesca L; Van Damme, Philip; Daoud, Hussein; Del Mar Amador, Maria; Fogh, Isabella; Forzan, Monica; Gaastra, Ben; Gellera, Cinzia; Gitler, Aaron D; Hardy, John; Fratta, Pietro; La Bella, Vincenzo; Le Ber, Isabelle; Van Langenhove, Tim; Lattante, Serena; Lee, Yi-Chung; Malaspina, Andrea; Meininger, Vincent; Millecamps, Stéphanie; Orrell, Richard; Rademakers, Rosa; Robberecht, Wim; Rouleau, Guy; Ross, Owen A; Salachas, Francois; Sidle, Katie; Smith, Bradley N; Soong, Bing-Wen; Sorarù, Gianni; Stevanin, Giovanni; Kabashi, Edor; Troakes, Claire; van Broeckhoven, Christine; Veldink, Jan H; van den Berg, Leonard H; Shaw, Christopher E; Powell, John F; Al-Chalabi, Ammar

2017-03-01

We investigated a CAG trinucleotide repeat expansion in the ATXN2 gene in amyotrophic lateral sclerosis (ALS). Two new case-control studies, a British dataset of 1474 ALS cases and 567 controls, and a Dutch dataset of 1328 ALS cases and 691 controls were analyzed. In addition, to increase power, we systematically searched PubMed for case-control studies published after 1 August 2010 that investigated the association between ATXN2 intermediate repeats and ALS. We conducted a meta-analysis of the new and existing studies for the relative risks of ATXN2 intermediate repeat alleles of between 24 and 34 CAG trinucleotide repeats and ALS. There was an overall increased risk of ALS for those carrying intermediate sized trinucleotide repeat alleles (odds ratio 3.06 [95% confidence interval 2.37-3.94]; p = 6 × 10⁻¹⁸), with an exponential relationship between repeat length and ALS risk for alleles of 29-32 repeats (R² = 0.91, p = 0.0002). No relationship was seen for repeat length and age of onset or survival. In contrast to trinucleotide repeat diseases, intermediate ATXN2 trinucleotide repeat expansion in ALS does not predict age of onset but does predict disease risk. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
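For readers unfamiliar with the statistic quoted above, an odds ratio with a 95% confidence interval can be computed from a single 2 × 2 case-control table as sketched below. This is a generic textbook calculation (the logit, or Woolf, interval) and not the meta-analytic procedure of the study. The function name and the carrier counts are hypothetical, since the abstract does not report carrier counts.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers, control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% confidence interval from a 2 x 2 table.

    Uses the logit (Woolf) method: the standard error of log(OR) is the square
    root of the summed reciprocals of the four cell counts.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)
    se_log_or = math.sqrt(
        1 / case_carriers + 1 / case_noncarriers + 1 / control_carriers + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts for illustration only; they are not taken from the study.
    print(odds_ratio_ci(40, 960, 20, 980))
```

An interval whose lower bound lies above 1 indicates that carriers are over-represented among cases at the chosen confidence level.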
Myotonin protein-kinase [AGC]n trinucleotide repeat in seven nonhuman primates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Novelli, G.; Sineo, L.; Pontieri, E.

Myotonic dystrophy (DM) is due to a genomic instability of a trinucleotide [AGC]n motif, located at the 3′ UTR region of a protein-kinase gene (myotonin protein kinase, MT-PK). The [AGC] repeat is meiotically and mitotically unstable, and it is directly related to the manifestations of the disorder. Although a gene dosage effect of the MT-PK has been demonstrated in DM muscle, the mechanism(s) by which the intragenic repeat expansion leads to disease is largely unknown. This non-standard mutational event could reflect an evolutionary mechanism widespread among animal genomes. We have isolated and sequenced the complete 3′ UTR region of the MT-PK gene in seven primates (macaque, orangutan, gorilla, chimpanzee, gibbon, owl monkey, saimiri), and examined by comparative sequence nucleotide analysis the [AGC]n intragenic repeat and the surrounding nucleotides. The genomic organization, including the [AGC]n repeat structure, was conserved in all examined species, excluding the gibbon (Hylobates agilis), in which the [AGC]n upstream sequence (GGAA) is replaced by a GA dinucleotide. The number of [AGC]n in the examined species ranged between 7 (gorilla) and 13 repeats (owl monkeys), with a polymorphism informative content (PIC) similar to that observed in humans. These results indicate that the 3′ UTR [AGC] repeat within the MT-PK gene is evolutionarily conserved, supporting that this region has important regulatory functions.
Msh2-Msh3 Interferes with Okazaki Fragment Processing to Promote Trinucleotide Repeat Expansions

PubMed Central

Kantartzis, Athena; Williams, Gregory M.; Balakrishnan, Lata; Roberts, Rick L.; Surtees, Jennifer A.; Bambara, Robert A.

2012-01-01

Trinucleotide repeat (TNR) expansions are the underlying cause of more than forty neurodegenerative and neuromuscular diseases, including myotonic dystrophy and Huntington’s disease. Although genetic evidence has attributed the cause of these diseases to errors in DNA replication and/or repair, clear molecular mechanisms have not been described. We have focused on the role of the mismatch repair complex Msh2-Msh3 in promoting TNR expansions. We demonstrate that Msh2-Msh3 promotes CTG and CAG repeat expansions in vivo in Saccharomyces cerevisiae. We further provide biochemical evidence that Msh2-Msh3 directly interferes with normal Okazaki fragment processing by flap endonuclease1 (Rad27) and DNA Ligase I (Cdc9) in the presence of TNR sequences, thereby producing small, incremental expansion events. We believe that this is the first mechanistic evidence showing the interplay of replication and repair proteins in the expansion of sequences during lagging strand DNA replication. PMID:22938864
Msh2-Msh3 interferes with Okazaki fragment processing to promote trinucleotide repeat expansions.

PubMed

Kantartzis, Athena; Williams, Gregory M; Balakrishnan, Lata; Roberts, Rick L; Surtees, Jennifer A; Bambara, Robert A

2012-08-30

Trinucleotide repeat (TNR) expansions are the underlying cause of more than 40 neurodegenerative and neuromuscular diseases, including myotonic dystrophy and Huntington's disease. Although genetic evidence points to errors in DNA replication and/or repair as the cause of these diseases, clear molecular mechanisms have not been described. Here, we focused on the role of the mismatch repair complex Msh2-Msh3 in promoting TNR expansions. We demonstrate that Msh2-Msh3 promotes CTG and CAG repeat expansions in vivo in Saccharomyces cerevisiae. Furthermore, we provide biochemical evidence that Msh2-Msh3 directly interferes with normal Okazaki fragment processing by flap endonuclease1 (Rad27) and DNA ligase I (Cdc9) in the presence of TNR sequences, thereby producing small, incremental expansion events. We believe that this is the first mechanistic evidence showing the interplay of replication and repair proteins in the expansion of sequences during lagging-strand DNA replication. Copyright © 2012 The Authors. Published by Elsevier Inc. All rights reserved.
Isolation and characterization of microsatellite markers in Fraser fir (Abies fraseri)

Treesearch

S.A. Josserand; K.M. Potter; G. Johnson; J.A. Bowen; J. Frampton; C.D. Nelson

2006-01-01

We describe the isolation and characterization of 14 microsatellite loci from Fraser fir (Abies fraseri). These markers originated from cloned inserts enriched for DNA sequences containing tandem di- and tri-nucleotide repeats. In total, 36 clones were selected, sequenced and evaluated. Polymerase chain reaction (PCR) primers for 14 of these...
Thermodynamic stability of RNA structures formed by CNG trinucleotide repeats. Implication for prediction of RNA structure.

PubMed

Broda, Magdalena; Kierzek, Elzbieta; Gdaniec, Zofia; Kulinski, Tadeusz; Kierzek, Ryszard

2005-08-16

Trinucleotide repeat expansion diseases (TREDs) are correlated with elongation of CNG DNA and RNA repeats to pathological levels. This paper shows, for the first time, complete data concerning thermodynamic stabilities of RNA with CNG trinucleotide repeats. Our studies include the stability of oligoribonucleotides composed of two to seven CAG, CCG, CGG, and CUG repeats. The thermodynamic parameters of helix propagation correlated with the presence of multiple N-N mismatches within CNG RNA duplexes were also determined. Moreover, the total stability of CNG RNA hairpins, as well as the contribution of trinucleotide repeats placed only in the stem or loop regions, was evaluated. The improved thermodynamic parameters allow much more accurate prediction of the thermodynamic stabilities and structures of CNG RNAs.
A codon-usage variant in the (GGN)ₙ trinucleotide polymorphism of the androgen receptor gene as an aid in the prenatal diagnosis of ambiguous genitalia due to partial androgen insensitivity

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lumbroso, R.; Vasiliou, M.; Beitel, L.K.

1994-09-01

Exon 1 at the X-linked androgen receptor (AR) locus encodes an N-terminal modulatory domain that contains two large homopolyamino acid tracts: (CAG; glutamine; Gln)₁₁₋₃₃ and (GGN; glycine; Gly)₁₅₋₂₇. Certain AR mutations cause partial androgen insensitivity (PAI) with frank genital ambiguity that may engender appreciable parental anxiety and patient morbidity. If the AR mutation in a PAI family is unknown, the AR's intragenic trinucleotide repeat polymorphisms may be used for prenatal diagnosis. However, intergenerational instability of repeat-size may be worrisome, particularly when the informative alleles differ by only a few repeats. Here, we report the discovery of a codon-usage (silent substitution) variant in the GGN repeat, and describe its use as a source of complementary information for prenatal diagnosis. The standard sense sequence of the (GGN)ₙ tract is (GGT)₃GGG(GGT)₂(GGC)₉₋₂₁. On 4 of 27 X chromosomes we noted that the internal GGT sequence was expanded to 3 or 4 repeats. We used an internal (GGT)₄ repeat in a total (GGN)₂₄ tract together with a (CAG)₂₀ tract to distinguish an X chromosome with a mutant AR allele from another X chromosome, bearing a normal allele, that had an internal (GGT)₂ repeat in a total (GGN)₂₃ tract together with a (CAG)₂₁ tract. Subsequently, we found the base change leading to a pathogenic amino acid substitution (M779I) in codon 6 of the mutant AR gene in an affected maternal aunt and the fetus at risk. This confirmed the prenatal diagnosis based on the intragenic trinucleotide repeat polymorphisms, and it strengthened the prediction of external genital ambiguity using our previous experience with M779I in another family.
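The codon-usage layout of a GGN tract, such as the standard form given above, can be summarised by run-length encoding its codons. The sketch below illustrates that idea and is not the authors' laboratory procedure: the function names are hypothetical, and the two example tracts are constructed from the repeat numbers in the abstract under the assumption that the rest of each tract follows the standard layout (which implies 16 and 17 GGC codons, figures not stated in the abstract).

```python
from itertools import groupby


def codon_runs(tract: str) -> list:
    """Split a coding tract into codons and run-length encode consecutive identical codons."""
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    return [(codon, sum(1 for _ in group)) for codon, group in groupby(codons)]


def describe_tract(tract: str) -> tuple:
    """Return a compact layout string such as '(GGT)3GGG(GGT)2(GGC)17' and the total codon count."""
    runs = codon_runs(tract)
    layout = "".join(f"({codon}){n}" if n > 1 else codon for codon, n in runs)
    return layout, sum(n for _, n in runs)


if __name__ == "__main__":
    # Tracts assembled for illustration; the GGC counts are inferred, not reported.
    variant = "GGT" * 3 + "GGG" + "GGT" * 4 + "GGC" * 16   # internal (GGT)4, 24 codons in total
    standard = "GGT" * 3 + "GGG" + "GGT" * 2 + "GGC" * 17  # internal (GGT)2, 23 codons in total
    print(describe_tract(variant))
    print(describe_tract(standard))
```

Two alleles with similar total lengths then remain distinguishable by the internal GGT run, which is the kind of complementary information the abstract describes.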

Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
The reproductive outcome of female patients with myotonic dystrophy type 1 (DM1) undergoing PGD is not affected by the size of the expanded CTG repeat tract

PubMed Central

Seneca, Sara; De Rademaeker, Marjan; Sermon, Karen; De Rycke, Martine; De Vos, Michel; Haentjens, Patrick; Devroey, Paul; Liebaers, Ingeborg

2010-01-01

Purpose This study aims to analyze the relationship between trinucleotide repeat length and reproductive outcome in a large cohort of DM1 patients undergoing ICSI and PGD. Methods Prospective cohort study. The effect of trinucleotide repeat length on reproductive outcome per patient was analyzed using bivariate analysis (T-test) and multivariate analysis using Kaplan-Meier and Cox regression analysis. Results Between 1995 and 2005, 205 cycles of ICSI and PGD were carried out for DM1 in 78 couples. The number of trinucleotide repeats does not have an influence on reproductive outcome when adjusted for age, BMI, basal FSH values, parity, infertility status and male or female affected. Cox regression analysis indicates that cumulative live birth rate is not influenced by the number of trinucleotide repeats. The only factor with a significant effect is age (p < 0.05). Conclusion There is no evidence of an effect of trinucleotide repeat length on reproductive outcome in patients undergoing ICSI and PGD. PMID:20221684
Rate-determining Step of Flap Endonuclease 1 (FEN1) Reflects a Kinetic Bias against Long Flaps and Trinucleotide Repeat Sequences.

PubMed

Tarantino, Mary E; Bilotti, Katharina; Huang, Ji; Delaney, Sarah

2015-08-21

Flap endonuclease 1 (FEN1) is a structure-specific nuclease responsible for removing 5'-flaps formed during Okazaki fragment maturation and long patch base excision repair. In this work, we use rapid quench flow techniques to examine the rates of 5'-flap removal on DNA substrates of varying length and sequence. Of particular interest are flaps containing trinucleotide repeats (TNR), which have been proposed to affect FEN1 activity and cause genetic instability. We report that FEN1 processes substrates containing flaps of 30 nucleotides or fewer at comparable single-turnover rates. However, for flaps longer than 30 nucleotides, FEN1 kinetically discriminates substrates based on flap length and flap sequence. In particular, FEN1 removes flaps containing TNR sequences at a rate slower than mixed sequence flaps of the same length. Furthermore, multiple-turnover kinetic analysis reveals that the rate-determining step of FEN1 switches as a function of flap length from product release to chemistry (or a step prior to chemistry). These results provide a kinetic perspective on the role of FEN1 in DNA replication and repair and contribute to our understanding of FEN1 in mediating genetic instability of TNR sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
Spinocerebellar ataxia type 1 and Machado-Joseph disease: Incidence of CAG expansions among adult-onset ataxia patients from 311 families with dominant, recessive, or sporadic ataxia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ranum, L.P.W.; Gomez, C.; Orr, H.T.

1995-09-01

The ataxias are a complex group of diseases with both environmental and genetic causes. Among the autosomal dominant forms of ataxia the genes for two, spinocerebellar ataxia type 1 (SCA1) and Machado-Joseph disease (MJD), have been isolated. In both of these disorders the molecular basis of disease is the expansion of an unstable CAG trinucleotide repeat. To assess the frequency of the SCA1 and MJD trinucleotide repeat expansions among individuals diagnosed with ataxia, we have collected DNA from individuals representing 311 families with adult-onset ataxia of unknown etiology and screened these samples for trinucleotide repeat expansions within the SCA1 and MJD genes. Within this group there are 149 families with dominantly inherited ataxia. Of these, 3% have SCA1 trinucleotide repeat expansions, whereas 21% were positive for the MJD trinucleotide expansion. Thus, together SCA1 and MJD represent 24% of the autosomal dominant ataxias in our group, and the frequency of MJD is substantially greater than that of SCA1. For the 57 patients with MJD trinucleotide repeat expansions, a strong inverse correlation between CAG repeat size and age at onset was observed (r = -0.838). Among the MJD patients, the normal and affected ranges of CAG repeat size are 14-40 and 68-82 repeats, respectively. For SCA1 the normal and affected ranges are much closer, containing 19-38 and 40-81 CAG repeats, respectively. 30 refs., 1 fig., 3 tabs.
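
As a small illustration of how the repeat-size ranges quoted in this abstract separate normal from affected alleles, the following sketch looks up a CAG repeat count against those ranges. It is a hedged example only: the names `REPORTED_RANGES` and `classify_cag` are hypothetical, the ranges are specific to this 1995 cohort, and nothing here is intended as a diagnostic rule.

```python
# Ranges (inclusive) copied from the abstract above; they describe this cohort only.
REPORTED_RANGES = {
    "MJD": {"normal": (14, 40), "affected": (68, 82)},
    "SCA1": {"normal": (19, 38), "affected": (40, 81)},
}


def classify_cag(gene, repeats):
    """Return 'normal', 'affected', or a note that the size was not observed."""
    for label, (low, high) in REPORTED_RANGES[gene].items():
        if low <= repeats <= high:
            return label
    # Sizes between or beyond the reported ranges were not seen in this study.
    return "outside reported ranges"
```

The lookup makes the abstract's point about spacing concrete: for MJD there is a gap of 27 repeat sizes (41 to 67) between the two ranges, whereas for SCA1 only a single size (39) falls between them.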
Development of unigene-derived SSR markers in cowpea (Vigna unguiculata) and their transferability to other Vigna species.

PubMed

Gupta, S K; Gopalakrishna, T

2010-07-01

Unigene sequences available in public databases provide a cost-effective and valuable source for the development of molecular markers. In this study, the identification and development of unigene-based SSR markers in cowpea (Vigna unguiculata (L.) Walp.) is presented. A total of 1071 SSRs were identified in 15 740 cowpea unigene sequences downloaded from the National Center for Biotechnology Information. The most frequent SSR motifs present in the unigenes were trinucleotides (59.7%), followed by dinucleotides (34.8%), pentanucleotides (4%), and tetranucleotides (1.5%). The copy number varied from 6 to 33 for dinucleotide, 5 to 29 for trinucleotide, 5 to 7 for tetranucleotide, and 4 to 6 for pentanucleotide repeats. Primer pairs were successfully designed for 803 SSR motifs and 102 SSR markers were finally characterized and validated. Putative function was assigned to 64.7% of the unigene SSR markers based on significant homology to reported proteins. About 31.7% of the SSRs were present in coding sequences and 68.3% in untranslated regions of the genes. About 87% of the SSRs located in the coding sequences were trinucleotide repeats. Allelic variation at 32 SSR loci produced 98 alleles in 20 cowpea genotypes. The polymorphic information content for the SSR markers varied from 0.10 to 0.83 with an average of 0.53. These unigene SSR markers showed a high rate of transferability (88%) across other Vigna species, thereby expanding their utility. Alignment of unigene sequences with soybean genomic sequences revealed the presence of introns in amplified products of some of the SSR markers. This study presents the distribution of SSRs in the expressed portion of the cowpea genome and is the first report of the development of functional unigene-based SSR markers in cowpea. These SSR markers would play an important role in molecular mapping, comparative genomics, and marker-assisted selection strategies in cowpea and other Vigna species.
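
The polymorphic information content (PIC) values reported in this abstract are computed from allele frequencies at each locus. The abstract does not state which formula was used; the sketch below assumes the commonly used Botstein et al. (1980) definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and the function name `pic` is illustrative.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphic information content for one locus (Botstein-style formula, assumed)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross
```

For example, a locus with two equally frequent alleles gives `pic([0.5, 0.5]) == 0.375`, and a monomorphic locus gives 0; values near the upper end of the reported 0.10 to 0.83 range require several alleles at comparable frequencies.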
Analysis of thirteen trinucleotide repeat loci as candidate genes for Schizophrenia and bipolar affective disorder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jain, S.; Leggo, J.; Ferguson-Smith, M.A.

1996-04-09

A group of diseases are due to abnormal expansions of trinucleotide repeats. These diseases all affect the nervous system. In addition, they manifest the phenomenon of anticipation, in which the disease tends to present at an earlier age or with greater severity in successive generations. Many additional genes with trinucleotide repeats are believed to be expressed in the human brain. As anticipation has been reported in schizophrenia and bipolar affective disorder, we have examined allele distributions of 13 trinucleotide repeat-containing genes, many novel and all expressed in the brain, in genomic DNA from schizophrenic (n = 20-97) and bipolar affective disorder patients (n = 23-30) and controls (n = 43-146). No evidence was obtained to implicate expanded alleles in these 13 genes as causal factors in these diseases. 26 refs., 1 fig., 2 tabs.
Single sperm analysis of the trinucleotide repeat in the Huntington's disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leeflang, E.P.; Zhang, L.; Hubert, R.

1994-09-01

Huntington's disease (HD) is one of several genetic diseases caused by trinucleotide repeat expansion. The CAG repeat is very unstable, with size changes occurring in more than 80% of transmissions. The degree of instability of this repeat in the male germline can be determined by analysis of individual sperm cells. An easy and sensitive PCR assay has been developed to amplify this trinucleotide repeat region from single sperm using two rounds of PCR. As many as 90% of the single sperm show amplification for the HD repeat. The PCR product can be easily detected on an ethidium bromide-stained agarose gel. Single sperm samples from an HD patient with 18 and 49 repeats were studied. We observed size variations for the expanded alleles while the size of the normal allele in sperm is very consistent. We did not detect any significant bias in the amplification of normal alleles over the larger HD alleles. Our preliminary study supports the observation made by PCR of total sperm that instability of the HD trinucleotide repeat occurs in the germline. HD preimplantation diagnosis on single embryo blastomeres may also be possible.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
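
The kind of search this abstract describes, reporting each SSR together with its exact position, can be sketched with regular expressions. The following is not SSRscanner itself (which is written in PERL and whose internals are not given here); it is a minimal Python illustration, and the names `find_ssrs` and `is_primitive` as well as the default thresholds are assumptions.

```python
import re


def is_primitive(motif):
    """True if the motif is not a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Return sorted (start, end, motif, copies) tuples with 1-based inclusive positions."""
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-base unit followed by at least (min_repeats - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'CAGCAG', already reported as 'CAG'
                copies = (m.end() - m.start()) // k
                hits.append((m.start() + 1, m.end(), motif, copies))
    return sorted(hits)
```

For instance, `find_ssrs("TTGA" + "CAG" * 5 + "TT")` returns a single hit, `(5, 19, "CAG", 5)`. Only perfect (uninterrupted) repeats are detected, and a tract is reported in whichever phase the leftmost match begins, so the same repeat may appear as CAG, AGC or GCA depending on its flanking sequence.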
[Dynamic mutations--a newly detected category of mutations which is the basis for certain neurologic diseases].

PubMed

Mardesić, D

1995-01-01

This review offers some basic information on the discovery of a new type of mutations being the cause of some significant neurologic diseases: myotonic dystrophy, Huntington's disease, spinocerebellar ataxia type 1, spinobulbar pallido-louysian muscular atrophy, fragile X syndrome and some others, up to a total of ten entities. The basis of the so-called dynamic mutations is an abnormal multiplication of a trinucleotide producing sequences of several hundreds or even thousands of identical copies in the respective gene. The result is designated as expanded or amplified trinucleotide (or triplet) repeat. These sequences are not stable, but increase (or exceptionally decrease) in length during cell multiplication in successive generations. They segregate within families with the affected members, demonstrating a significant correlation between the length of the repeat sequence, the severity of the pathologic phenotype and an inverse correlation with the age at the clinical manifestation of the disease. Thus, at least, a formal explanation for the anticipation phenomenon of the age at which the disease is manifested within a family is offered. The importance of the discovery of dynamic mutations lies in the possibility for more precise and reliable genetic counselling. The discovery has opened a lot of new questions giving a new impetus for intensive research.
FISH-detected delay in replication timing of mutated FMR1 alleles on both active and inactive X-chromosomes.

PubMed

Yeshaya, J; Shalgi, R; Shohat, M; Avivi, L

1999-01-01

X-chromosome inactivation and the size of the CGG repeat number are assumed to play a role in the clinical, physical, and behavioral phenotype of female carriers of a mutated FMR1 allele. In view of the tight relationship between replication timing and the expression of a given DNA sequence, we have examined the replication timing of FMR1 alleles on active and inactive X-chromosomes in cell samples (lymphocytes or amniocytes) of 25 females: 17 heterozygous for a mutated FMR1 allele with a trinucleotide repeat number varying from 58 to a few hundred, and eight homozygous for a wild-type allele. We have applied two-color fluorescence in situ hybridization (FISH) with FMR1 and X-chromosome alpha-satellite probes to interphase cells of the various genotypes: the alpha-satellite probe was used to distinguish between early replicating (active) and late replicating (inactive) X-chromosomes, and the FMR1 probe revealed the replication pattern of this locus. All samples, except one with a large trinucleotide expansion, showed an early replicating FMR1 allele on the active X-chromosome and a late replicating allele on the inactive X-chromosome. In samples of mutation carriers, both the early and the late alleles showed delayed replication compared with normal alleles, regardless of repeat size. We conclude therefore that: (1) the FMR1 locus is subjected to X-inactivation; (2) mutated FMR1 alleles, regardless of repeat size, replicate later than wild-type alleles on both the active and inactive X-chromosomes; and (3) the delaying effect of the trinucleotide expansion, even with a low repeat size, is superimposed on the delay in replication associated with X-inactivation.
Triplet repeat expansion at the FRAXE locus and X-linked mild mental handicap.

PubMed Central

Knight, S. J.; Voelckel, M. A.; Hirst, M. C.; Flannery, A. V.; Moncla, A.; Davies, K. E.

1994-01-01

We have recently shown that the expression of the FRAXE fragile site in Xq28 is associated with the expansion of a GCC trinucleotide repeat. In the families studied, FRAXE expression is also associated with mild mental handicap. Here we present data on families that previously had been diagnosed as having the fragile X syndrome but that later were found to be negative for trinucleotide repeat expansion at the FRAXA locus. In these families we demonstrate the presence of a GCC trinucleotide repeat expansion at the FRAXE locus. Studies of the FRAXE locus of normal individuals show that they have 6-25 copies of the repeat, whereas affected individuals have > 200 copies. As in the fragile X syndrome, the amplified CpG residues are methylated in affected males. PMID:8023854
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

PubMed Central

Anwar, Tamanna; Khan, Asad U

2006-01-01

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. Availability This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com PMID:17597863
Differential Impact of the "FMR1" Gene on Visual Processing in Fragile X Syndrome

ERIC Educational Resources Information Center

Kogan, Cary S.; Boutet, Isabelle; Cornish, Kim; Zangenehpour, Shahin; Mullen, Kathy T.; Holden, Jeanette J. A.; Kaloustian, Vazken M. Der; Andermann, Eva; Chaudhuri, Avi

2004-01-01

Fragile X syndrome (FXS) is the most common form of heritable mental retardation, affecting approximately 1 in 4000 males. The syndrome arises from expansion of a trinucleotide repeat in the 5'-untranslated region of the fragile X mental retardation 1 ("FMR1") gene, leading to methylation of the promoter sequence and lack of the fragile X mental…
Characterization and compilation of polymorphic simple sequence repeat (SSR) markers of peanut from public database

PubMed Central

2012-01-01

Background There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of lack of uniformity in the labeling of these markers across the publications, there is some confusion on the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings We compiled 1,343 SSR markers as detecting polymorphism (14.5%) within a total of 9,274 markers. Amongst all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation of the length of SSR and the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, which could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement and thus would be of value to breeders. PMID:22818284
MSH3 Promotes Dynamic Behavior of Trinucleotide Repeat Tracts In Vivo

PubMed Central

Williams, Gregory M.; Surtees, Jennifer A.

2015-01-01

Trinucleotide repeat (TNR) expansions are the underlying cause of more than 40 neurodegenerative and neuromuscular diseases, including myotonic dystrophy and Huntington’s disease, yet the pathway to expansion remains poorly understood. An important step in expansion is the shift from a stable TNR sequence to an unstable, expanding tract, which is thought to occur once a TNR attains a threshold length. Modeling of human data has indicated that TNR tracts are increasingly likely to expand as they increase in size and to do so in increments that are smaller than the repeat itself, but this has not been tested experimentally. Genetic work has implicated the mismatch repair factor MSH3 in promoting expansions. Using Saccharomyces cerevisiae as a model for CAG and CTG tract dynamics, we examined individual threshold-length TNR tracts in vivo over time in MSH3 and msh3Δ backgrounds. We demonstrate, for the first time, that these TNR tracts are highly dynamic. Furthermore, we establish that once such a tract has expanded by even a few repeat units, it is significantly more likely to expand again. Finally, we show that threshold-length TNR sequences readily accumulate net incremental expansions over time through a series of small expansion and contraction events. Importantly, the tracts were substantially stabilized in the msh3Δ background, with a bias toward contractions, indicating that Msh2-Msh3 plays an important role in shifting the expansion-contraction equilibrium toward expansion in the early stages of TNR tract expansion. PMID:25969461
Triplet repeat expansion at the FRAXE locus and x-linked mild mental handicap

DOE Office of Scientific and Technical Information (OSTI.GOV)

Knight, S.J.L.; Hirst, M.C.; Flannery, A.V.

1994-07-01

The authors have recently shown that the expression of the FRAXE fragile site in Xq28 is associated with expansion of a GCC trinucleotide repeat. In the families studied, FRAXE expression is also associated with mild mental handicap. Here they present data on families that previously had been diagnosed as having the fragile X syndrome but that later were found to be negative for trinucleotide repeat expansion at the FRAXA locus. In these families they demonstrate the presence of a GCC trinucleotide repeat expansion at the FRAXE locus. Studies of the FRAXE locus of normal individuals show that they have 6-25 copies of the repeat, whereas affected individuals have >200 copies. As in the fragile X syndrome, the amplified CpG residues are methylated in affected males. 19 refs., 4 figs., 1 tab.
Cis-acting factors modulate stability of intermediate alleles for Huntington disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Goldberg, Y.P.; Zeisler, J.; Thielmann, J.

1994-09-01

The genetic basis of Huntington disease (HD), a late-onset autosomal dominant neurodegenerative disorder, has recently been defined as a CAG trinucleotide expansion in a novel gene on 4p16.3. The CAG length in clinically normal people ranges from 9 to 37, with the vast majority of alleles (99%) containing fewer than 30 repeats. In contrast, HD patients have CAG lengths greater than 36, with the largest repeat reported to date being 121. Molecular analysis of sporadic cases of HD revealed that new mutations are not rare (3%), and arise from intermediate alleles (IAs). IAs are CAG alleles greater than that usually seen in the general population (>30), but less than that seen in patients with HD, and occur with a frequency of approximately 1.5% of the general population (12/797). An important question is whether these IAs are also susceptible to expansion. In new mutation families, these IAs are unstable in passage through the male germline and in sporadic cases expand to the full mutation associated with the HD phenotype. Of the 41 meioses analyzed in new mutation families, 61% were unstable. In contrast to IAs in the new mutation families, the IAs in the general population were predominantly stable from one generation to the next. Comparison of the frequency of intergenerational stability between the general population and the new mutation families showed that IAs in the general population are considerably more stable than those in the new mutation families. In contrast to SCA1, where sequence interruption is thought to play a role in CAG trinucleotide stability, sequence analysis of IAs both from the general population and the new mutation families failed to reveal any interruption of the CAG tracts. These findings suggest that while CAG size is an important factor, other cis-acting factors present in new mutation families but not in the general population are likely to be critical in conferring instability upon the CAG trinucleotide repeat.
MicroRNAs in CAG trinucleotide repeat expansion disorders: an integrated review of the literature.

PubMed

Dumitrescu, Laura; Popescu, Bogdan O

2015-01-01

MicroRNAs are small RNAs involved in gene silencing. They play important roles in transcriptional regulation and are selectively and abundantly expressed in the central nervous system. A considerable portion of the human genome consists of tandem repeating nucleotide streams. Several diseases are caused by above-threshold expansion of certain trinucleotide repeats occurring in a protein-coding or non-coding region. Though monogenic, CAG trinucleotide repeat expansion disorders have a complex pathogenesis, with various combinations of multiple coexisting pathways resulting in one common final consequence: selective neurodegeneration. Mutant protein and mutant transcript gain of toxic function are considered to be the core pathogenic mechanisms. The profile of microRNAs in CAG trinucleotide repeat disorders is scarcely described; however, microRNA dysregulation has been identified in these diseases and microRNA-related interference with gene expression is considered to be involved in their pathogenesis. Better understanding of microRNA functions and means of manipulation promises to offer further insights into the pathogenic pathways of CAG repeat expansion disorders, to point out new potential targets for drug intervention and to provide some of the much needed etiopathogenic therapeutic agents. A number of disease-modifying microRNA silencing strategies are under development, but several implementation impediments still have to be resolved. CAG targeting seems feasible and efficient in animal models and is an appealing approach for clinical practice. Preliminary human trials are just beginning.

Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

PubMed

Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

2012-08-01

Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
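
Groupings such as AAT/ATA/TAA in this abstract treat cyclic rotations of a motif as the same repeat, and some SSR surveys additionally merge a motif with its reverse complement (written, for example, as AAG/CTT). A minimal sketch of that bookkeeping follows; the function names `rotations` and `canonical_motif` are illustrative, and choosing the alphabetically smallest form as the representative is an assumption, not a convention stated in the abstract.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rotations(motif):
    """All cyclic rotations of a motif, e.g. 'AAT' -> ['AAT', 'ATA', 'TAA']."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def canonical_motif(motif, both_strands=False):
    """Alphabetically smallest rotation, optionally also over the reverse complement."""
    motif = motif.upper()
    candidates = rotations(motif)
    if both_strands:
        # Reverse complement: complement each base, then reverse the string.
        revcomp = motif.translate(_COMPLEMENT)[::-1]
        candidates += rotations(revcomp)
    return min(candidates)
```

With this helper, `canonical_motif("ATA")` and `canonical_motif("TAA")` both return `"AAT"`, `canonical_motif("GAA")` returns `"AAG"`, and `canonical_motif("CTT", both_strands=True)` also returns `"AAG"`, so counts for equivalent motifs can be pooled under one key.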
Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

PubMed

Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

2010-02-01

Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and for resolving the taxonomic confusion prevailing in the genus.
MSH3 Promotes Dynamic Behavior of Trinucleotide Repeat Tracts In Vivo.

PubMed

Williams, Gregory M; Surtees, Jennifer A

2015-07-01

Trinucleotide repeat (TNR) expansions are the underlying cause of more than 40 neurodegenerative and neuromuscular diseases, including myotonic dystrophy and Huntington's disease, yet the pathway to expansion remains poorly understood. An important step in expansion is the shift from a stable TNR sequence to an unstable, expanding tract, which is thought to occur once a TNR attains a threshold length. Modeling of human data has indicated that TNR tracts are increasingly likely to expand as they increase in size and to do so in increments that are smaller than the repeat itself, but this has not been tested experimentally. Genetic work has implicated the mismatch repair factor MSH3 in promoting expansions. Using Saccharomyces cerevisiae as a model for CAG and CTG tract dynamics, we examined individual threshold-length TNR tracts in vivo over time in MSH3 and msh3Δ backgrounds. We demonstrate, for the first time, that these TNR tracts are highly dynamic. Furthermore, we establish that once such a tract has expanded by even a few repeat units, it is significantly more likely to expand again. Finally, we show that threshold-length TNR sequences readily accumulate net incremental expansions over time through a series of small expansion and contraction events. Importantly, the tracts were substantially stabilized in the msh3Δ background, with a bias toward contractions, indicating that Msh2-Msh3 plays an important role in shifting the expansion-contraction equilibrium toward expansion in the early stages of TNR tract expansion. Copyright © 2015 by the Genetics Society of America.
A study of the Huntington's disease associated trinucleotide repeat in the Scottish population.

PubMed Central

Barron, L H; Warner, J P; Porteous, M; Holloway, S; Simpson, S; Davidson, R; Brock, D J

1993-01-01

Accurate measurements of a specific CAG repeat sequence in the Huntington's disease (HD) gene in 337 HD patients and 229 normal controls from the Scottish population showed a range from 35 to 62 repeats in affected subjects and eight to 33 in normal subjects. A link between early onset of symptoms and very high repeat number was seen. For HD patients with the most common affected allele sizes (39 to 42 repeats) absolute repeat size was a poor index for the age at onset of symptoms. There was variability in the transmitted repeat size for both sexes in the HD size range. We observed a significant increase of repeat size for paternal transmission of the disease and greater instability for paternally transmitted CAG repeats in the HD size range. PMID:8133495
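To make the repeat ranges above concrete, here is a minimal sketch that counts the longest uninterrupted CAG run in a sequence string and compares it with the ranges reported for this cohort (8 to 33 in controls, 35 to 62 in affected subjects). The function names are hypothetical, the study itself sized repeats experimentally rather than from sequence strings, and the comparison is illustrative only, not a diagnostic rule.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def compare_with_reported_ranges(n_repeats: int) -> str:
    """Compare a repeat count with the ranges reported in the abstract above."""
    if 8 <= n_repeats <= 33:
        return "within the reported control range (8-33)"
    if 35 <= n_repeats <= 62:
        return "within the reported affected range (35-62)"
    return "outside both reported ranges"


if __name__ == "__main__":
    # Hypothetical sequence: 40 CAG units, one CAA interruption, then 2 more units.
    example = "TT" + "CAG" * 40 + "CAA" + "CAG" * 2
    n = longest_cag_run(example)
    print(n, compare_with_reported_ranges(n))
```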
Intercalation of XR5944 with the estrogen response element is modulated by the tri-nucleotide spacer sequence between half-sites

PubMed Central

Sidell, Neil; Mathad, Raveendra I.; Shu, Feng-jue; Zhang, Zhenjiang; Kallen, Caleb B.; Yang, Danzhou

2011-01-01

DNA-intercalating molecules can impair DNA replication, DNA repair, and gene transcription. We previously demonstrated that XR5944, a DNA bis-intercalator, specifically blocks binding of estrogen receptor-α (ERα) to the consensus estrogen response element (ERE). The consensus ERE sequence is AGGTCAnnnTGACCT, where nnn is known as the tri-nucleotide spacer. Recent work has shown that the tri-nucleotide spacer can modulate ERα-ERE binding affinity and ligand-mediated transcriptional responses. To further understand the mechanism by which XR5944 inhibits ERα-ERE binding, we tested its ability to interact with consensus EREs with variable tri-nucleotide spacer sequences and with natural but non-consensus ERE sequences using one dimensional nuclear magnetic resonance (1D 1H NMR) titration studies. We found that the tri-nucleotide spacer sequence significantly modulates the binding of XR5944 to EREs. Of the sequences that were tested, EREs with CGG and AGG spacers showed the best binding specificity with XR5944, while those spaced with TTT demonstrated the least specific binding. The binding stoichiometry of XR5944 with EREs was 2:1, which can explain why the spacer influences the drug-DNA interaction; each XR5944 spans four nucleotides (including portions of the spacer) when intercalating with DNA. To validate our NMR results, we conducted functional studies using reporter constructs containing consensus EREs with tri-nucleotide spacers CGG, CTG, and TTT. Results of reporter assays in MCF-7 cells indicated that XR5944 was significantly more potent in inhibiting the activity of CGG- than TTT-spaced EREs, consistent with our NMR results. Taken together, these findings predict that the anti-estrogenic effects of XR5944 will depend not only on ERE half-site composition but also on the tri-nucleotide spacer sequence of EREs located in the promoters of estrogen-responsive genes. PMID:21333738
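The consensus ERE given above, AGGTCAnnnTGACCT, can be located in a sequence and its tri-nucleotide spacer extracted with a simple pattern match. The sketch below assumes exact matches to the consensus half-sites on the given strand only (natural, non-consensus EREs are not handled), and the names `ERE_PATTERN` and `find_ere_spacers` are illustrative.

```python
import re

# Consensus ERE from the abstract: AGGTCA, a 3-nt spacer, then TGACCT.
ERE_PATTERN = re.compile(r"AGGTCA([ACGT]{3})TGACCT")


def find_ere_spacers(seq: str):
    """Return (start, spacer) for each consensus ERE found on the given strand."""
    return [(m.start(), m.group(1)) for m in ERE_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical promoter fragment with one CGG-spaced and one TTT-spaced ERE.
    fragment = "TTAGGTCACGGTGACCTAAGCAGGTCATTTTGACCTGG"
    for start, spacer in find_ere_spacers(fragment):
        print(start, spacer)
```

Because the consensus half-sites are reverse complements of each other, a consensus ERE reads the same on both strands apart from the spacer, so a single-strand scan finds every consensus site.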
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
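The reported gene flow and differentiation values are numerically consistent with one widely used estimator, Nm = 0.5 (1 - GST) / GST. Whether the authors used exactly this estimator is an assumption; the sketch below only shows that it reproduces the reported figure to within rounding.

```python
def nm_from_gst(gst: float) -> float:
    """Estimate gene flow Nm from GST using Nm = 0.5 * (1 - GST) / GST."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must be in the interval (0, 1]")
    return 0.5 * (1.0 - gst) / gst


if __name__ == "__main__":
    # GST as reported in the abstract (rounded to four decimals).
    print(nm_from_gst(0.2500))  # compare with the reported Nm = 1.4996
```

The small gap between 1.5 and the reported 1.4996 would be expected if the published GST of 0.2500 is itself a rounded value.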
Structure and Dynamics of DNA and RNA Double Helices Obtained from the CCG and GGC Trinucleotide Repeats.

PubMed

Pan, Feng; Man, Viet Hoang; Roland, Christopher; Sagui, Celeste

2018-04-26

Expansions of both GGC and CCG sequences lead to a number of expandable, trinucleotide repeat (TR) neurodegenerative diseases. Understanding of these diseases involves, among other things, the structural characterization of the atypical DNA and RNA secondary structures. We have performed molecular dynamics simulations of (GCC)n and (GGC)n homoduplexes in order to characterize their conformations, stability, and dynamics. Each TR has two reading frames, which results in eight nonequivalent RNA/DNA homoduplexes, characterized by CpG or GpC steps between the Watson-Crick base pairs. Free energy maps for the eight homoduplexes indicate that the C-mismatches prefer anti-anti conformations, while G-mismatches prefer anti-syn conformations. Comparison between three modifications of the DNA AMBER force field shows good agreement for the mismatch free energy maps. The mismatches in DNA-GCC (but not CCG) are extrahelical, forming an extended e-motif. The mismatched duplexes exhibit characteristic sequence-dependent step twist, with strong variations in the G-rich sequences and the e-motif. The distribution of Na+ is highly localized around the mismatches, especially G-mismatches. In the e-motif, there is strong Na+ binding by two G(N7) atoms belonging to the pseudo GpC step created when cytosines are extruded and by extrahelical cytosines. Finally, we used a novel technique based on fast melting by means of an infrared laser pulse to classify the relative stability of the different DNA-CCG and -GGC homoduplexes.
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing.

PubMed

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2016-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5-8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs.
Development of Simple Sequence Repeats (SSR) markers in Setaria italica (Poaceae) and cross-amplification in related species.

PubMed

Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

2011-01-01

Foxtail millet is one of the world's oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeats (SSR) markers of Setaria italica were developed. These markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (N(a)), the average heterozygosities observed (H(o)) and expected (H(e)) are 3.73, 0.714, 0.587, respectively. In addition, 24 SSR markers had shown transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. It is suitable for germplasm management and protection in Poaceae.
Development of Simple Sequence Repeats (SSR) Markers in Setaria italica (Poaceae) and Cross-Amplification in Related Species

PubMed Central

Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

2011-01-01

Foxtail millet is one of the world’s oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeats (SSR) markers of Setaria italica were developed. These markers showing polymorphism were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (Na), the average heterozygosities observed (Ho) and expected (He) are 3.73, 0.714, 0.587, respectively. In addition, 24 SSR markers had shown transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. It is suitable for germplasm management and protection in Poaceae. PMID:22174636
Determination of the genetic diversity of vegetable soybean [Glycine max (L.) Merr.] using EST-SSR markers

PubMed Central

Zhang, Gu-wen; Xu, Sheng-chun; Mao, Wei-hua; Hu, Qi-zan; Gong, Ya-ming

2013-01-01

The development of expressed sequence tag-derived simple sequence repeats (EST-SSRs) provided a useful tool for investigating plant genetic diversity. In the present study, 22 polymorphic EST-SSRs from grain soybean were identified and used to assess the genetic diversity in 48 vegetable soybean accessions. Among the 22 EST-SSR loci, tri-nucleotides were the most abundant repeats, accounting for 50.00% of the total motifs. GAA was the most common motif among tri-nucleotide repeats, with a frequency of 18.18%. Polymorphic analysis identified a total of 71 alleles, with an average of 3.23 per locus. The polymorphism information content (PIC) values ranged from 0.144 to 0.630, with a mean of 0.386. Observed heterozygosity (Ho) values varied from 0.0196 to 1.0000, with an average of 0.6092, while the expected heterozygosity (He) values ranged from 0.1502 to 0.6840, with a mean value of 0.4616. Principal coordinate analysis and phylogenetic tree analysis indicated that the accessions could be assigned to different groups based to a large extent on their geographic distribution, and most accessions from China were clustered into the same groups. These results suggest that Chinese vegetable soybean accessions have a narrow genetic base. The results of this study indicate that EST-SSRs from grain soybean have high transferability to vegetable soybean, and that these new markers would be helpful in taxonomy, molecular breeding, and comparative mapping studies of vegetable soybean in the future. PMID:23549845
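The per-locus statistics quoted above (He and PIC) are commonly computed from allele frequencies as He = 1 - sum(p_i^2) and PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The sketch below applies these textbook definitions to hypothetical allele frequencies; it is an assumption that the authors' software used these exact forms (for example, without a small-sample correction).

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical allele frequencies at one locus (must sum to 1).
    locus = [0.6, 0.3, 0.1]
    print(round(expected_heterozygosity(locus), 4))
    print(round(polymorphism_information_content(locus), 4))
```

PIC is never larger than He for the same frequencies, which matches the lower mean PIC (0.386) relative to the mean He (0.4616) reported above.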
Novel candidate genes may be possible predisposing factors revealed by whole exome sequencing in familial esophageal squamous cell carcinoma.

PubMed

Forouzanfar, Narjes; Baranova, Ancha; Milanizadeh, Saman; Heravi-Moussavi, Alireza; Jebelli, Amir; Abbaszadegan, Mohammad Reza

2017-05-01

Esophageal squamous cell carcinoma is one of the deadliest of all the cancers. Its metastatic properties portend poor prognosis and high rate of recurrence. A more advanced method to identify new molecular biomarkers predicting disease prognosis can be whole exome sequencing. Here, we report the most effective genetic variants of the Notch signaling pathway in esophageal squamous cell carcinoma susceptibility by whole exome sequencing. We analyzed nine probands in unrelated familial esophageal squamous cell carcinoma pedigrees to identify candidate genes. Genomic DNA was extracted and whole exome sequencing performed to generate information about genetic variants in the coding regions. Bioinformatics software applications were utilized to exploit statistical algorithms to demonstrate protein structure and variants conservation. Polymorphic regions were excluded by false-positive investigations. Gene-gene interactions were analyzed for Notch signaling pathway candidates. We identified novel and damaging variants of the Notch signaling pathway through extensive pathway-oriented filtering and functional predictions, which led to the study of 27 candidate novel mutations in all nine patients. Detection of the trinucleotide repeat containing 6B gene mutation (a splice site alteration) in five of the nine probands, but not in any of the healthy samples, suggested that it may be a susceptibility factor for familial esophageal squamous cell carcinoma. Noticeably, 8 of 27 novel candidate gene mutations (e.g. epidermal growth factor, signal transducer and activator of transcription 3, MET) act in a cascade leading to cell survival and proliferation. Our results suggest that the trinucleotide repeat containing 6B mutation may be a candidate predisposing gene in esophageal squamous cell carcinoma. In addition, some of the Notch signaling pathway genetic mutations may act as key contributors to esophageal squamous cell carcinoma.
Application of FTA sample collection and DNA purification system on the determination of CTG trinucleotide repeat size by PCR-based Southern blotting.

PubMed

Hsiao, K M; Lin, H M; Pan, H; Li, T C; Chen, S S; Jou, S B; Chiu, Y L; Wu, M F; Lin, C C; Li, S Y

1999-01-01

Myotonic dystrophy (DM) is caused by a CTG trinucleotide expansion mutation at exon 15 of the myotonic dystrophy protein kinase gene. The clinical severity of this disease correlates with the length of the CTG trinucleotide repeats. Determination of the CTG repeat length has relied primarily on Southern blot analysis of restriction enzyme-digested genomic DNA. The development of PCR-based Southern blotting methodology provides a much more sensitive and simpler protocol for DM diagnosis. However, the quality of the template and the high (G+C) ratio of the amplified region hamper the use of PCR in the diagnosis of DM. A modified PCR protocol to amplify different lengths of the CTG repeat region using various concentrations of 7-deaza-dGTP has been reported (1). Here we describe a procedure including sample collection, DNA purification, and PCR analysis of CTG repeat length without using 7-deaza-dGTP. This protocol is very sensitive and convenient because only a small number of nucleated cells are needed for detection of CTG expansion. Therefore, it could be very useful in clinical and prenatal diagnosis as well as in prevalence studies of DM.
Twisting Right to Left: A…A Mismatch in a CAG Trinucleotide Repeat Overexpansion Provokes Left-Handed Z-DNA Conformation

PubMed Central

2015-01-01

Conformational polymorphism of DNA is a major causative factor behind several incurable trinucleotide repeat expansion disorders that arise from overexpansion of trinucleotide repeats located in coding/non-coding regions of specific genes. Hairpin DNA structures that are formed due to overexpansion of CAG repeat lead to Huntington’s disorder and spinocerebellar ataxias. Nonetheless, DNA hairpin stem structure that generally embraces B-form with canonical base pairs is poorly understood in the context of periodic noncanonical A…A mismatch as found in CAG repeat overexpansion. Molecular dynamics simulations on DNA hairpin stems containing A…A mismatches in a CAG repeat overexpansion show that A…A dictates local Z-form irrespective of starting glycosyl conformation, in sharp contrast to canonical DNA duplex. Transition from B-to-Z is due to the mechanistic effect that originates from its pronounced nonisostericity with flanking canonical base pairs facilitated by base extrusion, backbone and/or base flipping. Based on these structural insights we envisage that such an unusual DNA structure of the CAG hairpin stem may have a role in disease pathogenesis. As this is the first study that delineates the influence of a single A…A mismatch in reversing DNA helicity, it would further have an impact on understanding DNA mismatch repair. PMID:25876062
Characterization of the heart transcriptome of the white shark (Carcharodon carcharias)

PubMed Central

2013-01-01

Background The white shark (Carcharodon carcharias) is a globally distributed, apex predator possessing physical, physiological, and behavioral traits that have garnered it significant public attention. In addition to interest in the genetic basis of its form and function, as a representative of the oldest extant jawed vertebrate lineage, white sharks are also of conservation concern due to their small population size and threat from overfishing. Despite this, surprisingly little is known about the biology of white sharks, and genomic resources are unavailable. To address this deficit, we combined Roche-454 and Illumina sequencing technologies to characterize the first transcriptome of any tissue for this species. Results From white shark heart cDNA we generated 665,399 Roche 454 reads (median length 387-bp) that were assembled into 141,626 contigs (mean length 503-bp). We also generated 78,566,588 Illumina reads, which we aligned to the 454 contigs producing 105,014 454/Illumina consensus sequences. To these, we added 3,432 non-singleton 454 contigs. By comparing these sequences to the UniProtKB/Swiss-Prot database we were able to annotate 21,019 translated open reading frames (ORFs) of ≥ 20 amino acids. Of these, 19,277 were additionally assigned Gene Ontology (GO) functional annotations. While acknowledging the limitations of our single tissue transcriptome, Fisher tests showed the white shark transcriptome to be significantly enriched for numerous metabolic GO terms compared to the zebra fish and human transcriptomes, with white shark showing more similarity to human than to zebra fish (i.e. fewer terms were significantly different). We also compared the transcriptome to other available elasmobranch sequences, for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage. 
The white shark transcriptome also contained 8,404 microsatellites (dinucleotide, trinucleotide, or tetranucleotide motifs ≥ five perfect repeats). Detailed characterization of these microsatellites showed that ORFs with trinucleotide repeats, were significantly enriched for transcription regulatory roles and that trinucleotide frequency within ORFs was lower than for a wide range of taxonomic groups including other vertebrates. Conclusion The white shark heart transcriptome represents a valuable resource for future elasmobranch functional and comparative genomic studies, as well as for population and other biological studies vital for effective conservation of this globally vulnerable species. PMID:24112713
Characterization of the heart transcriptome of the white shark (Carcharodon carcharias).

PubMed

Richards, Vincent P; Suzuki, Haruo; Stanhope, Michael J; Shivji, Mahmood S

2013-10-11

The white shark (Carcharodon carcharias) is a globally distributed, apex predator possessing physical, physiological, and behavioral traits that have garnered it significant public attention. In addition to interest in the genetic basis of its form and function, as a representative of the oldest extant jawed vertebrate lineage, white sharks are also of conservation concern due to their small population size and threat from overfishing. Despite this, surprisingly little is known about the biology of white sharks, and genomic resources are unavailable. To address this deficit, we combined Roche-454 and Illumina sequencing technologies to characterize the first transcriptome of any tissue for this species. From white shark heart cDNA we generated 665,399 Roche 454 reads (median length 387-bp) that were assembled into 141,626 contigs (mean length 503-bp). We also generated 78,566,588 Illumina reads, which we aligned to the 454 contigs producing 105,014 454/Illumina consensus sequences. To these, we added 3,432 non-singleton 454 contigs. By comparing these sequences to the UniProtKB/Swiss-Prot database we were able to annotate 21,019 translated open reading frames (ORFs) of ≥ 20 amino acids. Of these, 19,277 were additionally assigned Gene Ontology (GO) functional annotations. While acknowledging the limitations of our single tissue transcriptome, Fisher tests showed the white shark transcriptome to be significantly enriched for numerous metabolic GO terms compared to the zebra fish and human transcriptomes, with white shark showing more similarity to human than to zebra fish (i.e. fewer terms were significantly different). We also compared the transcriptome to other available elasmobranch sequences, for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage. The white shark transcriptome also contained 8,404 microsatellites (dinucleotide, trinucleotide, or tetranucleotide motifs ≥ five perfect repeats). 
Detailed characterization of these microsatellites showed that ORFs with trinucleotide repeats, were significantly enriched for transcription regulatory roles and that trinucleotide frequency within ORFs was lower than for a wide range of taxonomic groups including other vertebrates. The white shark heart transcriptome represents a valuable resource for future elasmobranch functional and comparative genomic studies, as well as for population and other biological studies vital for effective conservation of this globally vulnerable species.
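The microsatellite criterion stated above (dinucleotide, trinucleotide, or tetranucleotide motifs with at least five perfect repeats) can be sketched as a regular-expression scan. This is an illustration of the criterion only, not the tool used in the study; the function names and the example sequence are hypothetical, and homopolymer runs are filtered out because they would otherwise match as "dinucleotide" units such as AA.

```python
import re


def is_primitive(unit: str) -> bool:
    """True if the unit is not itself a repetition of a shorter unit (e.g. not 'AA')."""
    n = len(unit)
    return not any(n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str, min_repeats: int = 5):
    """Return (start, unit, copies) for perfect di-, tri- and tetranucleotide repeats."""
    # The lazy quantifier tries the shortest unit first; the backreference
    # requires at least (min_repeats - 1) further perfect copies of that unit.
    pattern = re.compile(r"([ACGT]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if is_primitive(unit):
            hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


if __name__ == "__main__":
    example = "TT" + "CAG" * 6 + "TT" + "AC" * 5 + "G" + "A" * 12
    for hit in find_ssrs(example):
        print(hit)
```

The scan reports non-overlapping matches from left to right, so the reported unit depends on where a run starts (for example GCA rather than CAG when the run is preceded by a G).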
Development of Novel SSR Markers for Flax (Linum usitatissimum L.) Using Reduced-Representation Genome Sequencing

PubMed Central

Wu, Jianzhong; Zhao, Qian; Wu, Guangwen; Zhang, Shuquan; Jiang, Tingbo

2017-01-01

Flax (Linum usitatissimum L.) is a major fiber and oil yielding crop grown in northeastern China. Identification of flax molecular markers is a key step toward improving flax yield and quality via marker-assisted breeding. Simple sequence repeat (SSR) markers, which are based on genomic structural variation, are considered the most valuable type of genetic marker for this purpose. In this study, we screened 1574 microsatellites from Linum usitatissimum L. obtained using reduced representation genome sequencing (RRGS) to systematically identify SSR markers. The resulting set of microsatellites consisted mainly of trinucleotide (56.10%) and dinucleotide (35.23%) repeats, with each motif consisting of 5–8 repeats. We then evaluated marker sensitivity and specificity based on samples of 48 flax isolates obtained from northeastern China. Using the new SSR panel, the results demonstrated that fiber flax and oilseed flax varieties clustered into two well separated groups. The novel SSR markers developed in this study show potential value for selection of varieties for use in flax breeding programs. PMID:28133461
E-motif formed by extrahelical cytosine bases in DNA homoduplexes of trinucleotide and hexanucleotide repeats

PubMed Central

Pan, Feng; Zhang, Yuan; Man, Viet Hoang; Roland, Christopher

2018-01-01

Atypical DNA secondary structures play an important role in expandable trinucleotide repeat (TR) and hexanucleotide repeat (HR) diseases. The cytosine mismatches in C-rich homoduplexes and hairpin stems are weakly bonded; experiments show that for certain sequences these may flip out of the helix core, forming an unusual structure termed an ‘e-motif’. We have performed molecular dynamics simulations of C-rich TR and HR DNA homoduplexes in order to characterize the conformations, stability and dynamics of formation of the e-motif, where the mismatched cytosines symmetrically flip out in the minor groove, pointing their base moieties towards the 5′-direction in each strand. TRs have two non-equivalent reading frames, (GCC)n and (CCG)n; while HRs have three: (CCCGGC)n, (CGGCCC)n, (CCCCGG)n. We define three types of pseudo basepair steps related to the mismatches and show that the e-motif is only stable in (GCC)n and (CCCGGC)n homoduplexes due to the favorable stacking of pseudo GpC steps (whose nature depends on whether TRs or HRs are involved) and the formation of hydrogen bonds between the mismatched cytosine at position i and the cytosine (TRs) or guanine (HRs) at position i − 2 along the same strand. We also characterize the extended e-motif, where all mismatched cytosines are extruded, their extra-helical stacking additionally stabilizing the homoduplexes. PMID:29190385
Trinucleotide cassettes increase diversity of T7 phage-displayed peptide library.

PubMed

Krumpe, Lauren R H; Schumacher, Kathryn M; McMahon, James B; Makowski, Lee; Mori, Toshiyuki

2007-10-05

Amino acid sequence diversity is introduced into a phage-displayed peptide library by randomizing library oligonucleotide DNA. We recently evaluated the diversity of peptide libraries displayed on T7 lytic phage and M13 filamentous phage and showed that T7 phage can display a more diverse amino acid sequence repertoire due to differing processes of viral morphogenesis. In this study, we evaluated and compared the diversity of a 12-mer T7 phage-displayed peptide library randomized using codon-corrected trinucleotide cassettes with a T7 and an M13 12-mer phage-displayed peptide library constructed using the degenerate codon randomization method. We herein demonstrate that the combination of trinucleotide cassette amino acid codon randomization and T7 phage display construction methods resulted in a significant enhancement to the functional diversity of a 12-mer peptide library. This novel library exhibited superior amino acid uniformity and order-of-magnitude increases in amino acid sequence diversity as compared to degenerate codon randomized peptide libraries. Comparative analyses of the biophysical characteristics of the 12-mer peptide libraries revealed the trinucleotide cassette-randomized library to be a unique resource. The combination of T7 phage display and trinucleotide cassette randomization resulted in a novel resource for the potential isolation of binding peptides for new and previously studied molecular targets.
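The contrast between degenerate codon randomization and trinucleotide cassettes can be illustrated by counting how often each amino acid is encoded. The abstract does not say which degenerate scheme was used; the sketch below assumes the common NNK scheme (N = A/C/G/T, K = G/T) purely for illustration and compares it with an idealized cassette mix of one codon per amino acid.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_distribution() -> Counter:
    """Count codons per amino acid (and stop, '*') under assumed NNK randomization."""
    counts = Counter()
    for b1, b2, b3 in product("ACGT", "ACGT", "GT"):
        counts[CODON_TABLE[b1 + b2 + b3]] += 1
    return counts


if __name__ == "__main__":
    counts = nnk_distribution()
    total = sum(counts.values())
    for aa, n in sorted(counts.items()):
        print(aa, n, round(n / total, 4))
    # An idealized cassette mix would give each of the 20 amino acids 1/20 = 0.05
    # and no stop codons.
```

Under this assumed scheme some amino acids are encoded by three codons and others by one, and one stop codon remains, which is the kind of bias a codon-corrected cassette mix is designed to avoid.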
Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus

PubMed Central

Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

2012-01-01

Humulus lupulus, commonly known as hops, is a member of the family Moraceae. Currently many projects are underway, leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. A large set of EST sequences is available for hops. In the present study, simple sequence repeat markers extracted from EST data are used as molecular markers for genetic characterization. A total of 25,495 EST sequences were examined and assembled to get full-length sequences. The highest frequency was shown by mononucleotide SSR motifs, i.e. 60.44% in contigs and 62.16% in singletons, whereas the lowest frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). The most frequent trinucleotide motif was GAA, coding for glutamic acid, while AT/TA was the most frequent repeat among dinucleotide SSRs. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms such as biological process, cellular component and molecular function. PMID:22368382

APE1 incision activity at abasic sites in tandem repeat sequences.

PubMed

Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M

2014-05-29

Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.
Accurate typing of short tandem repeats from genome-wide sequencing data and its applications.

PubMed

Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D

2015-05-01

Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genome sequences. First, in the twelve studied plant genomes, the main SSRs were those containing repeats of combinations of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom, and the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

PubMed

Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

2002-12-01

The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the non-transcribed spacer (NTS) revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 0.15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.
Marked Phenotypic Heterogeneity Associated with Expansion of a CAG Repeat Sequence at the Spinocerebellar Ataxia 3/Machado-Joseph Disease Locus

PubMed Central

Cancel, Géraldine; Abbas, Nacer; Stevanin, Giovanni; Dürr, Alexandra; Chneiweiss, Hervé; Néri, Christian; Duyckaerts, Charles; Penet, Christiane; Cann, Howard M.; Agid, Yves; Brice, Alexis

1995-01-01

The spinocerebellar ataxia 3 locus (SCA3) for type I autosomal dominant cerebellar ataxia (ADCA type I), a clinically and genetically heterogeneous group of neurodegenerative disorders, has been mapped to chromosome 14q32.1. ADCA type I patients from families segregating SCA3 share clinical features in common with those with Machado-Joseph disease (MJD), the gene of which maps to the same region. We show here that the disease gene segregating in each of three French ADCA type I kindreds and in a French family with neuropathological findings suggesting the ataxochoreic form of dentatorubropallidoluysian atrophy carries an expanded CAG repeat sequence located at the same locus as that for MJD. Analysis of the mutation in these families shows a strong negative correlation between size of the expanded CAG repeat and age at onset of clinical disease. Instability of the expanded triplet repeat was not found to be affected by sex of the parent transmitting the mutation. Evidence was found for somatic and gonadal mosaicism for alleles carrying expanded trinucleotide repeats. PMID:7573040
GFP-Based Fluorescence Assay for CAG Repeat Instability in Cultured Human Cells

PubMed Central

Santillan, Beatriz A.; Moye, Christopher; Mittelman, David; Wilson, John H.

2014-01-01

Trinucleotide repeats can be highly unstable, mutating far more frequently than point mutations. Repeats typically mutate by addition or loss of units of the repeat. CAG repeat expansions in humans trigger neurological diseases that include myotonic dystrophy, Huntington disease, and several spinocerebellar ataxias. In human cells, diverse mechanisms promote CAG repeat instability, and in mice, the mechanisms of instability are varied and tissue-dependent. Dissection of mechanistic complexity and discovery of potential therapeutics necessitates quantitative and scalable screens for repeat mutation. We describe a GFP-based assay for screening modifiers of CAG repeat instability in human cells. The assay exploits an engineered intronic CAG repeat tract that interferes with expression of an inducible GFP minigene. Like the phenotypes of many trinucleotide repeat disorders, we find that GFP function is impaired by repeat expansion, in a length-dependent manner. The intensity of fluorescence varies inversely with repeat length, allowing estimates of repeat tract changes in live cells. We validate the assay using transcription through the repeat and engineered CAG-specific nucleases, which have previously been reported to induce CAG repeat instability. The assay is relatively fast and should be adaptable to large-scale screens of chemical and shRNA libraries. PMID:25423602
Simple Repeat-Primed PCR Analysis of the Myotonic Dystrophy Type 1 Gene in a Clinical Diagnostics Environment

PubMed Central

Dryland, Philippa A.; Doherty, Elaine; Love, Jennifer M.; Love, Donald R.

2013-01-01

Myotonic dystrophy type 1 is an autosomal dominant neuromuscular disorder that is caused by the expansion of a CTG trinucleotide repeat in the DMPK gene. The confirmation of a clinical diagnosis of DM-1 usually involves PCR amplification of the CTG repeat-containing region and subsequent sizing of the amplification products in order to deduce the number of CTG repeats. In the case of repeat hyperexpansions, Southern blotting is also used; however, the latter has largely been superseded by triplet repeat-primed PCR (TP-PCR), which does not yield a CTG repeat number but nevertheless provides a means of stratifying patients regarding their disease severity. We report here a combination of forward and reverse TP-PCR primers that allows for the simple and effective scoring of both the size of smaller alleles and the presence or absence of expanded repeat sequences. In addition, the CTG repeat-containing TP-PCR forward primer can target both the DM-1 and Huntington disease genes, thereby streamlining the work flow for confirmation of clinical diagnoses in a diagnostic laboratory. PMID:26317000
A study on the trinucleotide repeat associated with Huntington's disease in the Chinese

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bing-wen Soong; Jih-tsuu Wang

1994-09-01

Analysis of the polymorphic (CAG)n repeat in the huntingtin gene in the Chinese confirmed the presence of an expanded repeat on all Huntington's disease chromosomes. Measurement of the specific CAG repeat sequence in 34 HD chromosomes from 15 unrelated families and 190 control chromosomes from the Chinese population showed a range from 9 to 29 repeats in normal subjects and 40 to 58 in affected subjects. The size distributions of normal and affected alleles did not overlap. A clear correlation between early onset of symptoms and very high repeat number was seen, but the spread of the age-at-onset in the major repeat range producing characteristic HD is too wide to be of diagnostic value. There was also variability in the transmitted repeat size for both sexes in the HD size range. Maternal HD alleles showed a moderate instability with a preponderance of size decrease, while paternal HD alleles had a tendency to increase in repeat size on transmission, the degree of which appeared proportional to the initial size.
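
To make the repeat ranges in this abstract concrete, here is a minimal sketch that counts the longest uninterrupted CAG tract in a sequence and compares it with the ranges reported for this sample (9 to 29 in normal subjects, 40 to 58 in affected subjects). The function names and the toy sequences are hypothetical, and the ranges are the study's observations, not diagnostic cut-offs; actual sizing in such studies is typically done by PCR fragment analysis rather than by string matching.

```python
import re

# Ranges observed in this study's sample (from the abstract), not clinical thresholds.
NORMAL_RANGE = (9, 29)
AFFECTED_RANGE = (40, 58)


def longest_cag_run(seq):
    """Number of repeat units in the longest uninterrupted (CAG)n tract."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def compare_with_study(n_repeats):
    """Place a repeat count relative to the two observed ranges."""
    if NORMAL_RANGE[0] <= n_repeats <= NORMAL_RANGE[1]:
        return "within observed normal range"
    if AFFECTED_RANGE[0] <= n_repeats <= AFFECTED_RANGE[1]:
        return "within observed affected range"
    return "outside both observed ranges"


# Toy alleles (made up): flanking bases around a CAG tract.
for allele in ("TTC" + "CAG" * 17 + "CCG", "TTC" + "CAG" * 42 + "CCG"):
    n = longest_cag_run(allele)
    print(n, compare_with_study(n))
```

Because the two observed distributions do not overlap, counts of 30 to 39 fall into neither range in this sketch and are reported as such.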
Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae).

PubMed

Klabunde, Gustavo H F; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I; Nodari, Rubens O

2014-06-01

Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)-enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana.
Comparative analyses of simple sequence repeats (SSRs) in 23 mosquito species genomes: Identification, characterization and distribution (Diptera: Culicidae).

PubMed

Wang, Xiao-Ting; Zhang, Yu-Juan; Qiao, Liang; Chen, Bin

2018-02-27

Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species using Drosophila melanogaster as reference at the whole-genome level. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R² = 0.8992, P < 0.01), but the correlation in individual species varies in these mosquito species. In six types of SSR, mono- to trinucleotide SSRs are dominant with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare with 1.12%-4.22% and 3.76/Mb-40.23/Mb. The (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. The 10-20 bp length of SSRs are dominant with the number of 110 561 ± 93 482 and the frequency of 87.25% ± 5.73% on average, and the number and frequency decline with the increase of length. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). The mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). An average of 42.52% of total genes contains SSRs, and the preference for SSR occurrence in different gene subcategories are species-specific. The study provides useful insights into the SSR diversity, characteristics and distribution in 23 mosquito species of genomes. © 2018 Institute of Zoology, Chinese Academy of Sciences.
Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

PubMed

Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

2015-01-01

Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 %, and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic, with polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicating the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioica, Momordica cochinchinensis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
RTEL1 Inhibits Trinucleotide Repeat Expansions and Fragility

PubMed Central

Frizzell, Aisling; Nguyen, Jennifer H.G.; Petalcorin, Mark I.R.; Turner, Katherine D.; Boulton, Simon J.; Freudenreich, Catherine H.; Lahue, Robert S.

2018-01-01

SUMMARY Human RTEL1 is an essential, multifunctional helicase that maintains telomeres, regulates homologous recombination, and helps prevent bone marrow failure. Here, we show that RTEL1 also blocks trinucleotide repeat expansions, the causal mutation for 17 neurological diseases. Increased expansion frequencies of (CTG·CAG) repeats occurred in human cells following knockdown of RTEL1, but not the alternative helicase Fbh1, and purified RTEL1 efficiently unwound triplet repeat hairpins in vitro. The expansion-blocking activity of RTEL1 also required Rad18 and HLTF, homologs of yeast Rad18 and Rad5. These findings are reminiscent of budding yeast Srs2, which inhibits expansions, unwinds hairpins, and prevents triplet-repeat-induced chromosome fragility. Accordingly, we found expansions and fragility were suppressed in yeast srs2 mutants expressing RTEL1, but not Fbh1. We propose that RTEL1 serves as a human analog of Srs2 to inhibit (CTG·CAG) repeat expansions and fragility, likely by unwinding problematic hairpins. PMID:24561255
RTEL1 inhibits trinucleotide repeat expansions and fragility.

PubMed

Frizzell, Aisling; Nguyen, Jennifer H G; Petalcorin, Mark I R; Turner, Katherine D; Boulton, Simon J; Freudenreich, Catherine H; Lahue, Robert S

2014-03-13

Human RTEL1 is an essential, multifunctional helicase that maintains telomeres, regulates homologous recombination, and helps prevent bone marrow failure. Here, we show that RTEL1 also blocks trinucleotide repeat expansions, the causal mutation for 17 neurological diseases. Increased expansion frequencies of (CTG⋅CAG) repeats occurred in human cells following knockdown of RTEL1, but not the alternative helicase Fbh1, and purified RTEL1 efficiently unwound triplet repeat hairpins in vitro. The expansion-blocking activity of RTEL1 also required Rad18 and HLTF, homologs of yeast Rad18 and Rad5. These findings are reminiscent of budding yeast Srs2, which inhibits expansions, unwinds hairpins, and prevents triplet-repeat-induced chromosome fragility. Accordingly, we found expansions and fragility were suppressed in yeast srs2 mutants expressing RTEL1, but not Fbh1. We propose that RTEL1 serves as a human analog of Srs2 to inhibit (CTG⋅CAG) repeat expansions and fragility, likely by unwinding problematic hairpins. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
5′CAG and 5′CTG Repeats Create Differential Impediment to the Progression of a Minimal Reconstituted T4 Replisome Depending on the Concentration of dNTPs

PubMed Central

Delagoutte, Emmanuelle; Baldacci, Giuseppe

2011-01-01

Instability of repetitive sequences originates from strand misalignment during repair or replicative DNA synthesis. To investigate the activity of reconstituted T4 replisomes across trinucleotide repeats (TNRs) during leading strand DNA synthesis, we developed a method to build replication miniforks containing a TNR unit of defined sequence and length. Each minifork consists of three strands, primer, leading strand template, and lagging strand template with a 5′ single-stranded (ss) tail. Each strand is prepared independently, and the minifork is assembled by hybridization of the three strands. Using these miniforks and a minimal reconstituted T4 replisome, we show that during leading strand DNA synthesis, the dNTP concentration dictates which strand of the structure-forming 5′CAG/5′CTG repeat creates the strongest impediment to the minimal replication complex. We discuss this result in the light of the known fluctuation of dNTP concentration during the cell cycle and cell growth and the known concentration balance among individual dNTPs. PMID:22096622
Self-complementary circular codes in coding theory.

PubMed

Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz

2018-04-01

Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
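
The two properties discussed in this abstract can be checked mechanically for a given set of trinucleotides. The sketch below is an illustration only, not the authors' code. It assumes the graph construction of Fimmel et al. (2016) as I understand it (each trinucleotide b1b2b3 contributes the edges b1 → b2b3 and b1b2 → b3) and the criterion that a code is circular when this graph has no cycle. The function names are hypothetical, and the 20 trinucleotides in `X` are the set commonly listed for the code of Arquès and Michel (1996).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 20 trinucleotides usually given for the circular code X (assumed here).
X = {"AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
     "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC"}


def reverse_complement(word):
    return word.translate(COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the reverse complement of every word is also in the code."""
    return all(reverse_complement(w) in code for w in code)


def code_graph(code):
    """Directed graph with edges b1 -> b2b3 and b1b2 -> b3 for each word b1b2b3."""
    edges = {}
    for w in code:
        edges.setdefault(w[0], set()).add(w[1:])
        edges.setdefault(w[:2], set()).add(w[2])
    return edges


def is_acyclic(edges):
    """Depth-first search for a cycle; True if none is found."""
    WHITE, GREY, BLACK = 0, 1, 2
    state = {}

    def visit(v):
        state[v] = GREY
        for nxt in edges.get(v, ()):
            s = state.get(nxt, WHITE)
            if s == GREY:          # back edge: a cycle exists
                return False
            if s == WHITE and not visit(nxt):
                return False
        state[v] = BLACK
        return True

    return all(state.get(v, WHITE) != WHITE or visit(v) for v in list(edges))


print(is_self_complementary(X), is_acyclic(code_graph(X)))
print(is_acyclic(code_graph({"ACG", "CGA"})))  # two circular shifts of one word
```

A set containing two circular shifts of the same word, such as {ACG, CGA}, produces the cycle A → CG → A and is therefore rejected under this criterion.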
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.

PubMed

Widmer, G

1993-03-01

Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.
Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.

PubMed

Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

2017-01-01

Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs and show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variance (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. The developed microsatellite markers are of additional value to date palm characterization, as tools which can be used by researchers in population genetics, cultivar identification, and genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI), Sequence Read Archive (accession number LIBGSS_039019).
The structural basis of actinomycin D–binding induces nucleotide flipping out, a sharp bend and a left-handed twist in CGG triplet repeats

PubMed Central

Lo, Yu-Sheng; Tseng, Wen-Hsuan; Chuang, Chien-Ying; Hou, Ming-Hon

2013-01-01

The potent anticancer drug actinomycin D (ActD) functions by intercalating into DNA at GpC sites, thereby interrupting essential biological processes including replication and transcription. Certain neurological diseases are correlated with the expansion of (CGG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single G:G mispair. To characterize the binding of ActD to CGG triplet repeat sequences, the structural basis for the strong binding of ActD to neighbouring GpC sites flanking a G:G mismatch has been determined based on the crystal structure of ActD bound to ATGCGGCAT, which contains a CGG triplet sequence. The binding of ActD molecules to GCGGC causes many unexpected conformational changes including nucleotide flipping out, a sharp bend and a left-handed twist in the DNA helix via a two site-binding model. Heat denaturation, circular dichroism and surface plasmon resonance analyses showed that adjacent GpC sequences flanking a G:G mismatch are preferred ActD-binding sites. In addition, ActD was shown to bind the hairpin conformation of (CGG)16 in a pairwise combination and with greater stability than that of other DNA intercalators. Our results provide evidence of a possible biological consequence of ActD binding to CGG triplet repeat sequences. PMID:23408860
Prevalence of Huntington's disease gene CAG trinucleotide repeat alleles in patients with bipolar disorder.

PubMed

Ramos, Eliana Marisa; Gillis, Tammy; Mysore, Jayalakshmi S; Lee, Jong-Min; Alonso, Isabel; Gusella, James F; Smoller, Jordan W; Sklar, Pamela; MacDonald, Marcy E; Perlis, Roy H

2015-06-01

Huntington's disease is a neurodegenerative disorder characterized by motor, cognitive, and psychiatric symptoms that are caused by huntingtin gene (HTT) CAG trinucleotide repeat alleles of 36 or more units. A greater than expected prevalence of incompletely penetrant HTT CAG repeat alleles observed among individuals diagnosed with major depressive disorder raises the possibility that another mood disorder, bipolar disorder, could likewise be associated with Huntington's disease. We assessed the distribution of HTT CAG repeat alleles in a cohort of individuals with bipolar disorder. HTT CAG allele sizes from 2,229 Caucasian individuals diagnosed with DSM-IV bipolar disorder were compared to allele sizes in 1,828 control individuals from multiple cohorts. We found that HTT CAG repeat alleles > 35 units were observed in only one of 4,458 chromosomes from individuals with bipolar disorder, compared to three of 3,656 chromosomes from control subjects. These findings do not support an association between bipolar disorder and Huntington's disease. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
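
The chromosome counts in this abstract follow from two HTT alleles per individual, and the comparison of 1 in 4,458 versus 3 in 3,656 can be reproduced as a 2x2 table. The sketch below recomputes the counts and applies a two-sided Fisher exact test written with the standard library. The choice of test and the function name are assumptions for illustration; the abstract does not say which statistical method the authors used.

```python
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k "expanded" alleles in the first row.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - total), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Two chromosomes (HTT alleles) per individual.
bipolar_chr = 2 * 2229    # 4,458
control_chr = 2 * 1828    # 3,656
bipolar_exp, control_exp = 1, 3   # alleles with > 35 CAG units

print(bipolar_exp / bipolar_chr, control_exp / control_chr)
print(fisher_two_sided(bipolar_exp, bipolar_chr - bipolar_exp,
                       control_exp, control_chr - control_exp))
```

With counts this small, an exact test is a more natural choice than a chi-square approximation, which is why it is used in this sketch.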

Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

PubMed Central

Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396
Human mismatch repair protein hMutLα is required to repair short slipped-DNAs of trinucleotide repeats.

PubMed

Panigrahi, Gagan B; Slean, Meghan M; Simard, Jodie P; Pearson, Christopher E

2012-12-07

Mismatch repair (MMR) is required for proper maintenance of the genome by protecting against mutations. The mismatch repair system has also been implicated as a driver of certain mutations, including disease-associated trinucleotide repeat instability. We recently revealed a requirement of hMutSβ in the repair of short slip-outs containing a single CTG repeat unit (1). The involvement of other MMR proteins in short trinucleotide repeat slip-out repair is unknown. Here we show that hMutLα is required for the highly efficient in vitro repair of single CTG repeat slip-outs, to the same degree as hMutSβ. HEK293T cell extracts, deficient in hMLH1, are unable to process single-repeat slip-outs, but are functional when complemented with hMutLα. The MMR-deficient hMLH1 mutant, T117M, which has a point mutation proximal to the ATP-binding domain, is defective in slip-out repair, further supporting a requirement for hMLH1 in the processing of short slip-outs and possibly the involvement of hMLH1 ATPase activity. Extracts of hPMS2-deficient HEC-1-A cells, which express hMLH1, hMLH3, and hPMS1, are only functional when complemented with hMutLα, indicating that neither hMutLβ nor hMutLγ is sufficient to repair short slip-outs. The resolution of clustered short slip-outs, which are poorly repaired, was partially dependent upon a functional hMutLα. The joint involvement of hMutSβ and hMutLα suggests that repeat instability may be the result of aberrant outcomes of repair attempts.
Characterization of 10 new nuclear microsatellite markers in Acca sellowiana (Myrtaceae)

PubMed Central

Klabunde, Gustavo H. F.; Olkoski, Denise; Vilperte, Vinicius; Zucchi, Maria I.; Nodari, Rubens O.

2014-01-01

• Premise of the study: Microsatellite primers were identified and characterized in Acca sellowiana in order to expand the limited number of pre-existing polymorphic markers for use in population genetic studies for conservation, phylogeography, breeding, and domestication. • Methods and Results: A total of 10 polymorphic microsatellite primers were designed from clones obtained from a simple sequence repeat (SSR)–enriched genomic library. The primers amplified di- and trinucleotide repeats with four to 27 alleles per locus. In all tested populations, the observed heterozygosity ranged from 0.269 to 1.0. • Conclusions: These new polymorphic SSR markers will allow future genetic studies to be denser, either for genetic structure characterization of natural populations or for studies involving genetic breeding and domestication process in A. sellowiana. PMID:25202632
Triptycene: A Nucleic Acid Three-Way Junction Binder Scaffold

NASA Astrophysics Data System (ADS)

Yoon, Ina

Nucleic acids play a critical role in many biological processes such as gene regulation and replication. The development of small molecules that modulate nucleic acids with sequence or structure specificity would provide new strategies for regulating disease states at the nucleic acid level. However, this remains challenging mainly because of the nonspecific interactions between nucleic acids and small molecules. Three-way junctions are critical structural elements of nucleic acids. They are present in many important targets such as trinucleotide repeat junctions related to Huntington's disease, a temperature sensor sigma32 in E. coli, Dengue virus, and HIV. Triptycene-derived small molecules have been shown to bind to nucleic acid three-way junctions as a result of their shape complementarity. To develop a better understanding of designing molecules for targeting different junctions, a rapid screening of triptycene-based small molecules is needed. We envisioned that the installation of a linker at the C9 position of the bicyclic core would allow for rapid solid-phase diversification. To achieve this aim, we synthesized 9-substituted triptycene scaffolds by using two different synthetic routes. The first synthetic route installed the linker through an amidation reaction between the carboxylic acid at the C9 position of the triptycene and an amine linker, beta-alanine ethyl ester. This new 9-substituted triptycene scaffold was then attached to a 2-chlorotrityl chloride resin for solid-phase diversification. This enabled rapid diversification and easy purification of mono-, di-, and tri-peptide triptycene derivatives. The binding affinities of these compounds were investigated towards a d(CAG)·(CTG) trinucleotide repeat junction. In the modified second synthetic route, we utilized a combined Heck coupling/benzyne Diels-Alder strategy. 
This improved synthetic strategy reduced the number of steps and total reaction times, increased the overall yield, improved solubilities of intermediates, and provided a new regioisomer that was not observed in the previous synthesis. Through this investigation, we discovered new high-affinity lead compounds towards a d(CAG)·(CTG) trinucleotide repeat junction. In addition, we turned our attention to sigma 32 mRNA, which contains an RNA three-way junction in E. coli. We demonstrated that triptycene-based small molecules can modulate the heat shock response in E. coli.
De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora)

PubMed Central

Liu, Le; Zhang, Shijie; Lian, Chunlan

2015-01-01

Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers have been very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. Assembly yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) showed similarity to sequences in the NCBI database. The best matches in the NCBI Nr database were to Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which tri-nucleotide repeats were the most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping for P. densiflora. PMID:26690126
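The microsatellite search described in this abstract can be illustrated with a minimal sketch. It is not MISA's implementation; it is a plain regular-expression scan for perfect tandem repeats. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, not parameters reported by the authors.

```python
import re

# Hypothetical minimum number of repeat units per motif length
# (illustrative thresholds, not taken from the study).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for size, min_n in sorted(min_repeats.items()):
        # Group 1 captures one motif; the backreference requires at least
        # (min_n - 1) further identical copies directly after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AAA", "ATAT"),
            # so that a CAG tract is not also reported as a CAGCAG tract.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence invented for the example.
    toy = "TTT" + "CAG" * 7 + "TT" + "AG" * 8 + "C"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

On the toy sequence the scan reports a (CAG)7 tract at position 3 and an (AG)8 tract at position 26 (0-based). Real SSR tools also handle compound and interrupted repeats, which this sketch ignores.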
Possible reduced penetrance of expansion of 44 to 47 CAG/CAA repeats in the TATA-binding protein gene in spinocerebellar ataxia type 17.

PubMed

Oda, Masaya; Maruyama, Hirofumi; Komure, Osamu; Morino, Hiroyuki; Terasawa, Hideo; Izumi, Yuishin; Imamura, Tohru; Yasuda, Minoru; Ichikawa, Keiji; Ogawa, Masafumi; Matsumoto, Masayasu; Kawakami, Hideshi

2004-02-01

Spinocerebellar ataxia type 17 (SCA17) is an autosomal dominant cerebellar ataxia caused by expansion of CAG/CAA trinucleotide repeats in the TATA-binding protein (TBP) gene. Because the number of triplets in patients with SCA17 in previous studies ranged from 43 to 63, the normal number of trinucleotide units has been considered to be 42 or less. However, some healthy subjects in SCA17 pedigrees carry alleles with the same number of expanded repeats as patients with SCA17. Our objective was to investigate the minimum number of CAG/CAA repeats in the TBP gene that causes SCA17. We amplified the region of the TBP gene containing the CAG/CAA repeat by means of polymerase chain reaction and performed fragment and sequence analyses. The subjects included 734 patients with SCA (480 patients with sporadic SCA and 254 patients with familial SCA) without CAG repeat expansions at the SCA1, SCA2, Machado-Joseph disease, SCA6, SCA7, or dentatorubral-pallidoluysian atrophy loci, with 162 healthy subjects, 216 patients with Parkinson disease, and 195 with Alzheimer disease as control subjects. Eight patients with SCA possessed an allele with more than 43 CAG/CAA repeats. Among the non-SCA groups, alleles with 43 to 45 repeats were seen in 3 healthy subjects and 2 with Parkinson disease. In 1 SCA pedigree, a patient with possible SCA17 and her healthy sister had alleles with 45 repeats. A 34-year-old man carrying alleles with 47 and 44 repeats (47/44) had developed progressive cerebellar ataxia and myoclonus at 25 years of age, and he exhibited dementia and pyramidal signs. He was the only affected person in his pedigree, although his father and mother carried alleles with mildly expanded repeats (44/36 and 47/36, respectively). In another pedigree, 1 patient carried a 43-repeat allele, whereas another patient had 2 normal alleles, indicating that the 43-repeat allele may not be pathologic in this family. We estimate that 44 CAG/CAA repeats is the minimum number required to cause SCA17. 
However, the existence of unaffected subjects with mildly expanded triplets suggests that the TBP gene mutation may not penetrate fully. Homozygosity of alleles with mildly expanded triplet repeats in the TBP gene might contribute to the pathologic phenotype.
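The study sized the repeat by PCR followed by fragment and sequence analyses. Purely as an illustration of how repeat units could be counted once a sequence is in hand, the sketch below finds the longest uninterrupted run of CAG or CAA codons. The function name `longest_cag_caa_run` and the toy sequence are hypothetical and do not represent the TBP reference sequence or the authors' procedure.

```python
import re


def longest_cag_caa_run(seq):
    """Return the number of units in the longest uninterrupted CAG/CAA run."""
    # Each match is a maximal stretch made only of CAG and CAA triplets.
    runs = re.findall(r"(?:CAG|CAA)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


if __name__ == "__main__":
    # Invented toy allele: 3 CAG + 2 CAA + 40 CAG units between arbitrary flanks.
    toy_allele = "ATG" + "CAG" * 3 + "CAA" * 2 + "CAG" * 40 + "GGC"
    print(longest_cag_caa_run(toy_allele))
```

A count obtained this way says nothing about penetrance; as the abstract notes, mildly expanded alleles were also found in unaffected subjects.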
Identification and characterization of gene-based SSR markers in date palm (Phoenix dactylifera L.).

PubMed

Zhao, Yongli; Williams, Roxanne; Prakash, C S; He, Guohao

2012-12-15

Date palm (Phoenix dactylifera L.) is an important tree in the Middle East and North Africa due to the nutritional value of its fruit. Molecular breeding would accelerate genetic improvement of fruit trees through marker-assisted selection. However, the lack of molecular markers in date palm restricts the application of molecular breeding. In this study, we analyzed 28,889 EST sequences from the date palm genome database to identify simple-sequence repeats (SSRs) and to develop gene-based markers, i.e. expressed sequence tag-SSRs (EST-SSRs). We identified 4,609 ESTs as containing SSRs, among which trinucleotide motifs (69.7%) were the most common, followed by tetranucleotide (10.4%) and dinucleotide motifs (9.6%). The motif AG (85.7%) was most abundant in dinucleotides, while motifs AGG (26.8%), AAG (19.3%), and AGC (16.1%) were most common among trinucleotides. A total of 4,967 primer pairs were designed for EST-SSR markers from the computational data. In a follow-up laboratory study, we tested a sample of 20 randomly selected primer pairs for amplification and polymorphism detection using genomic DNA from date palm cultivars. Nearly one-third of these primer pairs detected DNA polymorphism that differentiated the twelve date palm cultivars used. Functional categorization of EST sequences containing SSRs revealed that 3,108 (67.4%) of such ESTs had homology with known proteins. Date palm EST sequences are a good resource for developing gene-based markers. The genic markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in date palm, such as diversity study, QTL mapping, and molecular breeding.
Evolution and function of CAG/polyglutamine repeats in protein–protein interaction networks

PubMed Central

Schaefer, Martin H.; Wanker, Erich E.; Andrade-Navarro, Miguel A.

2012-01-01

Expanded runs of consecutive trinucleotide CAG repeats encoding polyglutamine (polyQ) stretches are observed in the genes of a large number of patients with different genetic diseases such as Huntington's and several ataxias. Protein aggregation, which is a key feature of most of these diseases, is thought to be triggered by these expanded polyQ sequences in disease-related proteins. However, polyQ tracts are a normal feature of many human proteins, suggesting that they have an important cellular function. To clarify the potential function of polyQ repeats in biological systems, we systematically analyzed available information stored in sequence and protein interaction databases. By integrating genomic, phylogenetic, protein interaction network and functional information, we obtained evidence that polyQ tracts in proteins stabilize protein interactions. This happens most likely through structural changes whereby the polyQ sequence extends a neighboring coiled-coil region to facilitate its interaction with a coiled-coil region in another protein. Alteration of this important biological function due to polyQ expansion results in gain of abnormal interactions, leading to pathological effects like protein aggregation. Our analyses suggest that research on polyQ proteins should shift focus from expanded polyQ proteins to the characterization of the influence of the wild-type polyQ on protein interactions. PMID:22287626
Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

PubMed

Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

2013-04-01

Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multi-allelic simple sequence repeat (SSR) markers are particularly informative and superior for genetic linkage maps and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8%) were the most abundant motif type, followed by di- (26.1%), tetra- (11.5%), penta- (9.7%), and hexanucleotide (3.9%) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted, which found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of these EST-SSR markers to related species was also found.
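The abstract reports PIC (polymorphism information content) values but does not state the formula used. A commonly used definition (Botstein et al., 1980) is PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i. The sketch below assumes that definition; the function name `pic` and the example frequencies are illustrative and are not data from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker (Botstein-style definition)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability mass of identical-allele pairs.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term summed over every unordered pair of distinct alleles.
    cross_term = sum(2 * (p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical marker with two equally frequent alleles.
    print(pic([0.5, 0.5]))
```

For two equally frequent alleles this gives 0.375, and the value approaches 1 as the number of evenly distributed alleles grows, which is why multi-allelic SSR markers tend to score higher than bi-allelic ones.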
Development of highly polymorphic EST-SSR markers and segregation in F₁ hybrid population of Vitis vinifera L.

PubMed

Kayesh, E; Zhang, Y Y; Liu, G S; Bilkish, N; Sun, X; Leng, X P; Fang, J G

2013-09-23

The objectives of this investigation were to develop and validate the expressed sequence tag (EST)-simple sequence repeat (SSR) markers from large EST sequences, and to study the segregation and distribution of SSRs within two grapevine parental lines. In total, 94 F₁ lines crossed between "Early Rose" and "Red Globe" were studied. Approximately 2100 EST-SSR sequences of Vitis vinifera L. were searched for SSRs and analyzed for the design of polymerase chain reaction (PCR) primers amplifying the SSR-rich regions. Trinucleotide repeats were found to be the most abundant, followed by other nucleotide repeats. A total of 182 SSR primer pairs were first developed for the study on the parental polymorphism. Among the 182 SSR primers, 142 primer pairs (78%) could amplify the anticipated PCR products, among which only 52 primer pairs (36.62%) showed polymorphism between the two parents. These polymorphic bands were further surveyed among the 94 F₁ lines, and the results showed that a total of 162 bands were amplified, and 98 of them were polymorphic in both parents (60.49% polymorphism), with an average of 1.88 polymorphic DNA bands for each primer pair. After testing with the chi-square test, 33 of the clearly amplified polymorphic bands followed a 3:1 ratio, and 37 followed a 1:1 ratio. The rest showed distorted segregation ratios.
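The segregation test mentioned in this abstract can be sketched as a chi-square goodness-of-fit calculation. The abstract gives no band counts, so the counts below are invented; only the population size of 94 F₁ lines comes from the text. The function names and the use of the 5% critical value for one degree of freedom (3.841) are assumptions for illustration.

```python
# Critical chi-square value for 1 degree of freedom at the 0.05 level.
CRITICAL_DF1_ALPHA_05 = 3.841


def chi_square_stat(observed, ratio):
    """Goodness-of-fit statistic for observed counts against an expected ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_ratio(observed, ratio, critical=CRITICAL_DF1_ALPHA_05):
    """True if the counts do not deviate significantly from the ratio (two classes)."""
    return chi_square_stat(observed, ratio) < critical


if __name__ == "__main__":
    # Hypothetical band: present in 70 and absent in 24 of 94 F1 lines.
    counts = [70, 24]
    print(fits_ratio(counts, (3, 1)))  # compatible with 3:1
    print(fits_ratio(counts, (1, 1)))  # not compatible with 1:1
```

A band that fails both tests would fall into the group the abstract describes as showing distorted segregation.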
Statistical Enrichment of Epigenetic States Around Triplet Repeats that Can Undergo Expansions

PubMed Central

Essebier, Alexandra; Vera Wolf, Patricia; Cao, Minh Duc; Carroll, Bernard J.; Balasubramanian, Sureshkumar; Bodén, Mikael

2016-01-01

More than 30 human genetic diseases are linked to tri-nucleotide repeat expansions. There is no known mechanism that explains repeat expansions in full, but changes in the epigenetic state of the associated locus have been implicated in the disease pathology for a growing number of examples. A comprehensive comparative analysis of the genomic features associated with diverse repeat expansions has been lacking. Here, in an effort to decipher the propensity of repeats to undergo expansion and result in a disease state, we determine the genomic coordinates of tri-nucleotide repeat tracts at base pair resolution and computationally establish epigenetic profiles around them. Using three complementary statistical tests, we reveal that several epigenetic states are enriched around repeats that are associated with disease, even in cells that do not harbor expansion, relative to a carefully stratified background. Analysis of over one hundred cell types reveals that epigenetic states generally tend to vary widely between genic regions and cell types. However, there is qualified consistency in the epigenetic signatures of repeats associated with disease, suggesting that changes to the chromatin and the DNA around an expanding repeat locus are likely to be similar. These epigenetic signatures may be exploited further to develop models that could explain the propensity of repeats to undergo expansions. PMID:27013954
Environmental Stress Induces Trinucleotide Repeat Mutagenesis in Human Cells by Alt-Nonhomologous End Joining Repair.

PubMed

Chatterjee, Nimrat; Lin, Yunfu; Yotnda, Patricia; Wilson, John H

2016-07-31

Multiple pathways modulate the dynamic mutability of trinucleotide repeats (TNRs), which are implicated in neurodegenerative disease and evolution. Recently, we reported that environmental stresses induce TNR mutagenesis via stress responses and rereplication, with more than 50% of mutants carrying deletions or insertions, which are molecular signatures of DNA double-strand break repair. We now show that knockdown of alt-nonhomologous end joining (alt-NHEJ) components (XRCC1, LIG3, and PARP1) suppresses stress-induced TNR mutagenesis, in contrast to the components of homologous recombination and NHEJ, which have no effect. Thus, alt-NHEJ, which contributes to genetic mutability in cancer cells, also plays a novel role in environmental stress-induced TNR mutagenesis.
Small molecule alteration of RNA sequence in cells and animals.

PubMed

Guan, Lirui; Luo, Yiling; Ja, William W; Disney, Matthew D

2017-10-18

RNA regulation and maintenance are critical for proper cell function. Small molecules that specifically alter RNA sequence would be exceptionally useful as probes of RNA structure and function or as potential therapeutics. Here, we demonstrate a photochemical approach for altering the expanded trinucleotide repeat that causes myotonic muscular dystrophy type 1 (DM1), r(CUG)exp. The small molecule, 2H-4-Ru, binds to r(CUG)exp and converts guanosine residues to 8-oxo-7,8-dihydroguanosine upon photochemical irradiation. We demonstrate targeted modification upon irradiation in cell culture and in Drosophila larvae provided a diet containing 2H-4-Ru. Our results highlight a general chemical biology approach for altering RNA sequence in vivo by using small molecules and photochemistry. Furthermore, these studies show that addition of 8-oxo-G lesions into the 3' untranslated region of an RNA does not affect its steady-state levels.
Microsatellite markers for the yam bean Pachyrhizus (Fabaceae).

PubMed

Delêtre, Marc; Soengas, Beatriz; Utge, José; Lambourdière, Josie; Sørensen, Marten

2013-07-01

Microsatellite loci were developed for the understudied root crop yam bean (Pachyrhizus spp.) to investigate intraspecific diversity and interspecific relationships within the genus Pachyrhizus. Seventeen nuclear simple sequence repeat (SSR) markers with perfect di- and trinucleotide repeats were developed from 454 pyrosequencing of SSR-enriched genomic libraries. Loci were characterized in P. ahipa and wild and cultivated populations of four closely related species. All loci successfully cross-amplified and showed high levels of polymorphism, with number of alleles ranging from three to 12 and expected heterozygosity ranging from 0.095 to 0.831 across the genus. By enabling rapid assessment of genetic diversity in three native neotropical crops, P. ahipa, P. erosus, and P. tuberosus, and two wild relatives, P. ferrugineus and P. panamensis, these markers will allow exploration of the genetic diversity and evolutionary history of the genus Pachyrhizus.
New RNAi strategy for selective suppression of a mutant allele in polyglutamine disease.

PubMed

Kubodera, Takayuki; Yokota, Takanori; Ishikawa, Kinya; Mizusawa, Hidehiro

2005-12-01

In gene therapy of dominantly inherited diseases with small interfering RNA (siRNA), mutant allele specific suppression may be necessary for diseases in which the defective gene normally has an important role. It is difficult, however, to design a mutant allele-specific siRNA for trinucleotide repeat diseases in which the difference of sequences is only repeat length. To overcome this problem, we use a new RNA interference (RNAi) strategy for selective suppression of mutant alleles. Both mutant and wild-type alleles are inhibited by the most effective siRNA, and wild-type protein is restored using the wild-type mRNA modified to be resistant to the siRNA. Here, we applied this method to spinocerebellar ataxia type 6 (SCA6). We discuss its feasibility and problems for future gene therapy.
Validation of a screening tool for the rapid and reliable detection of CGG trinucleotide repeat expansions in FMR1.

PubMed

Basehore, Monica J; Marlowe, Natalia M; Jones, Julie R; Behlendorf, Deborah E; Laver, Thomas A; Friez, Michael J

2012-06-01

Most individuals with intellectual disability and/or autism are tested for Fragile X syndrome at some point in their lifetime. Greater than 99% of individuals with Fragile X have an expanded CGG trinucleotide repeat motif in the promoter region of the FMR1 gene, and diagnostic testing involves determining the size of the CGG repeat as well as methylation status when an expansion is present. Using a previously described triplet repeat-primed polymerase chain reaction, we have performed additional validation studies using two cohorts with previous diagnostic testing results available for comparison purposes. The first cohort (n=88) consisted of both males and females and had a high percentage of abnormal samples, while the second cohort (n=624) consisted of only females and was not enriched for expansion mutations. Data from each cohort were completely concordant with the results previously obtained during the course of diagnostic testing. This study further demonstrates the utility of using laboratory-developed triplet repeat-primed FMR1 testing in a clinical setting.
Triplet repeat RNA structure and its role as pathogenic agent and therapeutic target

PubMed Central

Krzyzosiak, Wlodzimierz J.; Sobczak, Krzysztof; Wojciechowska, Marzena; Fiszer, Agnieszka; Mykowska, Agnieszka; Kozlowski, Piotr

2012-01-01

This review presents detailed information about the structure of triplet repeat RNA and addresses the simple sequence repeats of normal and expanded lengths in the context of the physiological and pathogenic roles played in human cells. First, we discuss the occurrence and frequency of various trinucleotide repeats in transcripts and classify them according to the propensity to form RNA structures of different architectures and stabilities. We show that repeats capable of forming hairpin structures are overrepresented in exons, which implies that they may have important functions. We further describe long triplet repeat RNA as a pathogenic agent by presenting human neurological diseases caused by triplet repeat expansions in which mutant RNA gains a toxic function. Prominent examples of these diseases include myotonic dystrophy type 1 and fragile X-associated tremor ataxia syndrome, which are triggered by mutant CUG and CGG repeats, respectively. In addition, we discuss RNA-mediated pathogenesis in polyglutamine disorders such as Huntington's disease and spinocerebellar ataxia type 3, in which expanded CAG repeats may act as an auxiliary toxic agent. Finally, triplet repeat RNA is presented as a therapeutic target. We describe various concepts and approaches aimed at the selective inhibition of mutant transcript activity in experimental therapies developed for repeat-associated diseases. PMID:21908410
Transcription arrest by a G quadruplex forming-trinucleotide repeat sequence from the human c-myb gene.

PubMed

Broxson, Christopher; Beckett, Joshua; Tornaletti, Silvia

2011-05-17

Noncanonical DNA structures correspond to genomic regions particularly susceptible to genetic instability. The transcription process facilitates formation of these structures and plays a major role in generating the instability associated with these genomic sites. However, little is known about how noncanonical structures are processed when encountered by an elongating RNA polymerase. Here we have studied the behavior of T7 RNA polymerase (T7 RNAP) when encountering a G quadruplex-forming (GGA)4 repeat located in the human c-myb proto-oncogene. To make direct correlations between formation of the structure and effects on transcription, we have taken advantage of the ability of the T7 polymerase to transcribe single-stranded substrates and of G4 DNA to form in single-stranded G-rich sequences in the presence of potassium ions. Under physiological KCl concentrations, we found that T7 RNAP transcription was arrested at two sites that mapped to the c-myb (GGA)4 repeat sequence. The extent of arrest did not change with time, indicating that the c-myb repeat represented an absolute block and not a transient pause to T7 RNAP. Consistent with G4 DNA formation, arrest was not observed in the absence of KCl or in the presence of LiCl. Furthermore, mutations in the c-myb (GGA)4 repeat, expected to prevent transition to G4, also eliminated the transcription block. We show T7 RNAP arrest at the c-myb repeat in double-stranded DNA under conditions mimicking the cellular concentration of biomolecules and potassium ions, suggesting that the G4 structure formed in the c-myb repeat may represent a transcription roadblock in vivo. Our results support a mechanism of transcription-coupled DNA repair initiated by arrest of transcription at G4 structures.
Molecular analysis and test of linkage between the FMR-1 gene and infantile autism in multiplex families

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hallmayer, J.; Pintado, E.; Lotspeich, L.

Approximately 2%-5% of autistic children show cytogenetic evidence of the fragile X syndrome. This report tests whether infantile autism in multiplex autism families arises from an unusual manifestation of the fragile X syndrome. This could arise either by expansion of the (CGG)n trinucleotide repeat in FMR-1 or from a mutation elsewhere in the gene. We studied 35 families that met stringent criteria for multiplex autism. Amplification of the trinucleotide repeat and analysis of methylation status were performed in 79 autistic children and in 31 of their unaffected siblings by Southern blot analysis. No examples of amplified repeats were seen in the autistic or control children or in their parents or grandparents. We next examined the hypothesis that there was a mutation elsewhere in the FMR-1 gene, by linkage analysis in 32 of these families. We tested four different dominant models and a recessive model. Linkage to FMR-1 could be excluded (lod score between -24 and -62) in all models by using probes DXS548, FRAXAC1, and FRAXAC2 and the CGG repeat itself. Tests for heterogeneity in this sample were negative, and the occurrence of positive lod scores in this data set could be attributed to chance. Analysis of the data by the affected-sib method also did not show evidence for linkage of any marker to autism. These results enable us to reject the hypothesis that multiplex autism arises from expansion of the (CGG)n trinucleotide repeat in FMR-1. Further, because the overall lod scores for all probes in all models tested were highly negative, linkage to FMR-1 can also be ruled out in multiplex autistic families. 35 refs., 2 figs., 5 tabs.

Environmental stress induces trinucleotide repeat mutagenesis in human cells

PubMed Central

Chatterjee, Nimrat; Lin, Yunfu; Santillan, Beatriz A.; Yotnda, Patricia; Wilson, John H.

2015-01-01

The dynamic mutability of microsatellite repeats is implicated in the modification of gene function and disease phenotype. Studies of the enhanced instability of long trinucleotide repeats (TNRs)—the cause of multiple human diseases—have revealed a remarkable complexity of mutagenic mechanisms. Here, we show that cold, heat, hypoxic, and oxidative stresses induce mutagenesis of a long CAG repeat tract in human cells. We show that stress-response factors mediate the stress-induced mutagenesis (SIM) of CAG repeats. We show further that SIM of CAG repeats does not involve mismatch repair, nucleotide excision repair, or transcription, processes that are known to promote TNR mutagenesis in other pathways of instability. Instead, we find that these stresses stimulate DNA rereplication, increasing the proportion of cells with >4 C-value (C) DNA content. Knockdown of the replication origin-licensing factor CDT1 eliminates both stress-induced rereplication and CAG repeat mutagenesis. In addition, direct induction of rereplication in the absence of stress also increases the proportion of cells with >4C DNA content and promotes repeat mutagenesis. Thus, environmental stress triggers a unique pathway for TNR mutagenesis that likely is mediated by DNA rereplication. This pathway may impact normal cells as they encounter stresses in their environment or during development or abnormal cells as they evolve metastatic potential. PMID:25775519
Environmental stress induces trinucleotide repeat mutagenesis in human cells.

PubMed

Chatterjee, Nimrat; Lin, Yunfu; Santillan, Beatriz A; Yotnda, Patricia; Wilson, John H

2015-03-24

The dynamic mutability of microsatellite repeats is implicated in the modification of gene function and disease phenotype. Studies of the enhanced instability of long trinucleotide repeats (TNRs), the cause of multiple human diseases, have revealed a remarkable complexity of mutagenic mechanisms. Here, we show that cold, heat, hypoxic, and oxidative stresses induce mutagenesis of a long CAG repeat tract in human cells. We show that stress-response factors mediate the stress-induced mutagenesis (SIM) of CAG repeats. We show further that SIM of CAG repeats does not involve mismatch repair, nucleotide excision repair, or transcription, processes that are known to promote TNR mutagenesis in other pathways of instability. Instead, we find that these stresses stimulate DNA rereplication, increasing the proportion of cells with >4 C-value (C) DNA content. Knockdown of the replication origin-licensing factor CDT1 eliminates both stress-induced rereplication and CAG repeat mutagenesis. In addition, direct induction of rereplication in the absence of stress also increases the proportion of cells with >4C DNA content and promotes repeat mutagenesis. Thus, environmental stress triggers a unique pathway for TNR mutagenesis that likely is mediated by DNA rereplication. This pathway may impact normal cells as they encounter stresses in their environment or during development or abnormal cells as they evolve metastatic potential.
Genetic analysis of children of atomic bomb survivors.

PubMed Central

Satoh, C; Takahashi, N; Asakawa, J; Kodaira, M; Kuick, R; Hanash, S M; Neel, J V

1996-01-01

Studies are under way for the detection of potential genetic effects of atomic bomb radiation at the DNA level in the children of survivors. In a pilot study, we have examined six minisatellites and five microsatellites in DNA derived from 100 families including 124 children. We detected a total of 28 mutations in three minisatellite loci. The mean mutation rates per locus per gamete in the six minisatellite loci were 1.5% for 65 exposed gametes for which mean parental gonadal dose was 1.9 Sv and 2.0% for 183 unexposed gametes. We detected four mutations in two tetranucleotide repeat sequences but no mutations in three trinucleotide repeat sequences. The mean mutation rate per locus per gamete was 0% for the exposed gametes and 0.5% for the unexposed gametes in the five microsatellite loci. No significant differences in the mutation rates between the exposed and the unexposed gametes were detected in these repetitive sequences. Additional loci are being analyzed to increase the power of our study to observe a significant difference in the mutation rates at the 0.05 level of significance. PMID:8781374
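The per-locus, per-gamete minisatellite rates quoted above can be cross-checked with simple arithmetic. The sketch below assumes, which the abstract does not state, that every gamete was scored at all six minisatellite loci. It then searches for the split of the 28 mutations between exposed and unexposed gametes that reproduces the reported 1.5% and 2.0%. All names are illustrative.

```python
# Figures taken from the abstract.
TOTAL_MUTATIONS = 28
LOCI = 6
EXPOSED_GAMETES = 65
UNEXPOSED_GAMETES = 183


def rate_percent(mutations, gametes, loci):
    """Mean mutation rate per locus per gamete, in percent.

    Assumption: each gamete contributes one scored allele at every locus.
    """
    return 100.0 * mutations / (gametes * loci)


# Try every way of splitting the 28 mutations between the two groups and
# keep the splits whose rounded rates match the reported 1.5% and 2.0%.
consistent_splits = [
    (k, TOTAL_MUTATIONS - k)
    for k in range(TOTAL_MUTATIONS + 1)
    if round(rate_percent(k, EXPOSED_GAMETES, LOCI), 1) == 1.5
    and round(rate_percent(TOTAL_MUTATIONS - k, UNEXPOSED_GAMETES, LOCI), 1) == 2.0
]

print(consistent_splits)
```

Under that assumption the only consistent split is 6 mutations in exposed gametes and 22 in unexposed gametes. This is an inference from the rounded rates, not a figure reported in the abstract.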
RNA-binding Protein Trinucleotide repeat-containing 6A Regulates the Formation of Circular RNA 0006916, with Important Functions in Lung Cancer Cells.

PubMed

Dai, Xin; Zhang, Nan; Cheng, Ying; Yang, Ti; Chen, Yingnan; Liu, Zhenzhong; Wang, Zhishan; Yang, Chengfeng; Jiang, Yiguo

2018-05-03

Circular RNAs (circRNAs) are widespread and diverse endogenous RNAs distinct from traditional linear RNAs, which may regulate gene expression in eukaryotes. However, the function of human circRNAs, including their potential role in lung cancer, remains largely unknown. We screened the circRNA circ0006916, which was evidently down-regulated in 16HBE-T cells (anti-benzopyrene-trans-7, 8-dihydrodiol-9, 10-epoxide-transformed human bronchial epithelial cells), and in A549 and H460 cell lines. Silencing of circ0006916, but not its parental gene homer scaffolding protein 1 (HOMER1), promoted cell proliferation via speeding up the cell cycle process rather than by inhibiting apoptosis; conversely, overexpression of circ0006916 had the opposite effect. Luciferase screening assay indicated that circ0006916 bound to miR-522-3p and inhibited pleckstrin homology domain and leucine rich repeat protein phosphatase 1 (PHLPP1) activity. We also explored the effect of the RNA-binding protein trinucleotide repeat-containing 6A (TNRC6A) on circ0006916 production. Circ0006916 expression was decreased after silencing TNRC6A. TNRC6A bound to the intron regions around the circRNA-forming exons of circ0006916, as shown by RNA immunoprecipitation assay combined with sequencing analysis. The association of circ0006916 with TNRC6A was further verified by RNA pull-down assays. We then constructed a carrier and confirmed that TNRC6A binding to the flanked intron region of circ0006916 was necessary for generation of circ0006916. These results demonstrate that TNRC6A regulates the biogenesis of the circRNA circ0006916, which has a regulatory role in cell growth.
Elaeis oleifera Genomic-SSR Markers: Exploitation in Oil Palm Germplasm Diversity and Cross-Amplification in Arecaceae

PubMed Central

Zaki, Noorhariza Mohd; Singh, Rajinder; Rosli, Rozana; Ismail, Ismanizan

2012-01-01

Species-specific simple sequence repeat (SSR) markers are favored for genetic studies and marker-assisted selection (MAS) breeding for oil palm genetic improvement. This report characterizes 20 SSR markers from an Elaeis oleifera genomic library (gSSR). Characterization of the repeat type in 2000 sequences revealed a high percentage of di-nucleotides (63.6%), followed by tri-nucleotides (24.2%). Primer pairs were successfully designed for 394 of the E. oleifera gSSRs. Subsequent analysis showed the ability of the 20 selected E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The average Polymorphism Information Content (PIC) value for the SSRs was 0.402, with the tri-repeats showing the highest average PIC (0.626). Low values of observed heterozygosity (Ho) (0.164) and highly positive fixation indices (Fis) in the E. oleifera germplasm collection, compared to the E. guineensis, indicated an excess of homozygosity in E. oleifera. The transferability of the markers to closely related palms, Elaeis guineensis, Cocos nucifera and ornamental palms is also reported. Sequencing the amplicons of three selected E. oleifera gSSRs across both species and palm taxa revealed variations in the repeat-units. The study showed the potential of E. oleifera gSSR markers to reveal genetic diversity in the genus Elaeis. The markers are also a valuable genetic resource for studying E. oleifera and other genus in the Arecaceae family. PMID:22605966
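For readers unfamiliar with the marker statistics quoted above, the following minimal sketch computes expected heterozygosity and PIC from allele frequencies using the standard formula usually attributed to Botstein et al. (1980). The abstract does not say which estimator or software the authors used, so this is an assumption. The allele frequencies in the example are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity: He = 1 - sum(p_i^2)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism Information Content.

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    cross_term = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross_term += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - cross_term


# Hypothetical locus with three alleles.
example = [0.6, 0.3, 0.1]
print(expected_heterozygosity(example), pic(example))
```

PIC never exceeds expected heterozygosity, because the cross term is non-negative. For two equally frequent alleles the values are 0.5 and 0.375.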
Formation and Repair of Mismatches Containing Ribonucleotides and Oxidized Bases at Repeated DNA Sequences

PubMed Central

Cilli, Piera; Minoprio, Anna; Bossa, Cecilia; Bignami, Margherita; Mazzei, Filomena

2015-01-01

The cellular pool of ribonucleotide triphosphates (rNTPs) is higher than that of deoxyribonucleotide triphosphates. To ensure genome stability, DNA polymerases must discriminate against rNTPs and incorporated ribonucleotides must be removed by ribonucleotide excision repair (RER). We investigated DNA polymerase β (POL β) capacity to incorporate ribonucleotides into trinucleotide repeated DNA sequences and the efficiency of base excision repair (BER) and RER enzymes (OGG1, MUTYH, and RNase H2) when presented with an incorrect sugar and an oxidized base. POL β incorporated rAMP and rCMP opposite 7,8-dihydro-8-oxoguanine (8-oxodG) and extended both mispairs. In addition, POL β was able to insert and elongate an oxidized rGMP when paired with dA. We show that RNase H2 always preserves the capacity to remove a single ribonucleotide when paired to an oxidized base or to incise an oxidized ribonucleotide in a DNA duplex. In contrast, BER activity is affected by the presence of a ribonucleotide opposite an 8-oxodG. In particular, MUTYH activity on 8-oxodG:rA mispairs is fully inhibited, although its binding capacity is retained. This results in the reduction of RNase H2 incision capability of this substrate. Thus complex mispairs formed by an oxidized base and a ribonucleotide can compromise BER and RER in repeated sequences. PMID:26338705
Microsatellite markers for the yam bean Pachyrhizus (Fabaceae)

PubMed Central

Delêtre, Marc; Soengas, Beatriz; Utge, José; Lambourdière, Josie; Sørensen, Marten

2013-01-01

• Premise of the study: Microsatellite loci were developed for the understudied root crop yam bean (Pachyrhizus spp.) to investigate intraspecific diversity and interspecific relationships within the genus Pachyrhizus. • Methods and Results: Seventeen nuclear simple sequence repeat (SSR) markers with perfect di- and trinucleotide repeats were developed from 454 pyrosequencing of SSR-enriched genomic libraries. Loci were characterized in P. ahipa and wild and cultivated populations of four closely related species. All loci successfully cross-amplified and showed high levels of polymorphism, with number of alleles ranging from three to 12 and expected heterozygosity ranging from 0.095 to 0.831 across the genus. • Conclusions: By enabling rapid assessment of genetic diversity in three native neotropical crops, P. ahipa, P. erosus, and P. tuberosus, and two wild relatives, P. ferrugineus and P. panamensis, these markers will allow exploration of the genetic diversity and evolutionary history of the genus Pachyrhizus. PMID:25202568
Microsatellite loci in Vallisneria natans (Hydrocharitaceae) and cross-reactivity with V. spinulosa and V. denseserrulata.

PubMed

Wang, Bin; Liao, Hui; Zhao, Yao; Li, Wei; Song, Zhiping

2011-03-01

Microsatellite primers were characterized in Vallisneria natans, a dominant submerged macrophyte occurring in freshwater bodies of tropical and subtropical zones. Using the Microsatellite Sequence Enrichment protocol, 16 novel polymorphic codominant loci were developed and characterized in V. natans. In addition to these, six existing microsatellite loci from V. spinulosa were successfully amplified and characterized for V. natans. These primers amplified di- and trinucleotide repeats with 2-7 alleles per locus. Most primers also amplified successfully in V. spinulosa and V. denseserrulata. These results indicate the utility of primers in V. natans for future studies of population genetic structure, as well as their applicability across the genus.
Genome Survey Sequencing for the Characterization of the Genetic Background of Rosa roxburghii Tratt and Leaf Ascorbate Metabolism Genes.

PubMed

Lu, Min; An, Huaming; Li, Liangliang

2016-01-01

Rosa roxburghii Tratt is an important commercial horticultural crop in China that is recognized for its nutritional and medicinal values. In spite of its economic significance, genomic information on this rose species is currently unavailable. In the present research, a genome survey of R. roxburghii was carried out using next-generation sequencing (NGS) technologies. A total of 30.29 Gb of sequence data was obtained by HiSeq 2500 sequencing, and the estimated genome size of R. roxburghii was 480.97 Mb, with a guanine plus cytosine (GC) content of 38.63%. The reads were assembled into 627,554 contigs with an N50 length of 1.484 kb and 335,902 scaffolds with a total length of 409.36 Mb. Transposable element (TE) sequences totaling 90.84 Mb, comprising 29.20% of the genome, and 167,859 simple sequence repeats (SSRs) were identified from the scaffolds. Among these, the mono- (66.30%), di- (25.67%), and tri- (6.64%) nucleotide repeats contributed nearly 99% of the SSRs, and sequence motifs AG/CT (28.81%) and GAA/TTC (14.76%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. Genome analysis predicted a total of 22,721 genes with an average length of 2311.52 bp, an average exon length of 228.15 bp, and an average intron length of 401.18 bp. Eleven genes putatively involved in ascorbate metabolism were identified, and their expression in R. roxburghii leaves was validated by quantitative real-time PCR (qRT-PCR). This is the first report of genome-wide characterization of this rose species.
Crosstalk between MSH2-MSH3 and polβ promotes trinucleotide repeat expansion during base excision repair.

PubMed

Lai, Yanhao; Budworth, Helen; Beaver, Jill M; Chan, Nelson L S; Zhang, Zunzhen; McMurray, Cynthia T; Liu, Yuan

2016-08-22

Studies in knockout mice provide evidence that MSH2-MSH3 and the BER machinery promote trinucleotide repeat (TNR) expansion, yet how these two different repair pathways cause the mutation is unknown. Here we report the first molecular crosstalk mechanism, in which MSH2-MSH3 is used as a component of the BER machinery to cause expansion. On its own, pol β fails to copy TNRs during DNA synthesis, and bypasses them on the template strand to cause deletion. Remarkably, MSH2-MSH3 not only stimulates pol β to copy through the repeats but also enhances formation of the flap precursor for expansion. Our results provide direct evidence that MMR and BER, operating together, form a novel hybrid pathway that changes the outcome of TNR instability from deletion to expansion during the removal of oxidized bases. We propose that cells implement crosstalk strategies and share machinery when a canonical pathway is ineffective in removing a difficult lesion.
Crosstalk between MSH2–MSH3 and polβ promotes trinucleotide repeat expansion during base excision repair

PubMed Central

Lai, Yanhao; Budworth, Helen; Beaver, Jill M.; Chan, Nelson L. S.; Zhang, Zunzhen; McMurray, Cynthia T.; Liu, Yuan

2016-01-01

Studies in knockout mice provide evidence that MSH2–MSH3 and the BER machinery promote trinucleotide repeat (TNR) expansion, yet how these two different repair pathways cause the mutation is unknown. Here we report the first molecular crosstalk mechanism, in which MSH2–MSH3 is used as a component of the BER machinery to cause expansion. On its own, pol β fails to copy TNRs during DNA synthesis, and bypasses them on the template strand to cause deletion. Remarkably, MSH2–MSH3 not only stimulates pol β to copy through the repeats but also enhances formation of the flap precursor for expansion. Our results provide direct evidence that MMR and BER, operating together, form a novel hybrid pathway that changes the outcome of TNR instability from deletion to expansion during the removal of oxidized bases. We propose that cells implement crosstalk strategies and share machinery when a canonical pathway is ineffective in removing a difficult lesion. PMID:27546332
Transcriptome de novo assembly sequencing and analysis of the toxic dinoflagellate Alexandrium catenella using the Illumina platform.

PubMed

Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia

2014-03-10

In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
A Trial of Metformin in Individuals With Fragile X Syndrome

ClinicalTrials.gov

2018-06-05

Fragile X Syndrome; Fragile X Mental Retardation Syndrome; Mental Retardation, X Linked; Genetic Diseases, X-Linked; Trinucleotide Repeat Expansion; Fra(X) Syndrome; Intellectual Disability; FXS; Neurobehavioral Manifestations; Sex Chromosome Disorders
Analyses of Expressed Sequence Tags from Apple

PubMed Central

Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing

2006-01-01

The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485
TRStalker: an efficient heuristic for finding fuzzy tandem repeats.

PubMed

Pellegrini, Marco; Renda, M Elena; Vecchio, Alessio

2010-06-15

Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that are originated via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events. We have developed an algorithm (christened TRStalker) with the aim of detecting efficiently TRs that are hard to detect because of their inherent fuzziness, due to high levels of base substitutions, insertions and deletions. To attain this goal, we developed heuristics to solve a Steiner version of the problem for which the fuzziness is measured with respect to a motif string not necessarily present in the input string. This problem is akin to the 'generalized median string' that is known to be an NP-hard problem. Experiments with both synthetic and biological sequences demonstrate that our method performs better than current state of the art for fuzzy TRs and that the fuzzy TRs of the type we detect are indeed present in important biological sequences. TRStalker will be integrated in the web-based TRs Discovery Service (TReaDS) at bioalgo.iit.cnr.it. Supplementary data are available at Bioinformatics online.
Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.
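An in silico SSR survey of the kind described above amounts to scanning a sequence for short motifs repeated in tandem at least a minimum number of times. The sketch below is a simplified regular-expression scan. It is not the MISA program, and the minimum repeat counts (10 for mono-, 6 for di-, 5 for trinucleotides) are assumed thresholds chosen for illustration, not values taken from the abstract.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for perfect mono- to trinucleotide SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip di-/trinucleotide "motifs" that are really homopolymers.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, copies))
    return hits


# Hypothetical test sequence with one (AG)7 and one (GAA)5 tract.
demo = "TT" + "AG" * 7 + "CC" + "GAA" * 5 + "T"
print(find_ssrs(demo))
```

Positions are 0-based. A production tool would also handle compound and imperfect repeats and merge motifs with their reverse complements (for example AG with CT), which this sketch does not do.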
A novel dysfunctional germline P53 mutation identified in a family with Li-Fraumeni syndrome.

PubMed

Ji, Min; Wang, Lin; Shao, Yuguo; Cao, Wei; Xu, Ting; Chen, Shujie; Wang, Zhiwei; He, Qi; Yang, Kuo

2018-01-01

Li-Fraumeni Syndrome (LFS), which is a rare dominantly inherited cancer predisposition syndrome, is associated with germline P53 mutations. Mutations of the tumor suppressor protein P53 are associated with more than 50% of human cancers; however, almost 30% of P53 mutations occur rarely and this has raised questions about their significance. It therefore appeared of particular interest that we identified a novel mutation in a patient suffering from breast cancer and fulfilling the diagnostic criteria of LFS. In this study, a patient with remarkable family history developed breast cancer and was diagnosed with LFS. By performing next-generation sequencing on the patient and subsequent verification by Sanger sequencing among other family members, a new germ-line P53 replication error, a trinucleotide repeat mutation in the coding region, was identified in two generations of this Li-Fraumeni family.
Genetic Contributors to Intergenerational CAG Repeat Instability in Huntington’s Disease Knock-In Mice

PubMed Central

Neto, João Luís; Lee, Jong-Min; Afridi, Ali; Gillis, Tammy; Guide, Jolene R.; Dempsey, Stephani; Lager, Brenda; Alonso, Isabel; Wheeler, Vanessa C.; Pinto, Ricardo Mouro

2017-01-01

Huntington’s disease (HD) is a neurodegenerative disorder caused by the expansion of a CAG trinucleotide repeat in exon 1 of the HTT gene. Longer repeat sizes are associated with increased disease penetrance and earlier ages of onset. Intergenerationally unstable transmissions are common in HD families, partly underlying the genetic anticipation seen in this disorder. HD CAG knock-in mouse models also exhibit a propensity for intergenerational repeat size changes. In this work, we examine intergenerational instability of the CAG repeat in over 20,000 transmissions in the largest HD knock-in mouse model breeding datasets reported to date. We confirmed previous observations that parental sex drives the relative ratio of expansions and contractions. The large datasets further allowed us to distinguish effects of paternal CAG repeat length on the magnitude and frequency of expansions and contractions, as well as the identification of large repeat size jumps in the knock-in models. Distinct degrees of intergenerational instability were observed between knock-in mice of six background strains, indicating the occurrence of trans-acting genetic modifiers. We also found that lines harboring a neomycin resistance cassette upstream of Htt showed reduced expansion frequency, indicative of a contributing role for sequences in cis, with the expanded repeat as modifiers of intergenerational instability. These results provide a basis for further understanding of the mechanisms underlying intergenerational repeat instability. PMID:27913616
Genetic Contributors to Intergenerational CAG Repeat Instability in Huntington's Disease Knock-In Mice.

PubMed

Neto, João Luís; Lee, Jong-Min; Afridi, Ali; Gillis, Tammy; Guide, Jolene R; Dempsey, Stephani; Lager, Brenda; Alonso, Isabel; Wheeler, Vanessa C; Pinto, Ricardo Mouro

2017-02-01

Huntington's disease (HD) is a neurodegenerative disorder caused by the expansion of a CAG trinucleotide repeat in exon 1 of the HTT gene. Longer repeat sizes are associated with increased disease penetrance and earlier ages of onset. Intergenerationally unstable transmissions are common in HD families, partly underlying the genetic anticipation seen in this disorder. HD CAG knock-in mouse models also exhibit a propensity for intergenerational repeat size changes. In this work, we examine intergenerational instability of the CAG repeat in over 20,000 transmissions in the largest HD knock-in mouse model breeding datasets reported to date. We confirmed previous observations that parental sex drives the relative ratio of expansions and contractions. The large datasets further allowed us to distinguish effects of paternal CAG repeat length on the magnitude and frequency of expansions and contractions, as well as the identification of large repeat size jumps in the knock-in models. Distinct degrees of intergenerational instability were observed between knock-in mice of six background strains, indicating the occurrence of trans-acting genetic modifiers. We also found that lines harboring a neomycin resistance cassette upstream of Htt showed reduced expansion frequency, indicative of a contributing role for sequences in cis, with the expanded repeat as modifiers of intergenerational instability. These results provide a basis for further understanding of the mechanisms underlying intergenerational repeat instability. Copyright © 2017 by the Genetics Society of America.
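Intergenerational instability is scored by comparing the repeat length inherited by an offspring with that of the transmitting parent. The sketch below shows one way to measure the longest uninterrupted CAG tract in a sequence and to label a transmission as an expansion, a contraction, or stable. The function names and the example sequence are hypothetical, and the study itself sized repeats experimentally rather than from sequence strings.

```python
import re


def longest_tract(seq, unit="CAG"):
    """Number of copies in the longest uninterrupted tandem run of `unit`."""
    pattern = re.compile("(?:%s)+" % re.escape(unit.upper()))
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(seq.upper())]
    return max(runs, default=0)


def classify_transmission(parent_repeats, offspring_repeats):
    """Label a parent-to-offspring repeat length change."""
    change = offspring_repeats - parent_repeats
    if change > 0:
        return "expansion"
    if change < 0:
        return "contraction"
    return "stable"


# Hypothetical allele: five CAGs, an interrupting CAA, then three CAGs.
allele = "AT" + "CAG" * 5 + "CAA" + "CAG" * 3 + "T"
print(longest_tract(allele))
print(classify_transmission(110, 113))
```

Tallying these labels separately for maternal and paternal transmissions gives the expansion-to-contraction ratio that the abstract reports as depending on parental sex.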
Expanded CAG/CTG Repeat DNA Induces a Checkpoint Response That Impacts Cell Proliferation in Saccharomyces cerevisiae

PubMed Central

Sundararajan, Rangapriya; Freudenreich, Catherine H.

2011-01-01

Repetitive DNA elements are mutational hotspots in the genome, and their instability is linked to various neurological disorders and cancers. Although it is known that expanded trinucleotide repeats can interfere with DNA replication and repair, the cellular response to these events has not been characterized. Here, we demonstrate that an expanded CAG/CTG repeat elicits a DNA damage checkpoint response in budding yeast. Using microcolony and single cell pedigree analysis, we found that cells carrying an expanded CAG repeat frequently experience protracted cell division cycles, persistent arrests, and morphological abnormalities. These phenotypes were further exacerbated by mutations in DSB repair pathways, including homologous recombination and end joining, implicating a DNA damage response. Cell cycle analysis confirmed repeat-dependent S phase delays and G2/M arrests. Furthermore, we demonstrate that the above phenotypes are due to the activation of the DNA damage checkpoint, since expanded CAG repeats induced the phosphorylation of the Rad53 checkpoint kinase in a rad52Δ recombination deficient mutant. Interestingly, cells mutated for the MRX complex (Mre11-Rad50-Xrs2), a central component of DSB repair which is required to repair breaks at CAG repeats, failed to elicit repeat-specific arrests, morphological defects, or Rad53 phosphorylation. We therefore conclude that damage at expanded CAG/CTG repeats is likely sensed by the MRX complex, leading to a checkpoint response. Finally, we show that repeat expansions preferentially occur in cells experiencing growth delays. Activation of DNA damage checkpoints in repeat-containing cells could contribute to the tissue degeneration observed in trinucleotide repeat expansion diseases. PMID:21437275

Identification of Expanded Alleles of the "FMR1" Gene in the CHildhood Autism Risks from Genes and Environment (CHARGE) Study

ERIC Educational Resources Information Center

Tassone, Flora; Choudhary, Nimrah S.; Tassone, Federica; Durbin-Johnson, Blythe; Hansen, Robin; Hertz-Picciotto, Irva; Pessah, Isaac

2013-01-01

Fragile X syndrome (FXS) is a neuro-developmental disorder characterized by intellectual disabilities and autism spectrum disorders (ASD). Expansion of a CGG trinucleotide repeat (greater than 200 repeats) in the 5'UTR of the fragile X mental retardation gene, is the single most prevalent cause of cognitive disabilities. Several screening studies…
Small interfering RNAs based on huntingtin trinucleotide repeats are highly toxic to cancer cells.

PubMed

Murmann, Andrea E; Gao, Quan Q; Putzbach, William E; Patel, Monal; Bartom, Elizabeth T; Law, Calvin Y; Bridgeman, Bryan; Chen, Siquan; McMahon, Kaylin M; Thaxton, C Shad; Peter, Marcus E

2018-03-01

Trinucleotide repeat (TNR) expansions in the genome cause a number of degenerative diseases. A prominent TNR expansion involves the triplet CAG in the huntingtin (HTT) gene responsible for Huntington's disease (HD). Pathology is caused by protein and RNA generated from the TNR regions including small siRNA-sized repeat fragments. An inverse correlation between the length of the repeats in HTT and cancer incidence has been reported for HD patients. We now show that siRNAs based on the CAG TNR are toxic to cancer cells by targeting genes that contain long reverse complementary TNRs in their open reading frames. Of the 60 siRNAs based on the different TNRs, the six members in the CAG/CUG family of related TNRs are the most toxic to both human and mouse cancer cells. siCAG/CUG TNR-based siRNAs induce cell death in vitro in all tested cancer cell lines and slow down tumor growth in a preclinical mouse model of ovarian cancer with no signs of toxicity to the mice. We propose to explore TNR-based siRNAs as a novel form of anticancer reagents. © 2018 The Authors.
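The counts in this abstract (60 TNR-based siRNAs, six members in the CAG/CUG family) are consistent with a simple enumeration: 64 possible RNA triplets minus the four homopolymers, and the three frame shifts of CAG together with the three frame shifts of its reverse complement CUG. The sketch below reproduces that counting. It is one reading of the abstract, not the authors' stated procedure, and the helper names are illustrative.

```python
from itertools import product

BASES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def rotations(unit):
    """All frame shifts (circular permutations) of a repeat unit."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def reverse_complement(unit):
    """Reverse complement of an RNA repeat unit."""
    return "".join(COMPLEMENT[base] for base in reversed(unit))


def repeat_family(unit):
    """Frame shifts of a unit plus frame shifts of its reverse complement."""
    return rotations(unit) | rotations(reverse_complement(unit))


# All triplets except the four homopolymers (AAA, CCC, GGG, UUU).
tnr_units = [
    "".join(p) for p in product(BASES, repeat=3) if len(set(p)) > 1
]

print(len(tnr_units))
print(sorted(repeat_family("CAG")))
```

The enumeration yields 60 repeat units, and the CAG family contains CAG, AGC, GCA, CUG, UGC and GCU.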
Physical organisation of simple sequence repeats (SSRs) in Triticeae: structural, functional and evolutionary implications.

PubMed

Cuadrado, A; Cardoso, M; Jouve, N

2008-01-01

A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs) or microsatellites. This type of sequence has sparked great interest as a means of studying genetic variation, linkage mapping, gene tagging and evolution. Although SSRs at different positions in a gene help determine the regulation of expression and the function of the protein produced, little attention has been paid to the chromosomal organisation and distribution of these sequences, even in model species. This review discusses the main achievements in the characterisation of long-range SSR organisation in the chromosomes of Triticum aestivum L., Secale cereale L., and Hordeum vulgare L. (all members of Triticeae). We have detected SSRs using an improved FISH technique based on the random primer labelling of synthetic oligonucleotides (15-24 bases) in multi-colour experiments. Detailed information on the presence and distribution of AC, AG and all the possible classes of trinucleotide repeats has been acquired. These data have revealed the motif-dependent and non-random chromosome distributions of SSRs in the different genomes, and allowed the correlation of particular SSRs with chromosome areas characterised by specific features (e.g., heterochromatin, euchromatin and centromeres) in all three species. The present review provides a detailed comparative study of the distribution of these SSRs in each of the seven chromosomes of the genomes A, B and D of wheat, H of barley and R of rye. The importance of SSRs in plant breeding and their possible role in chromosome structure, function and evolution is discussed. 2008 S. Karger AG, Basel
DOE Office of Scientific and Technical Information (OSTI.GOV)

Hirst, M.; Grewal, P.; Flannery, A.

Screening of families clinically ascertained for the fragile X syndrome phenotype revealed two mentally impaired males who were cytogenetically negative for the fragile X chromosome. In both cases, screening for the FMR1 trinucleotide expansion mutation revealed a rearrangement within the FMR1 gene. In the first case, a 660-bp deletion is present in 40% of peripheral lymphocytes. PCR and sequence analysis revealed it to include the CpG island and the CGG trinucleotide repeat, thus removing the FMR1 promoter region and putative mRNA start site. In the second case, PCR analysis demonstrated that a deletion extended from a point proximal to FMR1 to 25 kb into the gene, removing all the region 5′ to exon 11. The distal breakpoint was confirmed by Southern blot analysis and localized to a 600-bp region, and FMR1-mRNA analysis in a cell line established from this individual confirmed the lack of a transcript. These deletion patients provide further confirmatory evidence that loss of FMR1 gene expression is indeed responsible for mental retardation. Additionally, these cases highlight the need for the careful examination of the FMR1 gene, even in the absence of cytogenetic expression, particularly when several fragile X-like clinical features are present. 31 refs., 6 figs.
A gene (ETM) for essential tremor maps to chromosome 2p22-p25.

PubMed

Higgins, J J; Pho, L T; Nee, L E

1997-11-01

We report the results of linkage analysis in a large American family of Czech descent with dominantly inherited "pure" essential tremor (ET) and genetic anticipation. Genetic loci on chromosome 2p22-p25 establish linkage to this region with a maximum LOD score (Zmax) = 5.92 for the locus, D2S272. Obligate recombinant events place the ETM gene in a 15-cM candidate interval between the genetic loci D2S168 and D2S224. Repeat expansion detection analysis suggests that expanded CAG trinucleotide sequences are associated with ET. These findings will facilitate the search for an ETM gene and may further our understanding of the human motor system.
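As a reading aid: a LOD score is the base-10 logarithm of the ratio between the likelihood of the data under linkage and under no linkage, so Zmax = 5.92 corresponds to odds of roughly 830,000 to 1 in favour of linkage. The threshold of 3 used below is the conventional rule of thumb for declaring linkage, not a value quoted in the abstract.

```python
def lod_to_odds(lod):
    """Convert a LOD score to the likelihood ratio it represents."""
    return 10 ** lod

z_max = 5.92                # maximum LOD score reported for locus D2S272
conventional_threshold = 3  # assumed rule-of-thumb cutoff, not from the abstract

odds = lod_to_odds(z_max)
print(f"odds in favour of linkage: about {odds:,.0f} to 1")
print("exceeds conventional threshold:", z_max > conventional_threshold)
```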
Formation and Repair of Mismatches Containing Ribonucleotides and Oxidized Bases at Repeated DNA Sequences.

PubMed

Cilli, Piera; Minoprio, Anna; Bossa, Cecilia; Bignami, Margherita; Mazzei, Filomena

2015-10-23

The cellular pool of ribonucleotide triphosphates (rNTPs) is higher than that of deoxyribonucleotide triphosphates. To ensure genome stability, DNA polymerases must discriminate against rNTPs and incorporated ribonucleotides must be removed by ribonucleotide excision repair (RER). We investigated DNA polymerase β (POL β) capacity to incorporate ribonucleotides into trinucleotide repeated DNA sequences and the efficiency of base excision repair (BER) and RER enzymes (OGG1, MUTYH, and RNase H2) when presented with an incorrect sugar and an oxidized base. POL β incorporated rAMP and rCMP opposite 7,8-dihydro-8-oxoguanine (8-oxodG) and extended both mispairs. In addition, POL β was able to insert and elongate an oxidized rGMP when paired with dA. We show that RNase H2 always preserves the capacity to remove a single ribonucleotide when paired to an oxidized base or to incise an oxidized ribonucleotide in a DNA duplex. In contrast, BER activity is affected by the presence of a ribonucleotide opposite an 8-oxodG. In particular, MUTYH activity on 8-oxodG:rA mispairs is fully inhibited, although its binding capacity is retained. This results in the reduction of RNase H2 incision capability of this substrate. Thus complex mispairs formed by an oxidized base and a ribonucleotide can compromise BER and RER in repeated sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Development and application of microsatellites in candidate genes related to wood properties in the Chinese white poplar (Populus tomentosa Carr.).

PubMed

Du, Qingzhang; Gong, Chenrui; Pan, Wei; Zhang, Deqiang

2013-02-01

Gene-derived simple sequence repeats (genic SSRs), also known as functional markers, are often preferred over random genomic markers because they represent variation in gene coding and/or regulatory regions. We characterized 544 genic SSR loci derived from 138 candidate genes involved in wood formation, distributed throughout the genome of Populus tomentosa, a key ecological and cultivated wood production species. Of these SSRs, three-quarters were located in the promoter or intron regions, and dinucleotide (59.7%) and trinucleotide repeat motifs (26.5%) predominated. By screening 15 wild P. tomentosa ecotypes, we identified 188 polymorphic genic SSRs with 861 alleles, 2-7 alleles for each marker. Transferability analysis of 30 random genic SSRs, testing whether these SSRs work in 26 genotypes of five genus Populus sections (outgroup, Salix matsudana), showed that 72% of the SSRs could be amplified in Turanga and 100% could be amplified in Leuce. Based on genotyping of these 26 genotypes, a neighbour-joining analysis showed the expected six phylogenetic groupings. In silico analysis of SSR variation in 220 sequences that are homologous between P. tomentosa and Populus trichocarpa suggested that genic SSR variations between relatives were predominantly affected by repeat motif variations or flanking sequence mutations. Inheritance tests and single-marker associations demonstrated the power of genic SSRs in family-based linkage mapping and candidate gene-based association studies, as well as marker-assisted selection and comparative genomic studies of P. tomentosa and related species.
De novo assembly of pen shell (Atrina pectinata) transcriptome and screening of its genic microsatellites

NASA Astrophysics Data System (ADS)

Sun, Xiujun; Li, Dongming; Liu, Zhihong; Zhou, Liqing; Wu, Biao; Yang, Aiguo

2017-10-01

The pen shell (Atrina pectinata) is a large wedge-shaped bivalve belonging to the family Pinnidae. Owing to its large and nutritious adductor muscle, it is a popular seafood with high commercial value in Asia-Pacific countries. However, limited genomic and transcriptomic data have hampered genetic investigation of the species. In this study, the transcriptome of A. pectinata was deeply sequenced using Illumina paired-end sequencing technology. After assembly, a total of 127263 unigenes were obtained. Functional annotation indicated that the highest percentage of unigenes (18.60%) was annotated in the GO database, followed by 18.44% in the PFAM database and 17.04% in the NR database. There were 270 biological pathways matched with those in the KEGG database. Furthermore, a total of 23452 potential simple sequence repeats (SSRs) were identified, of which the most abundant type was mono-nucleotide repeats (12902, 55.01%), followed by di-nucleotide (8132, 34.68%), tri-nucleotide (2010, 8.57%), tetra-nucleotide (401, 1.71%), and penta-nucleotide (7, 0.03%) repeats. Sixty SSRs were selected for validating and developing genic SSR markers, of which 23 showed polymorphism in a cultured population, with average observed and expected heterozygosities of 0.412 and 0.579, respectively. In this study, we established the first comprehensive transcript dataset of A. pectinata genes. Our results demonstrated that RNA-Seq is a fast and cost-effective method for genic SSR development in non-model species.
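The SSR class counts and percentages reported above are internally consistent, which a short check makes explicit. The counts are taken from the abstract; the dictionary keys and variable names are illustrative.

```python
# Counts of each SSR class as reported in the abstract.
ssr_counts = {
    "mono": 12902,
    "di": 8132,
    "tri": 2010,
    "tetra": 401,
    "penta": 7,
}

total = sum(ssr_counts.values())
# Share of each class as a percentage of all SSRs, rounded to two decimals.
shares = {name: round(100 * n / total, 2) for name, n in ssr_counts.items()}

print(total)   # 23452
print(shares)  # {'mono': 55.01, 'di': 34.68, 'tri': 8.57, 'tetra': 1.71, 'penta': 0.03}
```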
Multiplexed microsatellite recovery using massively parallel sequencing

USGS Publications Warehouse

Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

2011-01-01

Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
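The closing cost estimate can be examined with simple arithmetic. The sketch below uses only the figures in the abstract (1373 organisms, an upper bound of $400 per species, a $0.5M total). It suggests that the two statements agree when the actual per-species cost is somewhat below the $400 bound, at about $364 or less.

```python
species = 1373                 # listed organisms, from the abstract
budget_usd = 500_000           # the "$0.5M" total
upper_bound_per_species = 400  # the "less than $400" bound

# Total if every species cost the full upper bound.
worst_case_total = species * upper_bound_per_species
# Per-species cost at which the total exactly meets the budget.
break_even_cost = budget_usd / species

print(worst_case_total)           # 549200
print(round(break_even_cost, 2))  # 364.17
```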
Proteins containing expanded polyglutamine tracts and neurodegenerative disease

PubMed Central

Adegbuyiro, Adewale; Sedighi, Faezeh; Pilkington, Albert W.; Groover, Sharon; Legleiter, Justin

2017-01-01

Several hereditary neurological and neuromuscular diseases are caused by an abnormal expansion of trinucleotide repeats. To date, there have been ten of these trinucleotide repeat disorders associated with an expansion of the codon CAG encoding glutamine (Q). For these polyglutamine (polyQ) diseases, there is a critical threshold length of the CAG repeat required for disease, and further expansion beyond this threshold is correlated with age of onset and symptom severity. PolyQ expansion in the translated proteins promotes their self-assembly into a variety of oligomeric and fibrillar aggregate species that accumulate into the hallmark proteinaceous inclusion bodies associated with each disease. Here, we review aggregation mechanisms of proteins with expanded polyQ tracts, structural consequences of expanded polyQ ranging from monomers to fibrillar aggregates, the impact of protein context and post-translational modifications on aggregation, and a potential role for lipid membranes in aggregation. As the pathogenic mechanisms that underlie these disorders are often classified as either a gain of toxic function or loss of normal protein function, some toxic mechanisms associated with mutant polyQ tracts will also be discussed. PMID:28170216
Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes.

PubMed

An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

2017-09-11

Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, the genomic information on this species is currently unknown. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb sequence data of L. cylindrica, about 54.94× coverage of the estimated genome size of 789.97 Mb, were obtained from HiSeq 2500 sequencing, in which the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with 525 bp N50 length and 1,410,117 scaffolds (>200 bp) with 885.01 Mb total length were obtained. From the initial assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of SSR repeats included 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both "Zheda 23" and "Zheda 83". Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. The unweighted pair group method analysis (UPGMA) dendrogram tree was built by calculating the SSR-HRM raw data. SSR-HRM could be effectively used for genotype relationship analysis of Luffa species.
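The abstract does not name the software used to identify SSRs. As a rough illustration of what a "≥5 repeats" criterion for di- to hexa-nucleotide motifs means, the following sketch finds perfect tandem repeats with a regular expression. The function name, the default motif sizes, and the toy sequence are assumptions made for this example, not the authors' pipeline.

```python
import re

def find_ssrs(seq, min_repeats=5, motif_sizes=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq."""
    hits = []
    for size in motif_sizes:
        # One motif of `size` bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (AAA, ATAT, ...),
            # so each tract is reported only under its shortest motif.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return hits

# Toy sequence: six AT units and five CAG units separated by unique bases.
toy = "GGC" + "AT" * 6 + "CCGT" + "CAG" * 5 + "TTA"
print(find_ssrs(toy))  # [(3, 'AT', 6), (19, 'CAG', 5)]
```

Real SSR surveys typically also handle imperfect and compound repeats, which this sketch deliberately ignores.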
Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

PubMed

Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

2016-06-24

Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.
Microsatellite marker development by partial sequencing of the sour passion fruit genome (Passiflora edulis Sims).

PubMed

Araya, Susan; Martins, Alexandre M; Junqueira, Nilton T V; Costa, Ana Maria; Faleiro, Fábio G; Ferreira, Márcio E

2017-07-21

The Passiflora genus comprises hundreds of wild and cultivated species of passion fruit used for food, industrial, ornamental and medicinal purposes. Efforts to develop genomic tools for genetic analysis of P. edulis, the most important commercial Passiflora species, are still incipient. In spite of many recognized applications of microsatellite markers in genetics and breeding, their availability for passion fruit research remains restricted. Microsatellite markers in P. edulis are usually limited in number, show reduced polymorphism, and are mostly based on compound or imperfect repeats. Furthermore, they are confined to only a few Passiflora species. We describe the use of NGS technology to partially assemble the P. edulis genome in order to develop hundreds of new microsatellite markers. A total of 14.11 Gbp of Illumina paired-end sequence reads were analyzed to detect simple sequence repeat sites in the sour passion fruit genome. A sample of 1300 contigs containing perfect repeat microsatellite sequences was selected for PCR primer development. Panels of di- and tri-nucleotide repeat markers were then tested in P. edulis germplasm accessions for validation. DNA polymorphism was detected in 74% of the markers (PIC = 0.16 to 0.77; number of alleles/locus = 2 to 7). A core panel of highly polymorphic markers (PIC = 0.46 to 0.77) was used to cross-amplify PCR products in 79 species of Passiflora (including P. edulis), belonging to four subgenera (Astrophea, Decaloba, Distephana and Passiflora). Approximately 71% of the marker/species combinations resulted in positive amplicons in all species tested. DNA polymorphism was detected in germplasm accessions of six closely related Passiflora species (P. edulis, P. alata, P. maliformis, P. nitida, P. quadrangularis and P. setacea) and the data used for accession discrimination and species assignment. A database of P. edulis DNA sequences obtained by NGS technology was examined to identify microsatellite repeats in the sour passion fruit genome. Markers were submitted to evaluation using accessions of cultivated and wild Passiflora species. The new microsatellite markers detected high levels of DNA polymorphism in sour passion fruit and can potentially be used in genetic analysis of P. edulis and other Passiflora species.
Integration of Lupinus angustifolius L. (narrow-leafed lupin) genome maps and comparative mapping within legumes.

PubMed

Wyrwa, Katarzyna; Książkiewicz, Michał; Szczepaniak, Anna; Susek, Karolina; Podkowiński, Jan; Naganowska, Barbara

2016-09-01

Narrow-leafed lupin (Lupinus angustifolius L.) has recently been considered a reference genome for the Lupinus genus. In the present work, genetic and cytogenetic maps of L. angustifolius were supplemented with 30 new molecular markers representing lupin genome regions, harboring genes involved in nitrogen fixation during the symbiotic interaction of legumes and soil bacteria (Rhizobiaceae). Our studies resulted in the precise localization of bacterial artificial chromosomes (BACs) carrying sequence variants for early nodulin 40, nodulin 26, nodulin 45, aspartate aminotransferase P2, asparagine synthetase, cytosolic glutamine synthetase, and phosphoenolpyruvate carboxylase. Together with previously mapped chromosomes, the integrated L. angustifolius map encompasses 73 chromosome markers, including 5S ribosomal DNA (rDNA) and 45S rDNA, and anchors 20 L. angustifolius linkage groups to corresponding chromosomes. Chromosomal identification using BAC fluorescence in situ hybridization identified two BAC clones as narrow-leafed lupin centromere-specific markers, which served as templates for preliminary studies of centromere composition within the genus. Bioinformatic analysis of these two BACs revealed that centromeric/pericentromeric regions of narrow-leafed lupin chromosomes consisted of simple sequence repeats ordered into tandem repeats containing the trinucleotide and pentanucleotide simple sequence repeats AGG and GATAC, structured into long arrays. Moreover, cross-genus microsynteny analysis revealed syntenic patterns of 31 single-locus BAC clones among several legume species. The gene and chromosome level findings provide evidence of ancient duplication events that must have occurred very early in the divergence of papilionoid lineages. This work provides a strong foundation for future comparative mapping among legumes and may facilitate understanding of mechanisms involved in shaping legume chromosomes.
Analysis of polyglutamine-coding repeats in the TATA-binding protein in different human populations and in patients with schizophrenia and bipolar affective disorder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rubinsztein, D.C.; Leggo, J.; Crow, T.J.

A new class of disease (including Huntington disease, Kennedy disease, and spinocerebellar ataxias types 1 and 3) results from abnormal expansions of CAG trinucleotides in the coding regions of genes. In all of these diseases the CAG repeats are thought to be translated into polyglutamine tracts. There is accumulating evidence arguing for CAG trinucleotide expansions as one of the causative disease mutations in schizophrenia and bipolar affective disorder. We and others believe that the TATA-binding protein (TBP) is an important candidate to investigate in these diseases as it contains a highly polymorphic stretch of glutamine codons, which are close to the threshold length where the polyglutamine tracts start to be associated with disease. Thus, we examined the lengths of this polyglutamine repeat in normal unrelated East Anglians, South African Blacks, sub-Saharan Africans mainly from Nigeria, and Asian Indians. We also examined 43 bipolar affective disorder patients and 65 schizophrenic patients. The range of polyglutamine tract-lengths that we found in humans was from 26-42 codons. No patients with bipolar affective disorder and schizophrenia had abnormal expansions at this locus. 22 refs., 1 tab.
Somatic mosaicism of androgen receptor CAG repeats in colorectal carcinoma epithelial cells from men.

PubMed

Di Fabio, Francesco; Alvarado, Carlos; Gologan, Adrian; Youssef, Emad; Voda, Linda; Mitmaker, Elliot; Beitel, Lenore K; Gordon, Philip H; Trifiro, Mark

2009-06-01

The X-linked human androgen receptor gene (AR) contains an exonic polymorphic trinucleotide CAG. The length of this encoded CAG tract inversely affects AR transcriptional activity. Colorectal carcinoma is known to express the androgen receptor, but data on somatic CAG repeat lengths variations in malignant and normal epithelial cells are still sporadic. Using laser capture microdissection (LCM), epithelial cells from colorectal carcinoma and normal-appearing mucosa were collected from the fresh tissue of eight consecutive male patients undergoing surgery (mean age, 70 y; range, 54-82). DNA isolated from each LCM sample underwent subsequent PCR and DNA sequencing to precisely determine AR CAG repeat lengths and the presence of microsatellite instability (MSI). Different AR CAG repeat lengths were observed in colorectal carcinoma (ranging from 0 to 36 CAG repeats), mainly in the form of multiple shorter repeat lengths. This genetic heterogeneity (somatic mosaicism) was also found in normal-appearing colorectal mucosa. Half of the carcinoma cases examined tended to have a higher number of AR CAG repeat lengths with a wider range of repeat size variation compared to normal mucosa. MSI carcinomas tended to have longer median AR CAG repeat lengths (n = 17) compared to microsatellite stable carcinomas (n = 14), although the difference was not significant (P = 0.31, Mann-Whitney test). Multiple unique somatic mutations of the AR CAG repeats occur in colorectal mucosa and in carcinoma, predominantly resulting in shorter alleles. Colorectal epithelial cells carrying AR alleles with shorter CAG repeat lengths may be more androgen-sensitive and therefore have a growth advantage.
Gene-based SSR markers for common bean (Phaseolus vulgaris L.) derived from root and leaf tissue ESTs: an integration of the BMc series.

PubMed

Blair, Matthew W; Hurtado, Natalia; Chavarro, Carolina M; Muñoz-Torres, Monica C; Giraldo, Martha C; Pedraza, Fabio; Tomkins, Jeff; Wing, Rod

2011-03-22

Sequencing of cDNA libraries for the development of expressed sequence tags (ESTs) as well as for the discovery of simple sequence repeats (SSRs) has been a common method of developing microsatellites or SSR-based markers. In this research, our objective was to further sequence and develop common bean microsatellites from leaf and root cDNA libraries derived from the Andean gene pool accession G19833 and the Mesoamerican gene pool accession DOR364, mapping parents of a commonly used reference map. The root libraries were made from high and low phosphorus treated plants. A total of 3,123 EST sequences from leaf and root cDNA libraries were screened and used for direct simple sequence repeat discovery. From these EST sequences we found 184 microsatellites; the majority containing tri-nucleotide motifs, many of which were GC rich (ACC, AGC and AGG in particular). Di-nucleotide motif microsatellites were about half as common as the tri-nucleotide motif microsatellites but most of these were AGn microsatellites with a moderate number of ATn microsatellites in root ESTs followed by few ACn and no GCn microsatellites. Out of the 184 new SSR loci, 120 new microsatellite markers were developed in the BMc (Bean Microsatellites from cDNAs) series and these were evaluated for their capacity to distinguish bean diversity in a germplasm panel of 18 genotypes. We developed a database with images of the microsatellites and their polymorphism information content (PIC), which averaged 0.310 for polymorphic markers. The present study produced information about microsatellite frequency in root and leaf tissues of two important genotypes for common bean genomics: namely G19833, the Andean genotype selected for whole genome shotgun sequencing from race Peru, and DOR364 a race Mesoamerica subgroup 2 genotype that is a small-red seeded, released variety in Central America. 
Both race Peru and Mesoamerica subgroup 2 (small red beans) have been understudied in comparison to race Nueva Granada and Mesoamerica subgroup 1 (black beans) both with regards to gene expression and as sources of markers. However, we found few differences between SSR type and frequency between the G19833 leaf and DOR364 root tissue-derived ESTs. Overall, our work adds to the analysis of microsatellite frequency evaluation for common bean and provides a new set of 120 BMc markers which combined with the 248 previously developed BMc markers brings the total in this series to 368 markers. Once we include BMd markers, which are derived from GenBank sequences, the current total of gene-based markers from our laboratory surpasses 500 markers. These markers are basic for studies of the transcriptome of common bean and can form anchor points for genetic mapping studies in the future.
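The abstract reports polymorphism information content (PIC) values without stating the formula. A commonly used definition, assumed here, is the one given by Botstein et al. (1980): PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i. The sketch below implements that definition; the allele frequencies in the usage lines are hypothetical and are not data from the study.

```python
from itertools import combinations

def pic(allele_freqs):
    """Polymorphism information content (Botstein et al., 1980 formulation)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings between identical heterozygotes.
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term

# Hypothetical markers, for illustration only.
print(pic([0.5, 0.5]))  # 0.375
print(pic([1.0]))       # 0.0
```

A marker with two equally frequent alleles reaches the two-allele maximum of 0.375, so averages near 0.3, as reported here, are typical of markers with few alleles or skewed frequencies.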
Genome Survey Sequencing of Luffa Cylindrica L. and Microsatellite High Resolution Melting (SSR-HRM) Analysis for Genetic Relationship of Luffa Genotypes

PubMed Central

An, Jianyu; Yin, Mengqi; Zhang, Qin; Gong, Dongting; Jia, Xiaowen; Guan, Yajing; Hu, Jin

2017-01-01

Luffa cylindrica (L.) Roem. is an economically important vegetable crop in China. However, the genomic information on this species is currently unknown. In this study, for the first time, a genome survey of L. cylindrica was carried out using next-generation sequencing (NGS) technology. In total, 43.40 Gb sequence data of L. cylindrica, about 54.94× coverage of the estimated genome size of 789.97 Mb, were obtained from HiSeq 2500 sequencing, in which the guanine plus cytosine (GC) content was calculated to be 37.90%. The heterozygosity of genome sequences was only 0.24%. In total, 1,913,731 contigs (>200 bp) with 525 bp N50 length and 1,410,117 scaffolds (>200 bp) with 885.01 Mb total length were obtained. From the initial assembled L. cylindrica genome, 431,234 microsatellites (SSRs) (≥5 repeats) were identified. The motif types of SSR repeats included 62.88% di-nucleotide, 31.03% tri-nucleotide, 4.59% tetra-nucleotide, 0.96% penta-nucleotide and 0.54% hexa-nucleotide. Eighty genomic SSR markers were developed, and 51/80 primers could be used in both “Zheda 23” and “Zheda 83”. Nineteen SSRs were used to investigate the genetic diversity among 32 accessions through SSR-HRM analysis. The unweighted pair group method analysis (UPGMA) dendrogram tree was built by calculating the SSR-HRM raw data. SSR-HRM could be effectively used for genotype relationship analysis of Luffa species. PMID:28891982
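The stated sequencing depth follows directly from the two figures in the abstract: coverage is total sequenced bases divided by estimated genome size. A minimal check, with illustrative variable names:

```python
sequence_data_bp = 43.40e9  # 43.40 Gb of sequence data
genome_size_bp = 789.97e6   # estimated genome size of 789.97 Mb

# Depth of coverage = total bases sequenced / genome size.
coverage = sequence_data_bp / genome_size_bp
print(f"{coverage:.2f}x")  # 54.94x
```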
Partners in crime: bidirectional transcription in unstable microsatellite disease.

PubMed

Batra, Ranjan; Charizanis, Konstantinos; Swanson, Maurice S

2010-04-15

Nearly two decades have passed since the discovery that the expansion of microsatellite trinucleotide repeats is responsible for a prominent class of neurological disorders, including Huntington disease and fragile X syndrome. These hereditary diseases are characterized by genetic anticipation or the intergenerational increase in disease severity accompanied by a decrease in age-of-onset. The revelation that the variable expansion of simple sequence repeats accounted for anticipation spawned a number of pathogenesis models and a flurry of studies designed to reveal the molecular events affected by these expansions. This work led to our current understanding that expansions in protein-coding regions result in extended homopolymeric amino acid tracts, often polyglutamine or polyQ, and deleterious protein gain-of-function effects. In contrast, expansions in noncoding regions cause RNA-mediated toxicity. However, the realization that the transcriptome is considerably more complex than previously imagined, as well as the emerging regulatory importance of antisense RNAs, has blurred this distinction. In this review, we summarize evidence for bidirectional transcription of microsatellite disease genes and discuss recent suggestions that some repeat expansions produce variable levels of both toxic RNAs and proteins that influence cell viability, disease penetrance and pathological severity.
The mismatch repair system protects against intergenerational GAA repeat instability in a Friedreich ataxia mouse model.

PubMed

Ezzatizadeh, Vahid; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Sandi, Madhavi; Al-Mahdawi, Sahar; Te Riele, Hein; Pook, Mark A

2012-04-01

Friedreich ataxia (FRDA) is an autosomal recessive neurodegenerative disorder caused by a dynamic GAA repeat expansion mutation within intron 1 of the FXN gene. Studies of mouse models for other trinucleotide repeat (TNR) disorders have revealed an important role of mismatch repair (MMR) proteins in TNR instability. To explore the potential role of MMR proteins on intergenerational GAA repeat instability in FRDA, we have analyzed the transmission of unstable GAA repeat expansions from FXN transgenic mice which have been crossed with mice that are deficient for Msh2, Msh3, Msh6 or Pms2. We find in all cases that absence of parental MMR protein not only maintains transmission of GAA expansions and contractions, but also increases GAA repeat mutability (expansions and/or contractions) in the offspring. This indicates that Msh2, Msh3, Msh6 and Pms2 proteins are not the cause of intergenerational GAA expansions or contractions, but act in their canonical MMR capacity to protect against GAA repeat instability. We further identified differential modes of action for the four MMR proteins. Thus, Msh2 and Msh3 protect against GAA repeat contractions, while Msh6 protects against both GAA repeat expansions and contractions, and Pms2 protects against GAA repeat expansions and also promotes contractions. Furthermore, we detected enhanced occupancy of Msh2 and Msh3 proteins downstream of the FXN expanded GAA repeat, suggesting a model in which Msh2/3 dimers are recruited to this region to repair mismatches that would otherwise produce intergenerational GAA contractions. These findings reveal substantial differences in the intergenerational dynamics of expanded GAA repeat sequences compared with expanded CAG/CTG repeats, where Msh2 and Msh3 are thought to actively promote repeat expansions. Copyright © 2012 Elsevier Inc. All rights reserved.

The mismatch repair system protects against intergenerational GAA repeat instability in a Friedreich ataxia mouse model

PubMed Central

Ezzatizadeh, Vahid; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Sandi, Madhavi; Al-Mahdawi, Sahar; te Riele, Hein; Pook, Mark A.

2013-01-01

Friedreich ataxia (FRDA) is an autosomal recessive neurodegenerative disorder caused by a dynamic GAA repeat expansion mutation within intron 1 of the FXN gene. Studies of mouse models for other trinucleotide repeat (TNR) disorders have revealed an important role of mismatch repair (MMR) proteins in TNR instability. To explore the potential role of MMR proteins on intergenerational GAA repeat instability in FRDA, we have analyzed the transmission of unstable GAA repeat expansions from FXN transgenic mice which have been crossed with mice that are deficient for Msh2, Msh3, Msh6 or Pms2. We find in all cases that absence of parental MMR protein not only maintains transmission of GAA expansions and contractions, but also increases GAA repeat mutability (expansions and/or contractions) in the offspring. This indicates that Msh2, Msh3, Msh6 and Pms2 proteins are not the cause of intergenerational GAA expansions or contractions, but act in their canonical MMR capacity to protect against GAA repeat instability. We further identified differential modes of action for the four MMR proteins. Thus, Msh2 and Msh3 protect against GAA repeat contractions, while Msh6 protects against both GAA repeat expansions and contractions, and Pms2 protects against GAA repeat expansions and also promotes contractions. Furthermore, we detected enhanced occupancy of Msh2 and Msh3 proteins downstream of the FXN expanded GAA repeat, suggesting a model in which Msh2/3 dimers are recruited to this region to repair mismatches that would otherwise produce intergenerational GAA contractions. These findings reveal substantial differences in the intergenerational dynamics of expanded GAA repeat sequences compared with expanded CAG/CTG repeats, where Msh2 and Msh3 are thought to actively promote repeat expansions. PMID:22289650
Short Interspersed Nuclear Element (SINE) Sequences in the Genome of the Human Pathogenic Fungus Aspergillus fumigatus Af293

PubMed Central

Kanhayuwa, Lakkhana; Coutts, Robert H. A.

2016-01-01

Novel families of short interspersed nuclear element (SINE) sequences in the human pathogenic fungus Aspergillus fumigatus, clinical isolate Af293, were identified and categorised into tRNA-related and 5S rRNA-related SINEs. Eight predicted tRNA-related SINE families originating from different tRNAs, and nominated as AfuSINE2 sequences, contained target site duplications of short direct repeat sequences (4–14 bp) flanking the elements, an extended tRNA-unrelated region and typical features of RNA polymerase III promoter sequences. The elements ranged in size from 140–493 bp and were present in low copy number in the genome and five out of eight were actively transcribed. One putative tRNAArg-derived sequence, AfuSINE2-1a possessed a unique feature of repeated trinucleotide ACT residues at its 3’-terminus. This element was similar in sequence to the I-4_AO element found in A. oryzae and an I-1_AF long nuclear interspersed element-like sequence identified in A. fumigatus Af293. Families of 5S rRNA-related SINE sequences, nominated as AfuSINE3, were also identified and their 5'-5S rRNA-related regions show 50–65% and 60–75% similarity to respectively A. fumigatus 5S rRNAs and SINE3-1_AO found in A. oryzae. A. fumigatus Af293 contains five copies of AfuSINE3 sequences ranging in size from 259–343 bp and two out of five AfuSINE3 sequences were actively transcribed. Investigations on AfuSINE distribution in the fungal genome revealed that the elements are enriched in pericentromeric and subtelomeric regions and inserted within gene-rich regions. We also demonstrated that some, but not all, AfuSINE sequences are targeted by host RNA silencing mechanisms. Finally, we demonstrated that infection of the fungus with mycoviruses had no apparent effects on SINE activity. PMID:27736869
Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States.

PubMed

Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard

2012-12-01

Simple sequence repeat (SSR) markers were developed from the Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75% of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa.

PubMed

Shahin, Arwa; van Kaauwen, Martijn; Esselink, Danny; Bargsten, Joachim W; van Tuyl, Jaap M; Visser, Richard G F; Arens, Paul

2012-11-20

Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of the lily and 39% of the tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2- to 3-fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs, lily and tulip (12,017 groups), and among the three monocot species, lily, tulip, and rice (6,900 groups), were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology.
Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies.
Evidence suggesting possible SCA1 gene involvement in schizophrenia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Diehl, S.R.; Wange, S.; Sun, C.

Several findings suggest a possible role for the SCA1 gene on chromosome 6p in some cases of schizophrenia. First, linkage analyses in Irish pedigrees provided LOD scores up to 3.0 for one model tested using microsatellites closely linked to SCA1. Reanalysis of these data using affected sibpair methods yielded a significant result (p = 0.01) for one marker. An attempt to replicate this linkage finding was made using 44 NIMH families (206 individuals, 80 affected) and 12 Utah families (120 individuals, 49 affected). LOD scores were negative in these new families, even allowing for heterogeneity, as were results using affected sibpair methods. However, one Utah family provided a LOD score of 1.3. We also screened the SCA1 trinucleotide repeat to search for expansions characteristic of this disorder in these families and in 38 additional unrelated schizophrenics. We found 1 schizophrenic with 41 repeats, which is substantially larger than the maximum size of 36 repeats observed in previous studies of several hundred controls. We are now assessing whether the distribution of SCA1 repeats differs significantly in schizophrenia versus controls. Recent reports suggest possible anticipation in schizophrenia (also characteristic of SCA1) and a few cases of psychiatric symptoms suggesting schizophrenia have been observed in the highly related disorder DRPLA (SCA2), which is also based on trinucleotide repeat expansion. These findings suggest that further investigations of this gene and chromosome region may be a priority.
MutSβ and histone deacetylase complexes promote expansions of trinucleotide repeats in human cells

PubMed Central

Gannon, Anne-Marie M.; Frizzell, Aisling; Healy, Evan; Lahue, Robert S.

2012-01-01

Trinucleotide repeat (TNR) expansions cause at least 17 heritable neurological diseases, including Huntington’s disease. Expansions are thought to arise from abnormal processing of TNR DNA by specific trans-acting proteins. For example, the DNA repair complex MutSβ (MSH2–MSH3 heterodimer) is required in mice for on-going expansions of long, disease-causing alleles. A distinctive feature of TNR expansions is a threshold effect, a narrow range of repeat units (∼30–40 in humans) at which mutation frequency rises dramatically and disease can initiate. The goal of this study was to identify factors that promote expansion of threshold-length CTG•CAG repeats in a human astrocytic cell line. siRNA knockdown of the MutSβ subunits MSH2 or MSH3 impeded expansions of threshold-length repeats, while knockdown of the MutSα subunit MSH6 had no effect. Chromatin immunoprecipitation experiments indicated that MutSβ, but not MutSα, was enriched at the TNR. These findings imply a direct role for MutSβ in promoting expansion of threshold-length CTG•CAG tracts. We identified the class II deacetylase HDAC5 as a novel promoting factor for expansions, joining the class I deacetylase HDAC3 that was previously identified. Double knockdowns were consistent with the possibility that MutSβ, HDAC3 and HDAC5 act through a common pathway to promote expansions of threshold-length TNRs. PMID:22941650
[SSR loci information analysis in transcriptome of Andrographis paniculata].

PubMed

Li, Jun-Ren; Chen, Xiu-Zhen; Tang, Xiao-Ting; He, Rui; Zhan, Ruo-Ting

2018-06-01

To study the SSR loci information and develop molecular markers, a total of 43 683 Unigenes in the transcriptome of Andrographis paniculata were used to explore SSRs. The distribution frequency of SSRs and the basic characteristics of repeat motifs were analyzed using MicroSAtellite software; SSR primers were designed with Primer 3.0 software and then validated by PCR. Moreover, the gene function analysis of SSR Unigenes was obtained by Blast. The results showed that 14 135 SSR loci were found in the transcriptome of A. paniculata, distributed across 9 973 Unigenes with a distribution frequency of 32.36%. Di-nucleotide and tri-nucleotide repeats were the main types, accounting for 75.54% of all SSRs. The repeat motifs AT/AT and CCG/CGG were the predominant di-nucleotide and tri-nucleotide repeat types, respectively. A total of 4 740 pairs of SSR primers with the potential to produce polymorphism were designed for marker development. Ten of 20 randomly picked primer pairs produced fragments of the expected molecular size. The gene functions of Unigenes containing SSRs were mostly related to the basic metabolism of A. paniculata. The SSR markers in the transcriptome of A. paniculata show rich types, strong specificity and high potential for polymorphism, which will benefit candidate gene mining and marker-assisted breeding. Copyright© by the Chinese Pharmaceutical Association.
Development of chloroplast simple sequence repeats (cpSSRs) for the intraspecific study of Gracilaria tenuistipitata (Gracilariales, Rhodophyta) from different populations

PubMed Central

2014-01-01

Background Gracilaria tenuistipitata is an agarophyte with substantial economic potential because of its high growth rate and tolerance to a wide range of environment factors. This red seaweed is intensively cultured in China for the production of agar and fodder for abalone. Microsatellite markers were developed from the chloroplast genome of G. tenuistipitata var. liui to differentiate G. tenuistipitata obtained from six different localities: four from Peninsular Malaysia, one from Thailand and one from Vietnam. Eighty G. tenuistipitata specimens were analyzed using eight simple sequence repeat (SSR) primer-pairs that we developed for polymerase chain reaction (PCR) amplification. Findings Five mononucleotide primer-pairs and one trinucleotide primer-pair exhibited monomorphic alleles, whereas the other two primer-pairs separated the G. tenuistipitata specimens into two main clades. G. tenuistipitata from Thailand and Vietnam were grouped into one clade, and the populations from Batu Laut, Middle Banks and Kuah (Malaysia) were grouped into another clade. The combined dataset of these two primer-pairs separated G. tenuistipitata obtained from Kelantan, Malaysia from that obtained from other localities. Conclusions Based on the variations in repeated nucleotides of microsatellite markers, our results suggested that the populations of G. tenuistipitata were distributed into two main geographical regions: (i) populations in the west coast of Peninsular Malaysia and (ii) populations facing the South China Sea. The correct identification of G. tenuistipitata strains with traits of high economic potential will be advantageous for the mass cultivation of seaweeds. PMID:24490797
Report on the development of putative functional SSR and SNP markers in passion fruits.

PubMed

da Costa, Zirlane Portugal; Munhoz, Carla de Freitas; Vieira, Maria Lucia Carneiro

2017-09-06

Passionflowers Passiflora edulis and Passiflora alata are diploid, outcrossing and understudied fruit bearing species. In Brazil, passion fruit cultivation began relatively recently and has earned the country an outstanding position as the world's top producer of passion fruit. The fruit's main economic value lies in the production of juice, an essential exotic ingredient in juice blends. Currently, crop improvement strategies, including those for underexploited tropical species, tend to incorporate molecular genetic approaches. In this study, we examined a set of P. edulis transcripts expressed in response to infection by Xanthomonas axonopodis, (the passion fruit's main bacterial pathogen that attacks the vines), aiming at the development of putative functional markers, i.e. SSRs (simple sequence repeats) and SNPs (single nucleotide polymorphisms). A total of 210 microsatellites were found in 998 sequences, and trinucleotide repeats were found to be the most frequent (31.4%). Of the sequences selected for designing primers, 80.9% could be used to develop SSR markers, and 60.6% SNP markers for P. alata. SNPs were all biallelic and found within 15 gene fragments of P. alata. Overall, gene fragments generated 10,003 bp. SNP frequency was estimated as one SNP every 294 bp. Polymorphism rates revealed by SSR and SNP loci were 29.4 and 53.6%, respectively. Passiflora edulis transcripts were useful for the development of putative functional markers for P. alata, suggesting a certain level of sequence conservation between these cultivated species. The markers developed herein could be used for genetic mapping purposes and also in diversity studies.
The Chromatin Remodeler Isw1 Prevents CAG Repeat Expansions During Transcription in Saccharomyces cerevisiae

PubMed Central

Koch, Melissa R.; House, Nealia C. M.; Cosetta, Casey M.; Jong, Robyn M.; Salomon, Christelle G.; Joyce, Cailin E.; Philips, Elliot A.; Su, Xiaofeng A.; Freudenreich, Catherine H.

2018-01-01

CAG/CTG trinucleotide repeats are unstable sequences that are difficult to replicate, repair, and transcribe due to their structure-forming nature. CAG repeats strongly position nucleosomes; however, little is known about the chromatin remodeling needed to prevent repeat instability. In a Saccharomyces cerevisiae model system with CAG repeats carried on a YAC, we discovered that the chromatin remodeler Isw1 is required to prevent CAG repeat expansions during transcription. CAG repeat expansions in the absence of Isw1 were dependent on both transcription-coupled repair (TCR) and base-excision repair (BER). Furthermore, isw1∆ mutants are sensitive to methyl methanesulfonate (MMS) and exhibit synergistic MMS sensitivity when combined with BER or TCR pathway mutants. We conclude that CAG expansions in the isw1∆ mutant occur during a transcription-coupled excision repair process that involves both TCR and BER pathways. We observed increased RNA polymerase II (RNAPII) occupancy at the CAG repeat when transcription of the repeat was induced, but RNAPII binding did not change in isw1∆ mutants, ruling out a role for Isw1 remodeling in RNAPII progression. However, nucleosome occupancy over a transcribed CAG tract was altered in isw1∆ mutants. Based on the known role of Isw1 in the reestablishment of nucleosomal spacing after transcription, we suggest that a defect in this function allows DNA structures to form within repetitive DNA tracts, resulting in inappropriate excision repair and repeat-length changes. These results establish a new function for Isw1 in directly maintaining the chromatin structure at the CAG repeat, thereby limiting expansions that can occur during transcription-coupled excision repair. PMID:29305386
Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra)

PubMed Central

2012-01-01

Background Chinese bayberry (Myrica rubra Sieb. and Zucc.) is a subtropical evergreen tree originating in China. It has been cultivated in southern China for several thousand years, and annual production has reached 1.1 million tons. The taste and high level of health promoting characters identified in the fruit in recent years has stimulated its extension in China and introduction to Australia. A limited number of co-dominant markers have been developed and applied in genetic diversity and identity studies. Here we report, for the first time, a survey of whole genome shotgun data to develop a large number of simple sequence repeat (SSR) markers to analyse the genetic diversity of the common cultivated Chinese bayberry and the relationship with three other Myrica species. Results The whole genome shotgun survey of Chinese bayberry produced 9.01Gb of sequence data, about 26x coverage of the estimated genome size of 323 Mb. The genome sequences were highly heterozygous, but with little duplication. From the initial assembled scaffold covering 255 Mb sequence data, 28,602 SSRs (≥5 repeats) were identified. Dinucleotide was the most common repeat motif with a frequency of 84.73%, followed by 13.78% trinucleotide, 1.34% tetranucleotide, 0.12% pentanucleotide and 0.04% hexanucleotide. From 600 primer pairs, 186 polymorphic SSRs were developed. Of these, 158 were used to screen 29 Chinese bayberry accessions and three other Myrica species: 91.14%, 89.87% and 46.84% SSRs could be used in Myrica adenophora, Myrica nana and Myrica cerifera, respectively. The UPGMA dendrogram tree showed that cultivated Myrica rubra is closely related to Myrica adenophora and Myrica nana, originating in southwest China, and very distantly related to Myrica cerifera, originating in America. These markers can be used in the construction of a linkage map and for genetic diversity studies in Myrica species. 
Conclusion Myrica rubra has a small genome of about 323 Mb with a high level of heterozygosity. A large number of SSRs were identified, and 158 polymorphic SSR markers developed, 91% of which can be transferred to other Myrica species. PMID:22621340
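The SSR-mining step summarized in the record above (perfect repeats with at least 5 copies, grouped by motif length) can be sketched in a few lines of Python. This is only an illustrative regular-expression scan, not the authors' pipeline; the function names `find_ssrs` and `class_frequencies` and the 2 to 6 bp unit range are assumptions made for the example.

```python
import re
from collections import Counter

def find_ssrs(seq, min_repeats=5, min_unit=2, max_unit=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper: thresholds mirror the ">=5 repeats" criterion only.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # One motif of `unit` bases followed by at least min_repeats-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT", "AA").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)

def class_frequencies(hits):
    """Percentage of SSRs per motif length (2 = di-, 3 = trinucleotide, ...)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in counts.items()}

# Toy sequence (made up): six AT copies and five GAA copies.
example = "GG" + "AT" * 6 + "CC" + "GAA" * 5 + "TT"
ssr_hits = find_ssrs(example)
```

On real assemblies a dedicated tool would normally be used instead; the sketch only shows how the repeat-count threshold and the per-class percentages relate to each other.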
Length Variation of Cag/Caa Trinucleotide Repeats in Natural Populations of Drosophila Melanogaster and Its Relation to the Recombination Rate

PubMed Central

Michalakis, Y.; Veuille, M.

1996-01-01

Eleven genes distributed along the Drosophila melanogaster chromosome 2 and showing exonic tandem repeats of glutamine codons (CAG or CAA) were surveyed for length variation in a sample of four European and African populations. Only one gene was monomorphic. Eight genes were polymorphic in all populations, with a total number of alleles varying between five and 12 for 120 chromosomes. The average heterozygosity per locus and population was 0.41. Selective neutrality in length variation could not be rejected under the assumptions of the infinite allele model. Significant population subdivision was found though no geographical pattern emerged, all populations being equally different. Significant linkage disequilibrium was found in four out of seven cases where the genetic distance between loci was <1 cM and was negligible when the distance was larger. There is evidence that these associations were established after the populations separated. An unexpected result was that variation at each locus was independent of the coefficient of exchange, although the latter ranged from zero to the relatively high value of 6.7%. This would indicate that background selection and selective hitchhiking, which are thought to affect levels of nucleotide substitution polymorphism, have no effect on trinucleotide repeat variation. PMID:8844158
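For readers unfamiliar with the per-locus heterozygosity statistic quoted in the record above, the following Python sketch computes gene diversity from allele counts in a sample of chromosomes. The abstract does not say which estimator was used, so the sample-size correction n/(n-1) and the function name `gene_diversity` are assumptions, and the allele counts are invented.

```python
def gene_diversity(allele_counts, unbiased=True):
    """Expected heterozygosity H = 1 - sum(p_i^2) from allele counts.

    With unbiased=True the estimate is scaled by n / (n - 1), where n is the
    number of sampled chromosomes (an assumed choice, not from the source).
    """
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    h = 1.0 - sum((c / n) ** 2 for c in allele_counts)
    return h * n / (n - 1) if unbiased else h

# Invented repeat-length allele counts for one locus in one population.
h_two_alleles = gene_diversity([5, 5], unbiased=False)
h_monomorphic = gene_diversity([10])
```

A monomorphic locus gives 0, and two equally frequent alleles give 0.5 before the sample-size correction.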
A Comparative Proteomic Analysis of the Simple Amino Acid Repeat Distributions in Plasmodia Reveals Lineage Specific Amino Acid Selection

PubMed Central

Dalby, Andrew R.

2009-01-01

Background Microsatellites have been used extensively in the field of comparative genomics. By studying microsatellites in coding regions we have a simple model of how genotypic changes undergo selection as they are directly expressed in the phenotype as altered proteins. The simplest of these tandem repeats in coding regions are the tri-nucleotide repeats which produce a repeat of a single amino acid when translated into proteins. Tri-nucleotide repeats are often disease associated, and are also known to be unstable to both expansion and contraction. This makes them sensitive markers for studying proteome evolution, in closely related species. Results The evolutionary history of the family of malarial causing parasites Plasmodia is complex because of the life-cycle of the organism, where it interacts with a number of different hosts and goes through a series of tissue specific stages. This study shows that the divergence between the primate and rodent malarial parasites has resulted in a lineage specific change in the simple amino acid repeat distribution that is correlated to A–T content. The paper also shows that this altered use of amino acids in SAARs is consistent with the repeat distributions being under selective pressure. Conclusions The study shows that simple amino acid repeat distributions can be used to group related species and to examine their phylogenetic relationships. This study also shows that an outgroup species with a similar A–T content can be distinguished based only on the amino acid usage in repeats, and suggest that this might be a useful feature for proteome clustering. The lineage specific use of amino acids in repeat regions suggests that comparative studies of SAAR distributions between proteomes gives an insight into the mechanisms of expansion and the selective pressures acting on the organism. PMID:19597555
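A simple amino acid repeat, as described in the record above, is a run of one residue in a protein sequence. The Python sketch below locates such runs with a regular expression; the minimum run length of 5, the function name `find_saars` and the toy sequence are assumptions for illustration and are not taken from the study.

```python
import re

def find_saars(protein, min_len=5):
    """Return (start, residue, run_length) for single-residue runs >= min_len."""
    # One residue followed by at least min_len-1 further copies of itself.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(protein.upper())]

# Made-up peptide with a glutamine run and an asparagine run.
saar_hits = find_saars("MKQQQQQQNNNNNLA")
```

Counting the residues returned across a whole proteome would give the kind of per-amino-acid repeat distribution that the abstract compares between lineages.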
DNA interrogation by the CRISPR RNA-guided endonuclease Cas9.

PubMed

Sternberg, Samuel H; Redding, Sy; Jinek, Martin; Greene, Eric C; Doudna, Jennifer A

2014-03-06

The clustered regularly interspaced short palindromic repeats (CRISPR)-associated enzyme Cas9 is an RNA-guided endonuclease that uses RNA-DNA base-pairing to target foreign DNA in bacteria. Cas9-guide RNA complexes are also effective genome engineering agents in animals and plants. Here we use single-molecule and bulk biochemical experiments to determine how Cas9-RNA interrogates DNA to find specific cleavage sites. We show that both binding and cleavage of DNA by Cas9-RNA require recognition of a short trinucleotide protospacer adjacent motif (PAM). Non-target DNA binding affinity scales with PAM density, and sequences fully complementary to the guide RNA but lacking a nearby PAM are ignored by Cas9-RNA. Competition assays provide evidence that DNA strand separation and RNA-DNA heteroduplex formation initiate at the PAM and proceed directionally towards the distal end of the target sequence. Furthermore, PAM interactions trigger Cas9 catalytic activity. These results reveal how Cas9 uses PAM recognition to quickly identify potential target sites while scanning large DNA molecules, and to regulate scission of double-stranded DNA.
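The role of the PAM described in the record above can be illustrated with a small scan for candidate target sites. The abstract does not state the PAM sequence, so this sketch assumes the 5'-NGG-3' PAM commonly associated with Streptococcus pyogenes Cas9 and a 20-nt protospacer immediately upstream of it; the function name `find_pam_sites` is hypothetical, and only the given strand is searched.

```python
import re

def find_pam_sites(seq, protospacer_len=20):
    """Return (pam_start, protospacer, pam) for each NGG with room upstream.

    Assumes an NGG PAM and a fixed-length protospacer 5' of the PAM.
    """
    seq = seq.upper()
    sites = []
    # Lookahead so that overlapping PAMs (e.g. within "GGG") are all reported.
    for m in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = m.start()
        if pam_start >= protospacer_len:
            protospacer = seq[pam_start - protospacer_len:pam_start]
            sites.append((pam_start, protospacer, m.group(1)))
    return sites

# Toy target: 20 A's followed by a TGG PAM.
pam_sites = find_pam_sites("A" * 20 + "TGG" + "C")
```

A sequence that matches a guide perfectly but has no adjacent NGG would yield no site here, which mirrors the observation that such sequences are ignored by Cas9-RNA.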
Single-stranded DNA-binding Protein in Vitro Eliminates the Orientation-dependent Impediment to Polymerase Passage on CAG/CTG Repeats*

PubMed Central

Delagoutte, Emmanuelle; Goellner, Geoffrey M.; Guo, Jie; Baldacci, Giuseppe; McMurray, Cynthia T.

2008-01-01

Small insertions and deletions of trinucleotide repeats (TNRs) can occur by polymerase slippage and hairpin formation on either template or newly synthesized strands during replication. Although not predicted by a slippage model, deletions occur preferentially when 5′-CTG is in the lagging strand template and are highly favored over insertion events in rapidly replicating cells. The mechanism for the deletion bias and the orientation dependence of TNR instability is poorly understood. We report here that there is an orientation-dependent impediment to polymerase progression on 5′-CAG and 5′-CTG repeats that can be relieved by the binding of single-stranded DNA-binding protein. The block depends on the primary sequence of the TNR but does not correlate with the thermodynamic stability of hairpins. The orientation-dependent block of polymerase passage is the strongest when 5′-CAG is the template. We propose a “template-push” model in which the slow speed of DNA polymerase across the 5′-CAG leading strand template creates a threat to helicase-polymerase coupling. To prevent uncoupling, the TNR template is pushed out and by-passed. Hairpins do not cause the block, but appear to occur as a consequence of polymerase pass-over. PMID:18263578
Close encounters: Moving along bumps, breaks, and bubbles on expanded trinucleotide tracts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Polyzos, Aris A.; McMurray, Cynthia T.

2017-06-09

Expansion of simple triplet repeats (TNR) underlies more than 30 severe degenerative diseases. There is a good understanding of the major pathways generating an expansion, and the associated polymerases that operate during gap filling synthesis at these “difficult to copy” sequences. However, the mechanism by which a TNR is repaired depends on the type of lesion, the structural features imposed by the lesion, the assembled replication/repair complex, and the polymerase that encounters it. The relationships among these parameters are exceptionally complex and how they direct pathway choice is poorly understood. In this review, we consider the properties of polymerases, and how encounters with GC-rich or abnormal structures might influence polymerase choice and the success of replication and repair. Insights over the last three years have highlighted new mechanisms that provide interesting choices to consider in protecting genome stability.
Trinucleotide Insertions, Deletions, and Point Mutations in Glucose Transporters Confer K+ Uptake in Saccharomyces cerevisiae

PubMed Central

Liang, Hong; Ko, Christopher H.; Herman, Todd; Gaber, Richard F.

1998-01-01

Deletion of TRK1 and TRK2 abolishes high-affinity K+ uptake in Saccharomyces cerevisiae, resulting in the inability to grow on typical synthetic growth medium unless it is supplemented with very high concentrations of potassium. Selection for spontaneous suppressors that restored growth of trk1Δ trk2Δ cells on K+-limiting medium led to the isolation of cells with unusual gain-of-function mutations in the glucose transporter genes HXT1 and HXT3 and the glucose/galactose transporter gene GAL2. 86Rb uptake assays demonstrated that the suppressor mutations conferred increased uptake of the ion. In addition to K+, the mutant hexose transporters also conferred permeation of other cations, including Na+. Because the selection strategy required such gain of function, mutations that disrupted transporter maturation or localization to the plasma membrane were avoided. Thus, the importance of specific sites in glucose transport could be independently assessed by testing for the ability of the mutant transporter to restore glucose-dependent growth to cells containing null alleles of all of the known functional glucose transporter genes. Twelve sites, most of which are conserved among eukaryotic hexose transporters, were revealed to be essential for glucose transport. Four of these have previously been shown to be essential for glucose transport by animal or plant transporters. Eight represented sites not previously known to be crucial for glucose uptake. Each suppressor mutant harbored a single mutation that altered an amino acid(s) within or immediately adjacent to a putative transmembrane domain of the transporter. Seven of 38 independent suppressor mutations consisted of in-frame insertions or deletions. The nature of the insertions and deletions revealed a striking DNA template dependency: each insertion generated a trinucleotide repeat, and each deletion involved the removal of a repeated nucleotide sequence. PMID:9447989
Rediscovering Medicinal Plants' Potential with OMICS: Microsatellite Survey in Expressed Sequence Tags of Eleven Traditional Plants with Potent Antidiabetic Properties

PubMed Central

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar

2014-01-01

Abstract Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes. PMID:24802971

Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties.

PubMed

Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

2014-05-01

Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs has not been fully utilized in large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic stock for cross transferability in these plants and the literature on biomarkers and novel drug discovery for common chronic diseases such as diabetes.
CRISPR/Cas9-Induced (CTG⋅CAG)n Repeat Instability in the Myotonic Dystrophy Type 1 Locus: Implications for Therapeutic Genome Editing.

PubMed

van Agtmaal, Ellen L; André, Laurène M; Willemse, Marieke; Cumming, Sarah A; van Kessel, Ingeborg D G; van den Broek, Walther J A A; Gourdon, Geneviève; Furling, Denis; Mouly, Vincent; Monckton, Darren G; Wansink, Derick G; Wieringa, Bé

2017-01-04

Myotonic dystrophy type 1 (DM1) is caused by (CTG⋅CAG)n-repeat expansion within the DMPK gene and thought to be mediated by a toxic RNA gain of function. Current attempts to develop therapy for this disease mainly aim at destroying or blocking abnormal properties of mutant DMPK (CUG)n RNA. Here, we explored a DNA-directed strategy and demonstrate that single clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-cleavage in either its 5' or 3' unique flank promotes uncontrollable deletion of large segments from the expanded trinucleotide repeat, rather than formation of short indels usually seen after double-strand break repair. Complete and precise excision of the repeat tract from normal and large expanded DMPK alleles in myoblasts from unaffected individuals, DM1 patients, and a DM1 mouse model could be achieved at high frequency by dual CRISPR/Cas9-cleavage at either side of the (CTG⋅CAG)n sequence. Importantly, removal of the repeat appeared to have no detrimental effects on the expression of genes in the DM1 locus. Moreover, myogenic capacity, nucleocytoplasmic distribution, and abnormal RNP-binding behavior of transcripts from the edited DMPK gene were normalized. Dual sgRNA-guided excision of the (CTG⋅CAG)n tract by CRISPR/Cas9 technology is applicable for developing isogenic cell lines for research and may provide new therapeutic opportunities for patients with DM1. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa

PubMed Central

2012-01-01

Background Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of the lily and 39% of the tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. Conclusions Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies. PMID:23167289
Roles of the nucleolus in the CAG RNA-mediated toxicity.

PubMed

Tsoi, Ho; Chan, Ho Yin Edwin

2014-06-01

The nucleolus is a subnuclear compartment within the cell nucleus that serves as the site for ribosomal RNA (rRNA) transcription and the assembly of ribosome subunits. Apart from its classical role in ribosomal biogenesis, a number of cellular regulatory roles have recently been assigned to the nucleolus, including governing the induction of apoptosis. "Nucleolar stress" is a term that is used to describe a signaling pathway through which the nucleolus communicates with other subcellular compartments, including the mitochondria, to induce apoptosis. It is an effective mechanism for eliminating cells that are incapable of performing protein synthesis efficiently due to ribosome biogenesis defects. The down-regulation of rRNA transcription is a common cause of nucleolar function disruption that subsequently triggers nucleolar stress, and has been associated with the pathogenesis of neurological disorders such as spinocerebellar ataxias (SCAs) and Huntington's disease (HD). This article discusses recent advances in mechanistic studies of how expanded CAG trinucleotide repeat RNA transcripts trigger nucleolar stress in SCAs, HD and other trinucleotide repeat disorders. This article is part of a Special Issue entitled: Role of the Nucleolus in Human Disease. Copyright © 2013 Elsevier B.V. All rights reserved.
Normal CAG and CCG repeats in the Huntington's disease genes of Parkinson's disease patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rubinsztein, D.C.; Leggo, J.; Barton, D.E.

1995-04-24

The clinical features of Parkinson's disease, particularly rigidity and bradykinesia and occasionally tremor, are seen in juvenile-onset Huntington's disease. Therefore, the CAG and CCG repeats in the Huntington's disease gene were investigated in 45 Parkinson's disease patients and compared to 40 control individuals. All of the Parkinson's disease chromosomes fell within the normal size ranges. In addition, the distributions of the two repeats in the Parkinson's disease patients did not differ significantly from those of the control population. Therefore, abnormalities of these trinucleotide repeats in the Huntington's disease gene are not likely to contribute to the pathogenesis of Parkinson's disease. 12 refs., 2 figs.
Genetic instability associated with loop or stem–loop structures within transcription units can be independent of nucleotide excision repair

PubMed Central

Burns, John A; Chowdhury, Moinuddin A; Cartularo, Laura; Berens, Christian; Scicchitano, David A

2018-01-01

Abstract Simple sequence repeats (SSRs) are found throughout the genome, and under some conditions can change in length over time. Germline and somatic expansions of trinucleotide repeats are associated with a series of severely disabling illnesses, including Huntington's disease. The underlying mechanisms that effect SSR expansions and contractions have been experimentally elusive, but models suggesting a role for DNA repair have been proposed, in particular the involvement of transcription-coupled nucleotide excision repair (TCNER) that removes transcription-blocking DNA damage from the transcribed strand of actively expressed genes. If the formation of secondary DNA structures that are associated with SSRs were to block RNA polymerase progression, TCNER could be activated, resulting in the removal of the aberrant structure and a concomitant change in the region's length. To test this, TCNER activity in primary human fibroblasts was assessed on defined DNA substrates containing extrahelical DNA loops that lack discernible internal base pairs or DNA stem–loops that contain base pairs within the stem. The results show that both structures impede transcription elongation, but there is no corresponding evidence that nucleotide excision repair (NER) or TCNER operates to remove them. PMID:29474673
Phosphorodiamidate morpholino oligomers suppress mutant huntingtin expression and attenuate neurotoxicity

PubMed Central

Sun, Xin; Marque, Leonard O.; Cordner, Zachary; Pruitt, Jennifer L.; Bhat, Manik; Li, Pan P.; Kannan, Geetha; Ladenheim, Ellen E.; Moran, Timothy H.; Margolis, Russell L.; Rudnicki, Dobrila D.

2014-01-01

Huntington's disease (HD) is a neurodegenerative disorder caused by a CAG trinucleotide repeat expansion in the huntingtin (HTT) gene. Disease pathogenesis derives, at least in part, from the long polyglutamine tract encoded by mutant HTT. Therefore, considerable effort has been dedicated to the development of therapeutic strategies that significantly reduce the expression of the mutant HTT protein. Antisense oligonucleotides (ASOs) targeted to the CAG repeat region of HTT transcripts have been of particular interest due to their potential capacity to discriminate between normal and mutant HTT transcripts. Here, we focus on phosphorodiamidate morpholino oligomers (PMOs), ASOs that are especially stable, highly soluble and non-toxic. We designed three PMOs to selectively target expanded CAG repeat tracts (CTG22, CTG25 and CTG28), and two PMOs to selectively target sequences flanking the HTT CAG repeat (HTTex1a and HTTex1b). In HD patient–derived fibroblasts with expanded alleles containing 44, 77 or 109 CAG repeats, HTTex1a and HTTex1b were effective in suppressing the expression of mutant and non-mutant transcripts. CTGn PMOs also suppressed HTT expression, with the extent of suppression and the specificity for mutant transcripts dependent on the length of the targeted CAG repeat and on the CTG repeat length and concentration of the PMO. PMO CTG25 reduced HTT-induced cytotoxicity in vitro and suppressed mutant HTT expression in vivo in the N171-82Q transgenic mouse model. Finally, CTG28 reduced mutant HTT expression and improved the phenotype of HdhQ7/Q150 knock-in HD mice. These data demonstrate the potential of PMOs as an approach to suppressing the expression of mutant HTT. PMID:25035419
Novel microsatellite markers for the oriental fruit moth Grapholita molesta (Lepidoptera: Tortricidae) and effects of null alleles on population genetics analyses.

PubMed

Song, W; Cao, L-J; Wang, Y-Z; Li, B-Y; Wei, S-J

2017-06-01

The oriental fruit moth (OFM) Grapholita molesta (Lepidoptera: Tortricidae) is an important economic pest of stone and pome fruits worldwide. We sequenced the OFM genome using next-generation sequencing and characterized the microsatellite distribution. In total, 56,674 microsatellites were identified, with 11,584 loci suitable for primer design. Twenty-seven polymorphic microsatellites, including 24 loci with trinucleotide repeat and three with pentanucleotide repeat, were validated in 95 individuals from four natural populations. The allele numbers ranged from 4 to 40, with an average value of 13.7 per locus. A high frequency of null alleles was observed in most loci developed for the OFM. Three marker panels, all of the loci, nine loci with the lowest null allele frequencies, and nine loci with the highest null allele frequencies, were established for population genetics analyses. The null allele influenced estimations of genetic diversity parameters but not the OFM's genetic structure. Both a STRUCTURE analysis and a discriminant analysis of principal components, using the three marker panels, divided the four natural populations into three groups. However, more individuals were incorrectly assigned by the STRUCTURE analysis when the marker panel with the highest null allele frequency was used compared with the other two panels. Our study provides empirical research on the effects of null alleles on population genetics analyses. The microsatellites developed will be valuable markers for genetic studies of the OFM.
Development of a EST dataset and characterization of EST-SSRs in a traditional Chinese medicinal plant, Epimedium sagittatum (Sieb. Et Zucc.) Maxim

PubMed Central

2010-01-01

Background Epimedium sagittatum (Sieb. Et Zucc.) Maxim, a traditional Chinese medicinal plant species, has been used extensively as genuine medicinal materials. Certain Epimedium species are endangered due to commercial overexploitation, while sustainable application studies, conservation genetics, systematics, and marker-assisted selection (MAS) of Epimedium are less studied due to the lack of molecular markers. Here, we report a set of expressed sequence tags (ESTs) and simple sequence repeats (SSRs) identified in these ESTs for E. sagittatum. Results cDNAs of E. sagittatum are sequenced using 454 GS-FLX pyrosequencing technology. The raw reads are cleaned and assembled into a total of 76,459 consensus sequences comprising 17,231 contigs and 59,228 singlets. About 38.5% (29,466) of the consensus sequences significantly match to the non-redundant protein database (E-value < 1e-10), 22,295 of which are further annotated using Gene Ontology (GO) terms. A total of 2,810 EST-SSRs is identified from the Epimedium EST dataset. Trinucleotide SSR is the dominant repeat type (55.2%) followed by dinucleotide (30.4%), tetranucleotide (7.3%), hexanucleotide (4.9%), and pentanucleotide (2.2%) SSR. The dominant repeat motif is AAG/CTT (23.6%) followed by AG/CT (19.3%), ACC/GGT (11.1%), AT/AT (7.5%), and AAC/GTT (5.9%). Thirty-two SSR-ESTs are randomly selected and primer pairs are synthesized for testing the transferability across 52 Epimedium species. Eighteen primer pairs (85.7%) could be successfully transferred to Epimedium species and sixteen of those show high genetic diversity with 0.35 of observed heterozygosity (Ho) and 0.65 of expected heterozygosity (He) and high number of alleles per locus (11.9). Conclusion A large EST dataset with a total of 76,459 consensus sequences is generated, aiming to provide sequence information for deciphering secondary metabolism, especially for flavonoid pathway in Epimedium. A total of 2,810 EST-SSRs is identified from EST dataset and ~1580 EST-SSR markers are transferable. E. sagittatum EST-SSR transferability to the major Epimedium germplasm is up to 85.7%. Therefore, this EST dataset and EST-SSRs will be a powerful resource for further studies such as taxonomy, molecular breeding, genetics, genomics, and secondary metabolism in Epimedium species. PMID:20141623
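As a rough illustration of how EST-SSR surveys such as the one above locate repeats, perfect microsatellites can be found with a back-referencing regular expression. The sketch below is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for illustration, and motifs are reported as found rather than grouped with their rotations or reverse complements (e.g. AAG/CTT).

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
# Real SSR-mining tools expose these as user-set parameters.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One motif of `unit` bases followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (AAA, ATAT, ...).
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AAG" * 6 + "TTC" + "AG" * 7 + "C"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Tallying `len(motif)` over the returned hits would give the kind of di-/tri-/tetranucleotide breakdown quoted in the abstract, under whatever thresholds are chosen.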
Ubiquitin-Positive Intranuclear Inclusions in Neuronal and Glial Cells in a Mouse Model of the Fragile-X Premutation

PubMed Central

Wenzel, H. Jürgen; Hunsaker, Michael R.; Greco, Claudia M.; Willemsen, Rob; Berman, Robert F.

2010-01-01

Fragile X-associated tremor/ataxia syndrome (FXTAS) is an adult-onset neurodegenerative disorder caused by CGG trinucleotide repeat expansions in the fragile X mental retardation 1 (FMR1) gene. The neuropathological hallmark of the disease is the presence of ubiquitin-positive intranuclear inclusions in neurons and in astrocytes. Ubiquitin-positive intranuclear inclusions have also been found in the neurons of a transgenic mouse model carrying an expanded CGG(98) trinucleotide repeat of human origin, but have not previously been described in glial cells. Therefore, we used immunocytochemical methods to determine the pathological features of nuclear and/or cytoplasmic inclusions in astrocytes, Bergmann glia and neurons, as well as relationships between inclusion patterns, age, and repeat length in CGG knock-in (KI) mice in comparison with wild type mice. In CGG KI mice, ubiquitin-positive intranuclear inclusions were found in neurons (e.g., pyramidal cells, GABAergic neurons) throughout the brain in cortical and subcortical brain regions; these inclusions increased in number and size with advanced age. Ubiquitin-positive intranuclear inclusions were also present in protoplasmic astrocytes, including Bergmann glia in the cerebellum. The morphology of intranuclear inclusions in CGG KI mice was compared to that of typical inclusions in human neurons and astrocytes in postmortem FXTAS brain tissue. This new finding of previously unreported pathology in astrocytes of CGG KI mice now provides an important mouse model to study astrocyte pathology in human FXTAS. PMID:20051238
DOE Office of Scientific and Technical Information (OSTI.GOV)

Andrew, S.E.; Goldberg, Y.P.; Squitieri, F.

Huntington disease (HD) is one of 7 disorders now known to be caused by expansion of a trinucleotide repeat. The HD mutation is a polymorphic trinucleotide (CAG) repeat in the 5′ region of a novel gene that expands beyond the normal range of 10-35 repeats in persons destined to develop the disease. Haplotype analysis of other dynamic mutation disorders such as myotonic dystrophy and Fragile X has suggested that a rare ancestral expansion event on a normal chromosome is followed by subsequent expansion events, resulting in a pool of chromosomes in the premutation range, which is inherently unstable and prone to further multiple expansion events leading to disease range chromosomes. Haplotype analysis of 67 HD and 84 control chromosomes using 5 polymorphic markers, both intragenic and 5′ to the disease mutation, demonstrates that multiple haplotypes underlie HD. However, 94% of the chromosomes can be grouped under two major haplotypes. These two haplotypes are also present in the normal population. A third major haplotype is seen on 38% of normal chromosomes but rarely on HD chromosomes (6%). CAG lengths on the normal chromosomes with the two haplotypes seen in the HD population are higher than those seen on the normal chromosomes with the haplotype rarely seen on HD chromosomes. Furthermore, in populations with a diminished frequency of HD, CAG length on normal chromosomes is significantly less than in other populations with higher prevalence rates for HD. These data suggest that CAG length on normal chromosomes may be a significant factor contributing to repeat instability that eventually leads to chromosomes with CAG repeat lengths in the HD range. Haplotypes on the HD chromosomes are identical to those of normal chromosomes which have CAG lengths in the high range of normal, suggesting that further expansions of this pool of chromosomes lead to chromosomes with CAG repeat sizes within the disease range, consistent with a multistep model.
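To make the notion of "CAG length" concrete, the following sketch counts the longest uninterrupted CAG tract in a sequence and compares it with the 10-35 repeat normal range quoted in the abstract above. It is an illustration only, not the sizing method used in the study (which relied on laboratory assays); the function names and the example sequence are hypothetical, and the threshold is taken from the abstract rather than from any clinical guideline.

```python
import re

# Upper bound of the normal range as stated in the abstract (10-35 repeats).
NORMAL_MAX_REPEATS = 35


def longest_cag_tract(seq):
    """Return the number of CAG units in the longest uninterrupted CAG run."""
    tracts = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(t) // 3 for t in tracts), default=0)


def exceeds_normal_range(seq, normal_max=NORMAL_MAX_REPEATS):
    """True if the longest CAG tract is longer than normal_max units."""
    return longest_cag_tract(seq) > normal_max


if __name__ == "__main__":
    # Hypothetical allele: 21 CAG units, then a CAA interruption and one more CAG.
    allele = "ATG" + "CAG" * 21 + "CAACAG" + "CCGCCA"
    print(longest_cag_tract(allele), exceeds_normal_range(allele))
```

Because two occurrences of CAG can never overlap, the non-overlapping regex scan finds every maximal tract regardless of reading frame.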
Microsatellite abundance across the Anthozoa and Hydrozoa in the phylum Cnidaria.

PubMed

Ruiz-Ramos, Dannise V; Baums, Iliana B

2014-10-27

Microsatellite loci have high mutation rates and thus are indicative of mutational processes within the genome. By concentrating on the symbiotic and aposymbiotic cnidarians, we investigated if microsatellite abundances follow a phylogenetic or ecological pattern. Individuals from eight species were shotgun sequenced using 454 GS-FLX Titanium technology. Sequences from the three available cnidarian genomes (Nematostella vectensis, Hydra magnipapillata and Acropora digitifera) were added to the analysis for a total of eleven species representing two classes, three subclasses and eight orders within the phylum Cnidaria. Trinucleotide and tetranucleotide repeats were the most abundant motifs, followed by hexa- and dinucleotides. Pentanucleotides were the least abundant motif in the data set. Hierarchical clustering and log likelihood ratio tests revealed a weak relationship between phylogeny and microsatellite content. Further, comparisons between cnidaria harboring intracellular dinoflagellates and those that do not, show microsatellite coverage is higher in the latter group. Our results support previous studies that found tri- and tetranucleotides to be the most abundant motifs in invertebrates. Differences in microsatellite coverage and composition between symbiotic and non-symbiotic cnidaria suggest the presence/absence of dinoflagellates might place restrictions on the host genome.
Comparative analysis of the 5′ genomic and promoter regions between the mouse (Hdh) and human Huntington disease (HD) gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kalchman, M.; Lin, B.; Nasir, J.

1994-09-01

The mouse homologue of the Huntington disease gene (Hdh) has recently been cloned and mapped to a region of synteny with the human, on mouse chromosome 5. The two genes share a high degree of both coding (90% amino acid) and nucleotide (86.2%) identity. We have subsequently performed a detailed comparison of the genomic organization of the 5′ region of the two genes encompassing the promoter region and first five exons of both the human and mouse genes. The comparative sequence analysis of the promoter region between HD and Hdh reveals two highly conserved regions. One region (-56 to -118, where +1 is the ATG start codon) shared 84% nucleotide identity and another region (-130 to -206) had 81% nucleotide identity. Nine putative Sp1 sites appear in the human promoter region contrasted with only 3 in a similar region in the mouse. Furthermore, 17 and 20 base pair direct repeats present in the HD 5′ region are absent in the similar Hdh region. Although both the mouse and human intron/exon boundaries conform to the GT/AG rule, the intron sizes between HD and Hdh are markedly different. The first four introns in Hdh are 15, 7, 5 and 0.5 kb compared to sizes of 10, 15, 7 and 0.5 kb, respectively. Comparison between the mouse and human intronic sequences immediately adjacent to the first five exons (excluding exon 1) reveals only about 46 to 50% identity within the first 60 bp of intronic sequence. Furthermore, we have identified novel polymorphic di-, tri- and tetra-nucleotide repeats in Hdh introns of various mouse strains that are not present in the human. For example, polymorphic CT repeats are present in introns 2 and 4 of Hdh and a novel mouse 56 AAG trinucleotide repeat (interrupted by an AAGG) is also located within intron 2. This information concerning the promoter and genomic organization of both HD and Hdh is critical for designing appropriate gene targeting vectors for studying the normal function of the HD and Hdh genes in model systems.
Sequence Discrimination by Alternatively Spliced Isoforms of a DNA Binding Zinc Finger Domain

NASA Astrophysics Data System (ADS)

Gogos, Joseph A.; Hsu, Tien; Bolton, Jesse; Kafatos, Fotis C.

1992-09-01

Two major developmentally regulated isoforms of the Drosophila chorion transcription factor CF2 differ by an extra zinc finger within the DNA binding domain. The preferred DNA binding sites were determined and are distinguished by an internal duplication of TAT in the site recognized by the isoform with the extra finger. The results are consistent with modular interactions between zinc fingers and trinucleotides and also suggest rules for recognition of AT-rich DNA sites by zinc finger proteins. The results show how modular finger interactions with trinucleotides can be used, in conjunction with alternative splicing, to alter the binding specificity and increase the spectrum of sites recognized by a DNA binding domain. Thus, CF2 may potentially regulate distinct sets of target genes during development.
Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)

PubMed Central

Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping

2014-01-01

Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid was increasing continuously with a parallel trend to the amount of tung oil accumulation while the ratios of other fatty acids were decreasing in different stages of the seeds and that α-eleostearic acid (18∶3) consisted of 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. Genbank database search identified 10 of them putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. Dendrogram revealed that most of the 41 accessions were clustered according to the geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054
Neuropathological Comparison of Adult Onset and Juvenile Huntington's Disease with Cerebellar Atrophy: A Report of a Father and Son.

PubMed

Latimer, Caitlin S; Flanagan, Margaret E; Cimino, Patrick J; Jayadev, Suman; Davis, Marie; Hoffer, Zachary S; Montine, Thomas J; Gonzalez-Cuyar, Luis F; Bird, Thomas D; Keene, C Dirk

2017-01-01

Huntington's disease (HD) is an autosomal dominant neurodegenerative disease caused by a trinucleotide (CAG) repeat expansion in huntingtin (HTT) on chromosome 4. Anticipation can cause longer repeat expansions in children of HD patients. Juvenile Huntington's disease (JHD), defined as HD arising before age 20, accounts for 5-10% of HD cases, with cases arising in the first decade accounting for approximately 1%. Clinically, JHD differs from the predominately choreiform adult onset Huntington's disease (AOHD) with variable presentations, including symptoms such as myoclonus, seizures, Parkinsonism, and cognitive decline. The neuropathologic changes of AOHD are well characterized, but there are fewer reports that describe the neuropathology of JHD. Here we report a case of a six-year-old boy with paternally-inherited JHD caused by 169 CAG trinucleotide repeats who presented at age four with developmental delay, dysarthria, and seizures before dying at age 6. The boy's clinical presentation and neuropathological findings are directly compared to those of his father, who presented with AOHD and 54 repeats. A full autopsy was performed for the JHD case and a brain-only autopsy was performed for the AOHD case. Histochemically- and immunohistochemically-stained slides were prepared from formalin-fixed, paraffin-embedded tissue sections. Both cases had neuropathology corresponding to Vonsattel grade 3. The boy also had cerebellar atrophy with huntingtin-positive inclusions in the cerebellum, findings not present in the father. Autopsies of father and son provide a unique opportunity to compare and contrast the neuropathologic findings of juvenile and adult onset HD while also providing the first immunohistochemical evidence of cerebellar involvement in JHD. Additionally this is the first known report to include findings from peripheral tissue in a case of JHD.
Fragile X-Associated Tremor/Ataxia Syndrome (FXTAS) Motor Dysfunction Modeled in Mice.

PubMed

Foote, Molly; Arque, Gloria; Berman, Robert F; Santos, Mónica

2016-10-01

Fragile X-associated tremor/ataxia syndrome (FXTAS) is a late-onset neurodegenerative disorder that affects some carriers of the fragile X premutation (PM). In PM carriers, there is a moderate expansion of a CGG trinucleotide sequence (55-200 repeats) in the fragile X gene (FMR1) leading to increased FMR1 mRNA and small to moderate decreases in the fragile X mental retardation protein (FMRP) expression. The key symptoms of FXTAS include cerebellar gait ataxia, kinetic tremor, sensorimotor deficits, neuropsychiatric changes, and dementia. While the specific trigger(s) that causes PM carriers to progress to FXTAS pathogenesis remains elusive, the use of animal models has shed light on the underlying neurobiology of the altered pathways involved in disease development. In this review, we examine the current use of mouse models to study PM and FXTAS, focusing on recent advances in the field. Specifically, we will discuss the construct, face, and predictive validities of these PM mouse models, the insights into the underlying disease mechanisms, and potential treatments.
Development and evaluation of microsatellite markers for Acer miyabei (Sapindaceae), a threatened maple species in East Asia.

PubMed

Saeki, Ikuyo; Hirao, Akira S; Kenta, Tanaka

2015-06-01

Twelve microsatellite markers were developed and characterized in a threatened maple species, Acer miyabei (Sapindaceae), for use in population genetic analyses. Using Ion Personal Genome Machine (PGM) sequencing, we developed microsatellite markers with perfect di- and trinucleotide repeats. These markers were tested on a total of 44 individuals from two natural populations of A. miyabei subsp. miyabei f. miyabei in Hokkaido Island, Japan. The number of alleles per locus ranged from two to eight. The observed and expected heterozygosities per locus ranged from 0.05 to 0.75 and from 0.05 to 0.79, respectively. Some of the markers were successfully transferred to the closely related species A. campestre, A. platanoides, and A. pictum. The developed markers will be useful in characterizing the genetic structure and diversity of A. miyabei and will help to understand its spatial genetic variation, levels of inbreeding, and patterns of gene flow, thereby providing a basis for conservation.
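The observed and expected heterozygosities quoted above can be illustrated with a minimal sketch. It assumes the plain (uncorrected) estimator He = 1 - sum(p_i^2); the abstract does not say which estimator the authors used. The genotype data and the function name `heterozygosity` are hypothetical.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (Ho, He) for one locus from a list of diploid genotypes (allele pairs)."""
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies over all 2n gene copies
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    # Expected heterozygosity under random mating (uncorrected estimator, an assumption)
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return ho, he

# Hypothetical allele sizes (bp) for four individuals at one locus
sample = [(150, 150), (150, 153), (153, 156), (150, 153)]
ho, he = heterozygosity(sample)
print(ho, he)
```

A small-sample correction (multiplying He by 2n/(2n-1)) is often applied in practice; it is omitted here to keep the sketch short.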
Expansion of 50 CAG/CTG repeats excluded in schizophrenia by application of a highly efficient approach using repeat expansion detection and a PCR screening set

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowen, T.; Guy, C.; Speight, G.

Studies of the transmission of schizophrenia in families with affected members in several generations have suggested that an expanded trinucleotide repeat mechanism may contribute to the genetic inheritance of this disorder. Using repeat expansion detection (RED), we and others have previously found that the distribution of CAG/CTG repeat size is larger in patients with schizophrenia than in controls. In an attempt to identify the specific expanded CAG/CTG locus or loci associated with schizophrenia, we have now used an approach based on a CAG/CTG PCR screening set combined with RED data. This has allowed us to minimize genotyping while excluding 43 polymorphic autosomal loci and 7 X-chromosomal loci from the screening set as candidates for expansion in schizophrenia with a very high degree of confidence. 18 refs., 1 tab.
Isolation and characterization of microsatellite loci in the common milkweed, Asclepias syriaca (Apocynaceae).

PubMed

Kabat, Susan M; Dick, Christopher W; Hunter, Mark D

2010-05-01

Microsatellite primers were developed for the common milkweed, Asclepias syriaca L., to assist in genet identification and the analysis of spatial genetic structure. Using an enrichment cloning protocol, eight microsatellite loci were isolated and characterized in a Michigan population of A. syriaca. The primers amplified di- and trinucleotide repeats with 4-13 alleles per locus. The primers will be useful for studies of clonality and gene flow in natural populations.

Microsatellites in the Genome of the Edible Mushroom, Volvariella volvacea

PubMed Central

Chen, Mingjie; Wang, Hong; Bao, Dapeng

2014-01-01

Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes. PMID:24575404
Microsatellites in the genome of the edible mushroom, Volvariella volvacea.

PubMed

Wang, Ying; Chen, Mingjie; Wang, Hong; Wang, Jing-Fang; Bao, Dapeng

2014-01-01

Using bioinformatics software and database, we have characterized the microsatellite pattern in the V. volvacea genome and compared it with microsatellite patterns found in the genomes of four other edible fungi: Coprinopsis cinerea, Schizophyllum commune, Agaricus bisporus, and Pleurotus ostreatus. A total of 1346 microsatellites have been identified, with mono-nucleotides being the most frequent motif. The relative abundance of microsatellites was lower in coding regions with 21 No./Mb. However, the microsatellites in the V. volvacea gene models showed a greater tendency to be located in the CDS regions. There was also a higher preponderance of trinucleotide repeats, especially in the kinase genes, which implied a possible role in phenotypic variation. Among the five fungal genomes, microsatellite abundance appeared to be unrelated to genome size. Furthermore, the short motifs (mono- to tri-nucleotides) outnumbered other categories although these differed in proportion. Data analysis indicated a possible relationship between the most frequent microsatellite types and the genetic distance between the five fungal genomes.
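Genome-wide microsatellite surveys of this kind typically scan the sequence for short motifs repeated in tandem. The sketch below shows one simple way to do that with regular expressions. The minimum repeat thresholds and the function name `find_ssrs` are assumptions for illustration, not the settings or software used in the study.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect mono- to trinucleotide repeats."""
    if min_repeats is None:
        # Assumed thresholds: motif length -> minimum number of tandem copies
        min_repeats = {1: 10, 2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for k, m in sorted(min_repeats.items()):
        # A k-mer followed by at least m-1 further copies of itself
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, m - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "CCC")
            if k > 1 and any(motif == motif[:d] * (k // d)
                             for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(1)) // k))
    return sorted(hits)

# Hypothetical test sequence containing (CAG)6, (AT)7 and (A)12
demo = "TT" + "CAG" * 6 + "CC" + "AT" * 7 + "G" + "A" * 12
print(find_ssrs(demo))
```

Because the scan reports the leftmost match, the same repeat can be reported under a rotated motif (for example GCA instead of CAG) depending on the flanking bases; dedicated tools usually normalise motifs to a canonical rotation.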
Association of premenstrual/menstrual symptoms with perinatal depression and a polymorphic repeat in the polyglutamine tract of the retinoic acid induced 1 gene.

PubMed

Tan, Ene-Choo; Tan, Hui-San; Chua, Tze-Ern; Lee, Theresa; Ng, Jasmine; Ch'ng, Ying-Chia; Choo, Chih-Huei; Chen, Helen Y

2014-06-01

Depression during pregnancy or after childbirth is the most frequent perinatal illness affecting women. We investigated the length distribution of a trinucleotide repeat in RAI1, which has not been studied in perinatal depression or in the Chinese population. Cases (n=139) with confirmed diagnosis of clinical (major) depression related to pregnancy/postpartum were recruited from the outpatient clinic. Controls were patients who came to the obstetrics clinics and scored <7 on the Edinburgh Postnatal Depression Scale (EPDS) (n=540). Saliva samples for DNA analysis, demographic information and self-reported frequency of occurrence of various premenstrual/menstrual symptoms were collected from all participants. Genomic DNA was extracted from saliva and relevant region sequenced to determine the number of CAG/CAA repeats that encodes the polyglutamine tract in the N terminal of the protein. Difference between groups was assessed by chi-square analysis for categorical variables and analysis of variance for quantitative scores. Compared to control subjects, patients with perinatal depression reported more frequent mood changes, cramps, nausea, vomiting, diarrhoea, and headache during premenstrual/menstrual periods (p=0.000). For the RAI1 gene CAG/CAA repeat, there was a statistically significant difference in the genotypic distribution between cases and controls (p=0.031). There was also a statistically significant association between the 14-repeat allele and perinatal depression (p=0.016). Family history, previous mental illness, and physical and psychological symptoms during the premenstrual/menstrual periods were self-reported. EPDS screening was done only once for controls. The RAI1 gene polyglutamine repeat has a different distribution in our population. The 14-repeat allele is associated with perinatal depression and more frequent experience of physical and psychological symptoms during menstrual period. Copyright © 2014 Elsevier B.V. All rights reserved.
Low abundance of microsatellite repeats in the genome of the Brown-headed Cowbird (Molothrus ater)

USGS Publications Warehouse

Longmire, Jonathan L.; Hahn, D.C.; Roach, J.L.

1999-01-01

A cosmid library made from brown-headed cowbird (Molothrus ater) DNA was examined for representation of 17 distinct microsatellite motifs including all possible mono-, di-, and trinucleotide microsatellites, and the tetranucleotide repeat (GATA)n. The overall density of microsatellites within cowbird DNA was found to be one repeat per 89 kb and the frequency of the most abundant motif, (AGC)n, was once every 382 kb. The abundance of microsatellites within the cowbird genome is estimated to be reduced approximately 15-fold compared to humans. The reduced frequency of microsatellites seen in this study is consistent with previous observations indicating reduced numbers of microsatellites and other interspersed repeats in avian DNA. In addition to providing new information concerning the abundance of microsatellites within an avian genome, these results provide useful insights for selecting cloning strategies that might be used in the development of locus-specific microsatellite markers for avian studies.
Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

PubMed

Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

2015-10-19

Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
Characterization and Comparative Analysis of the Complete Chloroplast Genome of the Critically Endangered Species Streptocarpus teitensis (Gesneriaceae).

PubMed

Kyalo, Cornelius M; Gichira, Andrew W; Li, Zhi-Zhong; Saina, Josphat K; Malombe, Itambo; Hu, Guang-Wan; Wang, Qing-Feng

2018-01-01

Streptocarpus teitensis (Gesneriaceae) is an endemic species listed as critically endangered in the International Union for Conservation of Nature (IUCN) red list of threatened species. However, sequence and genome information for this species remains limited. In this article, we present the complete chloroplast genome structure of Streptocarpus teitensis and its evolution inferred through comparative studies with other related species. S. teitensis displayed a chloroplast genome size of 153,207 bp, comprising a pair of inverted repeats (IR) of 25,402 bp each, separated by small and large single-copy (SSC and LSC) regions of 18,300 and 84,103 bp, respectively. The chloroplast genome was observed to contain 116 unique genes, of which 80 are protein-coding, 32 are transfer RNAs, and four are ribosomal RNAs. In addition, a total of 196 SSR markers were detected in the chloroplast genome of Streptocarpus teitensis, with mononucleotides (57.1%) being the majority, followed by trinucleotides (33.2%), dinucleotides and tetranucleotides (both 4.1%), and pentanucleotides being the least frequent (1.5%). Genome alignment indicated that this genome was comparable to other sequenced members of order Lamiales. The phylogenetic analysis suggested that Streptocarpus teitensis is closely related to Lysionotus pauciflorus and Dorcoceras hygrometricum.
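The region sizes in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome is the sum of the LSC, the SSC and two IR copies. The short sketch below does that check and also derives the SSR counts implied by the rounded percentages. Those counts are computed here and are not stated in the abstract; the variable and function names are illustrative.

```python
# Region sizes (bp) as reported in the abstract
LSC, SSC, IR = 84_103, 18_300, 25_402

def quadripartite_length(lsc, ssc, ir):
    """Total length of a plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir

SSR_TOTAL = 196
SSR_PERCENT = {"mono": 57.1, "tri": 33.2, "di": 4.1, "tetra": 4.1, "penta": 1.5}

# Counts implied by the rounded percentages (derived here, not reported by the authors)
implied_counts = {k: round(SSR_TOTAL * p / 100) for k, p in SSR_PERCENT.items()}

print(quadripartite_length(LSC, SSC, IR))  # matches the reported genome size of 153,207 bp
print(implied_counts, sum(implied_counts.values()))
```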
Salivary testosterone and a trinucleotide (CAG) length polymorphism in the androgen receptor gene predict amygdala reactivity in men.

PubMed

Manuck, Stephen B; Marsland, Anna L; Flory, Janine D; Gorka, Adam; Ferrell, Robert E; Hariri, Ahmad R

2010-01-01

In studies employing functional magnetic resonance imaging (fMRI), reactivity of the amygdala to threat-related sensory cues (viz., facial displays of negative emotion) has been found to correlate positively with interindividual variability in testosterone levels of women and young men and to increase on acute administration of exogenous testosterone. Many of the biological actions of testosterone are mediated by intracellular androgen receptors (ARs), which exert transcriptional control of androgen-dependent genes and are expressed in various regions of the brain, including the amygdala. Transactivation potential of the AR decreases (yielding relative androgen insensitivity) with expansion of a polyglutamine stretch in the N-terminal domain of the AR protein, as encoded by a trinucleotide (CAG) repeat polymorphism in exon 1 of the X-chromosome AR gene. Here we examined whether amygdala reactivity to threat-related facial expressions (fear, anger) differs as a function of AR CAG length variation and endogenous (salivary) testosterone in a mid-life sample of 41 healthy men (mean age=45.6 years, range: 34-54 years; CAG repeats, range: 19-29). Testosterone correlated inversely with participant age (r=-0.39, p=0.012) and positively with number of CAG repeats (r=0.45, p=0.003). In partial correlations adjusted for testosterone level, reactivity in the ventral amygdala was lowest among men with the largest number of CAG repeats. This inverse association was seen in both the right (r(p)=-0.34, p<0.05) and left (r(p)=-0.32, p<0.05) hemisphere. Activation of the dorsal amygdala correlated positively with individual differences in salivary testosterone, also in the right (r=0.40, p<0.02) and left (r=0.32, p<0.05) hemisphere, but was not affected by number of CAG repeats. Hence, androgenic influences on threat-related reactivity in the ventral amygdala may be moderated partially by CAG length variation in the AR gene. Because individual differences in salivary testosterone also predicted dorsal amygdala reactivity and did so independently of CAG repeats, it is suggested that androgenic influences within this anatomically distinct region may be mediated, in part, by non-genomic or AR-independent mechanisms.
Structural analysis of the 5′ region of mouse and human Huntington disease genes reveals conservation of putative promoter region and di- and trinucleotide polymorphisms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Biaoyang; Nasir, J.; Kalchman, M.A.

1995-02-10

We have previously cloned and characterized the murine homologue of the Huntington disease (HD) gene and shown that it maps to mouse chromosome 5 within a region of conserved synteny with human chromosome 4p16.3. Here we present a detailed comparison of the sequence of the putative promoter and the organization of the 5′ genomic region of the murine (Hdh) and human HD genes encompassing the first five exons. We show that in this region these two genes share identical exon boundaries, but have different-size introns. Two dinucleotide (CT) and one trinucleotide intronic polymorphism in Hdh and an intronic CA polymorphism in the HD gene were identified. Comparison of 940-bp sequence 5′ to the putative translation start site reveals a highly conserved region (78.8% nucleotide identity) between Hdh and the HD gene from nucleotide -56 to -206 (of Hdh). Neither Hdh nor the HD gene has typical TATA or CCAAT elements, but both show one putative AP2 binding site and numerous potential Sp1 binding sites. The high sequence identity between Hdh and the HD gene for approximately 200 bp 5′ to the putative translation start site indicates that these sequences may play a role in regulating expression of the Huntington disease gene. 30 refs., 4 figs., 2 tabs.
Structural studies of CNG repeats

PubMed Central

Kiliszek, Agnieszka; Rypniewski, Wojciech

2014-01-01

CNG repeats (where N denotes one of the four natural nucleotides) are abundant in the human genome. Their tendency to undergo expansion can lead to hereditary diseases known as TREDs (trinucleotide repeat expansion disorders). The toxic factor can be protein, if the abnormal gene is expressed, or the gene transcript, or both. The gene transcripts have attracted much attention in the biomedical community, but their molecular structures have only recently been investigated. Model RNA molecules comprising CNG repeats fold into long hairpins whose stems generally conform to an A-type helix, in which the non-canonical N-N pairs are flanked by C-G and G-C pairs. Each homobasic pair is accommodated in the helical context in a unique manner, with consequences for the local helical parameters, solvent structure, electrostatic potential and potential to interact with ligands. The detailed three-dimensional profiles of RNA CNG repeats can be used in screening of compound libraries for potential therapeutics and in structure-based drug design. Here is a brief survey of the CNG structures published to date. PMID:24939898
Metformin ameliorates core deficits in a mouse model of fragile X syndrome.

PubMed

Gantois, Ilse; Khoutorsky, Arkady; Popic, Jelena; Aguilar-Valles, Argel; Freemantle, Erika; Cao, Ruifeng; Sharma, Vijendra; Pooters, Tine; Nagpal, Anmol; Skalecka, Agnieszka; Truong, Vinh T; Wiebe, Shane; Groves, Isabelle A; Jafarnejad, Seyed Mehdi; Chapat, Clément; McCullagh, Elizabeth A; Gamache, Karine; Nader, Karim; Lacaille, Jean-Claude; Gkogkas, Christos G; Sonenberg, Nahum

2017-06-01

Fragile X syndrome (FXS) is the leading monogenic cause of autism spectrum disorders (ASD). Trinucleotide repeat expansions in FMR1 abolish FMRP expression, leading to hyperactivation of ERK and mTOR signaling upstream of mRNA translation. Here we show that metformin, the most widely used drug for type 2 diabetes, rescues core phenotypes in Fmr1 -/y mice and selectively normalizes ERK signaling, eIF4E phosphorylation and the expression of MMP-9. Thus, metformin is a potential FXS therapeutic.
An improved assay for the determination of Huntington's disease allele size

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reeves, C.; Klinger, K.; Miller, G.

1994-09-01

The hallmark of Huntington's disease (HD) is the expansion of a polymorphic (CAG)n repeat. Several methods have been published describing PCR amplification of this region. Most of these assays require a complex PCR reaction mixture to amplify this GC-rich region. A consistent problem with trinucleotide repeat PCR amplification is the presence of a number of "stutter bands", which may be caused by primer or amplicon slippage during amplification or insufficient polymerase processivity. Most assays for HD arbitrarily select a particular band for diagnostic purposes. Without a clear choice for band selection, such an arbitrary selection may result in inconsistent intra- or inter-laboratory findings. We present an improved protocol for the amplification of the HD trinucleotide repeat region. This method simplifies the PCR reaction buffer and results in a set of easily identifiable bands from which to determine allele size. HD alleles were identified by selecting bands of clearly greater signal intensity. Stutter banding was much reduced, thus permitting easy identification of the most relevant PCR product. A second set of primers internal to the CCG polymorphism was used in selected samples to confirm allele size. The mechanism of action of N,N,N-trimethylglycine in the PCR reaction is not clear. It may be possible that the minimal isostabilizing effect of N,N,N-trimethylglycine at 2.5 M is significant enough to affect primer specificity. The use of N,N,N-trimethylglycine in the PCR reaction facilitated identification of HD alleles and may be appropriate for use in other assays of this type.
A Robust and Versatile Method of Combinatorial Chemical Synthesis of Gene Libraries via Hierarchical Assembly of Partially Randomized Modules

PubMed Central

Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried

2015-01-01

A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis. PMID:26355961
Screening of repetitive motifs inside the genome of the flat oyster (Ostrea edulis): Transposable elements and short tandem repeats.

PubMed

Vera, Manuel; Bello, Xabier; Álvarez-Dios, Jose-Antonio; Pardo, Belen G; Sánchez, Laura; Carlsson, Jens; Carlsson, Jeanette E L; Bartolomé, Carolina; Maside, Xulio; Martinez, Paulino

2015-12-01

The flat oyster (Ostrea edulis) is one of the most appreciated molluscs in Europe, but its production has been greatly reduced by the parasite Bonamia ostreae. Here, new generation genomic resources were used to analyse the repetitive fraction of the oyster genome, with the aim of developing molecular markers to face this main oyster production challenge. The resulting oyster database consists of two sets of 10,318 and 7159 unique contigs (4.8 Mbp and 6.8 Mbp in total length) representing the oyster's genome (WG) and haemocyte transcriptome (HT), respectively. A total of 1083 sequences were identified as TE-derived, which corresponded to 4.0% of WG and 1.1% of HT. They were clustered into 142 homology groups, most of which were assigned to the Penelope order of retrotransposons, and to the Helitron and TIR DNA-transposons. Simple repeats and rRNA pseudogenes also made a significant contribution to the oyster's genome (0.5% and 0.3% of WG and HT, respectively). The most frequent short tandem repeats identified in WG were tetranucleotide motifs, while trinucleotide motifs were the most frequent in HT. Forty identified microsatellite loci, 20 from each database, were selected for technical validation. Success was much lower among WG than HT microsatellites (15% vs 55%), which could reflect higher variation in anonymous regions interfering with primer annealing. All microsatellites developed adjusted to Hardy-Weinberg proportions and represent a useful tool to support future breeding programmes and to manage genetic resources of natural flat oyster beds. Copyright © 2015 Elsevier B.V. All rights reserved.
Characterization and Transferable Utility of Microsatellite Markers in the Wild and Cultivated Arachis Species.

PubMed

Huang, Li; Wu, Bei; Zhao, Jiaojiao; Li, Haitao; Chen, Weigang; Zheng, Yanli; Ren, Xiaoping; Chen, Yuning; Zhou, Xiaojing; Lei, Yong; Liao, Boshou; Jiang, Huifang

2016-01-01

Microsatellites, or simple sequence repeats (SSRs), are among the most widely distributed molecular markers and have been widely used to assess genetic diversity and to map important traits in plants. However, the understanding of microsatellite characteristics in Arachis species and the currently available number of high-quality SSR markers remain limited. In this study, we identified 16,435 genome survey sequence SSRs (GSS-SSRs) and 40,199 expressed sequence tag SSRs (EST-SSRs) in Arachis hypogaea and its wild relative species using the publicly available sequence data. The GSS-SSRs had a density of 159.9-239.8 SSRs/Mb for wild Arachis and 1,015.8 SSRs/Mb for cultivated Arachis, whereas the EST-SSRs had a density of 173.5-384.4 SSRs/Mb and 250.9 SSRs/Mb for wild and cultivated Arachis, respectively. The trinucleotide SSRs were predominant across Arachis species, except that dinucleotide SSRs were the most common in A. hypogaea GSSs. From Arachis GSS-SSR and EST-SSR sequences, we developed 2,589 novel SSR markers that showed high polymorphism in six diverse A. hypogaea accessions. A genetic linkage map that contained 540 novel SSR loci and 105 anchor SSR loci was constructed using an F6 recombinant inbred line population. A subset of 82 randomly selected SSR markers was used to screen 39 wild and 22 cultivated Arachis accessions, which revealed a high transferability of the novel SSRs across Arachis species. Our results provided informative clues to investigate microsatellite patterns across A. hypogaea and its wild relative species and potentially facilitate the germplasm evaluation and gene mapping in Arachis species.
Tics as an initial manifestation of juvenile Huntington's disease: case report and literature review.

PubMed

Cui, Shi-Shuang; Ren, Ru-Jing; Wang, Ying; Wang, Gang; Chen, Sheng-Di

2017-08-08

Huntington's disease (HD) is an autosomal dominant disorder, typically characterized by chorea due to a trinucleotide repeat expansion in the HTT gene, although the clinical manifestations of patients with juvenile HD (JHD) are atypical. A 17-year-old boy with initial presentation of tics attended our clinic, and his DNA analysis demonstrated a mutation in the HTT gene (49 CAG repeats). After treatment, his symptoms improved. Furthermore, we performed a literature review by searching the databases and summarized clinical features in 33 JHD patients. The most prevalent symptom was ataxia, and two cases reported tics as the initial and prominent manifestation of JHD. Among them, 88% of patients carried more than 60 CAG repeats and most had a family history. This case illustrates the variable range of clinical symptoms of JHD and the necessity of testing for the HD mutation in young patients with tics whose symptoms cannot be explained by Tourette's syndrome (TS).
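Repeat counts such as the 49 CAG repeats reported above are, conceptually, the number of motif copies in the longest uninterrupted tandem run. The sketch below counts that on a sequence string. The function name `longest_repeat_run` and the flanking sequence in the example are hypothetical, and clinical sizing follows laboratory conventions that this sketch does not attempt to reproduce.

```python
import re

def longest_repeat_run(seq, motif="CAG"):
    """Return the number of motif copies in the longest uninterrupted tandem run."""
    # Each match is one maximal run of back-to-back motif copies
    runs = re.findall(r"(?:%s)+" % re.escape(motif), seq.upper())
    return max((len(r) // len(motif) for r in runs), default=0)

# Made-up example: 49 pure CAG copies followed by an interrupted tail
allele = "ATG" + "CAG" * 49 + "CAACAG" + "CCGCCA"
print(longest_repeat_run(allele))
```

Counting only the pure run means an interrupting triplet (CAA in the example) ends the tract, so the trailing CAG is not added to the total.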
Linkage disequilibrium at the SCA2 locus

PubMed Central

Didierjean, O.; Cancel, G.; Stevanin, G.; Durr, A.; Burk, K.; Benomar, A.; Lezin, A.; Belal, S.; Abada-Bendid, M.; Klockgether, T.; Brice, A.

1999-01-01

Spinocerebellar ataxia type 2 (SCA2) is caused by the expansion of an unstable CAG repeat encoding a polyglutamine tract. Repeats with 32 to 200 CAGs are associated with the disease, whereas normal chromosomes contain 13 to 33 repeats. We tested 220 families of different geographical origins for the SCA2 mutation. Thirty-three were positive (15%). Twenty-three families with at least two affected subjects were tested for linkage disequilibrium (LD) between the SCA2 mutation and three microsatellite markers, two of which (D12S1332-D12S1333) closely flanked the mutation; the other (D12S1672) was intragenic. Many different haplotypes were observed, indicating the occurrence of several ancestral mutations. However, the same haplotype, not observed in controls, was detected in the German, the Serbian, and some of the French families, suggesting a founder effect or recurrent mutations on an at-risk haplotype. Keywords: linkage disequilibrium; SCA2; trinucleotide repeat expansion; founder effect PMID:10353790
Induced Pluripotent Stem Cells from Patients with Huntington’s Disease Show CAG Repeat Expansion Associated Phenotypes

PubMed Central

Mattis, Virginia B; Svendsen, Soshana P; Ebert, Allison; Svendsen, Clive N; King, Alvin R; Casale, Malcolm; Winokur, Sara T; Batugedara, Gayani; Vawter, Marquis; Donovan, Peter J; Lock, Leslie F; Thompson, Leslie M; Zhu, Yu; Fossale, Elisa; Singh Atwal, Ranjit; Gillis, Tammy; Mysore, Jayalakshmi; Li, Jian-hong; Seong, IhnSik; Shen, Yiping; Chen, Xiaoli; Wheeler, Vanessa C; MacDonald, Marcy E; Gusella, James F; Akimov, Sergey; Arbez, Nicolas; Juopperi, Tarja; Ratovitski, Tamara; Chiang, Jason H; Kim, Woon Roung; Chighladze, Eka; Watkin, Erin; Zhong, Chun; Makri, Georgia; Cole, Robert N; Margolis, Russell L; Song, Hongjun; Ming, Guoli; Ross, Christopher A; Kaye, Julia A; Daub, Aaron; Sharma, Punita; Mason, Amanda R; Finkbeiner, Steven; Yu, Junying; Thomson, James A; Rushton, David; Brazier, Stephen P; Battersby, Alysia A; Redfern, Amanda; Tseng, Hsui-Er; Harrison, Alexander W; Kemp, Paul J; Allen, Nicholas D; Onorati, Marco; Castiglioni, Valentina; Cattaneo, Elena; Arjomand, Jamshid

2013-01-01

Huntington's disease (HD) is an inherited neurodegenerative disorder caused by an expanded stretch of CAG trinucleotide repeats that results in neuronal dysfunction and death. Here, the HD consortium reports the generation and characterization of 14 induced pluripotent stem cell (iPSC) lines from HD patients and controls. Microarray profiling revealed CAG expansion-associated gene expression patterns that distinguish patient lines from controls, and early onset versus late onset HD. Differentiated HD neural cells showed disease associated changes in electrophysiology, metabolism, cell adhesion, and ultimately cell death for lines with both medium and longer CAG repeat expansions. The longer repeat lines were however the most vulnerable to cellular stressors and BDNF withdrawal using a range of assays across consortium laboratories. The HD iPSC collection represents a unique and well-characterized resource to elucidate disease mechanisms in HD and provides a novel human stem cell platform for screening new candidate therapeutics. PMID:22748968
n-Nucleotide circular codes in graph theory.

PubMed

Fimmel, Elena; Michel, Christian J; Strüngmann, Lutz

2016-03-13

The circular code theory proposes that genes are constituted of two trinucleotide codes: the classical genetic code with 61 trinucleotides for coding the 20 amino acids (except the three stop codons {TAA,TAG,TGA}) and a circular code based on 20 trinucleotides for retrieving, maintaining and synchronizing the reading frame. It relies on two main results: the identification of a maximal C(3) self-complementary trinucleotide circular code X in genes of bacteria, eukaryotes, plasmids and viruses (Michel 2015 J. Theor. Biol. 380, 156-177. (doi:10.1016/j.jtbi.2015.04.009); Arquès & Michel 1996 J. Theor. Biol. 182, 45-58. (doi:10.1006/jtbi.1996.0142)) and the finding of X circular code motifs in tRNAs and rRNAs, in particular in the ribosome decoding centre (Michel 2012 Comput. Biol. Chem. 37, 24-37. (doi:10.1016/j.compbiolchem.2011.10.002); El Soufi & Michel 2014 Comput. Biol. Chem. 52, 9-17. (doi:10.1016/j.compbiolchem.2014.08.001)). The universally conserved nucleotides A1492 and A1493 and the conserved nucleotide G530 are included in X circular code motifs. Recently, dinucleotide circular codes were also investigated (Michel & Pirillo 2013 ISRN Biomath. 2013, 538631. (doi:10.1155/2013/538631); Fimmel et al. 2015 J. Theor. Biol. 386, 159-165. (doi:10.1016/j.jtbi.2015.08.034)). As the genetic motifs of different lengths are ubiquitous in genes and genomes, we introduce a new approach based on graph theory to study in full generality n-nucleotide circular codes X, i.e. of length 2 (dinucleotide), 3 (trinucleotide), 4 (tetranucleotide), etc. Indeed, we prove that an n-nucleotide code X is circular if and only if the corresponding graph [Formula: see text] is acyclic. Moreover, the maximal length of a path in [Formula: see text] corresponds to the window of nucleotides in a sequence for detecting the correct reading frame. Finally, the graph theory of tournaments is applied to the study of dinucleotide circular codes. 
This approach is fully equivalent to the combinatorics theory (Michel & Pirillo 2013 ISRN Biomath. 2013, 538631. (doi:10.1155/2013/538631)) and the group theory (Fimmel et al. 2015 J. Theor. Biol. 386, 159-165. (doi:10.1016/j.jtbi.2015.08.034)) of dinucleotide circular codes, while its mathematical approach is simpler. © 2016 The Author(s).
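The acyclicity criterion stated in this abstract can be illustrated with a short sketch. The abstract elides the graph definition ("[Formula: see text]"), so the construction below is an assumption: for every word and every split point, a directed edge is drawn from the prefix to the suffix. The function names are hypothetical and the example codes are toy inputs, not data from the paper.

```python
def code_graph(words):
    """Assumed construction: for each word and each split point,
    add a directed edge prefix -> suffix."""
    edges = {}
    for w in words:
        for i in range(1, len(w)):
            edges.setdefault(w[:i], set()).add(w[i:])
            edges.setdefault(w[i:], set())  # make sure the suffix is a vertex
    return edges


def is_acyclic(edges):
    """Depth-first search with three colours; reaching a grey vertex means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in edges}

    def visit(v):
        colour[v] = GREY
        for u in edges[v]:
            if colour[u] == GREY:
                return False
            if colour[u] == WHITE and not visit(u):
                return False
        colour[v] = BLACK
        return True

    return all(visit(v) for v in edges if colour[v] == WHITE)


def is_circular(words):
    """Apply the criterion from the abstract: circular iff the graph is acyclic."""
    return is_acyclic(code_graph(words))


if __name__ == "__main__":
    print(is_circular({"ACG"}))         # single primitive word
    print(is_circular({"ACG", "CGA"}))  # two circular permutations of one word
    print(is_circular({"AC", "CA"}))    # dinucleotide case
```

Under this assumed construction, a code containing a word together with one of its circular permutations produces a cycle (for example A -> CG -> A), which matches the intuition that such a code cannot fix a unique reading frame on a circle.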
Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species

PubMed Central

Liang, Xuanqiang; Chen, Xiaoping; Hong, Yanbin; Liu, Haiyan; Zhou, Guiyuan; Li, Shaoxiong; Guo, Baozhu

2009-01-01

Background Lack of sufficient molecular markers hinders current genetic research in peanuts (Arachis hypogaea L.). It is necessary to develop more molecular markers for potential use in peanut genetic research. With the development of peanut EST projects, a vast amount of available EST sequence data has been generated. These data offered an opportunity to identify SSR in ESTs by data mining. Results In this study, we investigated 24,238 ESTs for the identification and development of SSR markers. In total, 881 SSRs were identified from 780 SSR-containing unique ESTs. On an average, one SSR was found per 7.3 kb of EST sequence with tri-nucleotide motifs (63.9%) being the most abundant followed by di- (32.7%), tetra- (1.7%), hexa- (1.0%) and penta-nucleotide (0.7%) repeat types. The top six motifs included AG/TC (27.7%), AAG/TTC (17.4%), AAT/TTA (11.9%), ACC/TGG (7.72%), ACT/TGA (7.26%) and AT/TA (6.3%). Based on the 780 SSR-containing ESTs, a total of 290 primer pairs were successfully designed and used for validation of the amplification and assessment of the polymorphism among 22 genotypes of cultivated peanuts and 16 accessions of wild species. The results showed that 251 primer pairs yielded amplification products, of which 26 and 221 primer pairs exhibited polymorphism among the cultivated and wild species examined, respectively. Two to four alleles were found in cultivated peanuts, while 3–8 alleles presented in wild species. The apparent broad polymorphism was further confirmed by cloning and sequencing of amplified alleles. Sequence analysis of selected amplified alleles revealed that allelic diversity could be attributed mainly to differences in repeat type and length in the microsatellite regions. In addition, a few single base mutations were observed in the microsatellite flanking regions. 
Conclusion This study gives an insight into the frequency, type and distribution of peanut EST-SSRs and demonstrates successful development of EST-SSR markers in cultivated peanut. These EST-SSR markers could enrich the current resource of molecular markers for the peanut community and would be useful for qualitative and quantitative trait mapping, marker-assisted selection, and genetic diversity studies in cultivated peanut as well as related Arachis species. All of the 251 working primer pairs with names, motifs, repeat types, primer sequences, and alleles tested in cultivated and wild species are listed in Additional File 1. PMID:19309524
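The data-mining step described in this record (scanning EST sequences for SSRs and classifying them by motif length) can be sketched with a regular expression. This is a minimal illustration, not the authors' pipeline: the minimum repeat counts in MIN_REPEATS are hypothetical, since the abstract does not state its thresholds, and the test sequence is made up.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # One motif of length k followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. AGAG).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    est = "TT" + "AAG" * 6 + "CC" + "AG" * 7 + "T"  # made-up sequence
    for start, motif, copies in find_ssrs(est):
        print(start, motif, copies)
```

Dividing the total length of the scanned sequences by the number of hits gives a density figure of the kind quoted in the abstract (one SSR per so many kb), although the value depends strongly on the thresholds chosen.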

Structural studies of CNG repeats.

PubMed

Kiliszek, Agnieszka; Rypniewski, Wojciech

2014-07-01

CNG repeats (where N denotes one of the four natural nucleotides) are abundant in the human genome. Their tendency to undergo expansion can lead to hereditary diseases known as TREDs (trinucleotide repeat expansion disorders). The toxic factor can be protein, if the abnormal gene is expressed, or the gene transcript, or both. The gene transcripts have attracted much attention in the biomedical community, but their molecular structures have only recently been investigated. Model RNA molecules comprising CNG repeats fold into long hairpins whose stems generally conform to an A-type helix, in which the non-canonical N-N pairs are flanked by C-G and G-C pairs. Each homobasic pair is accommodated in the helical context in a unique manner, with consequences for the local helical parameters, solvent structure, electrostatic potential and potential to interact with ligands. The detailed three-dimensional profiles of RNA CNG repeats can be used in screening of compound libraries for potential therapeutics and in structure-based drug design. Here is a brief survey of the CNG structures published to date. © Published by Oxford University Press on behalf of Nucleic Acids Research.
Analysis of the Transcriptome of Erigeron breviscapus Uncovers Putative Scutellarin and Chlorogenic Acids Biosynthetic Genes and Genetic Markers

PubMed Central

Zhang, Jia-Jin; Shu, Li-Ping; Zhang, Wei; Long, Guang-Qiang; Liu, Tao; Meng, Zheng-Gui; Chen, Jun-Wen; Yang, Sheng-Chao

2014-01-01

Background Erigeron breviscapus (Vant.) Hand-Mazz. is a famous medicinal plant. Scutellarin and chlorogenic acids are the primary active components in this herb. However, the mechanisms of biosynthesis and regulation for scutellarin and chlorogenic acids in E. breviscapus are largely unknown. In addition, genomic information for this herb is unavailable. Principal Findings Using Illumina sequencing on the GAIIx platform, a total of 64,605,972 raw sequencing reads were generated and assembled into 73,092 non-redundant unigenes. Among them, 44,855 unigenes (61.37%) were annotated in the public databases Nr, Swiss-Prot, KEGG, and COG. The transcripts encoding the known enzymes involved in flavonoid and chlorogenic acid biosynthesis were discovered in the Illumina dataset. Three candidate cytochrome P450 genes were discovered which might encode flavone 6-hydroxylase converting apigenin to scutellarein. Furthermore, 4 unigenes encoding the homologues of maize P1 (R2R3-MYB transcription factors) were identified, which might regulate the biosynthesis of scutellarin. Additionally, a total of 11,077 simple sequence repeats (SSRs) were identified from 9,255 unigenes. Among the SSRs, tri-nucleotide motifs were the most abundant. Thirty-six primer pairs for SSRs were randomly selected for validation of the amplification and polymorphism. The results revealed that 34 (94.40%) primer pairs amplified successfully and 19 (52.78%) primer pairs exhibited polymorphisms. Conclusion Using next generation sequencing (NGS) technology, this study provides the first abundant genomic data for E. breviscapus. The candidate genes involved in the biosynthesis and transcriptional regulation of scutellarin and chlorogenic acids were obtained in this study. Additionally, a large number of genetic markers were generated by identification of SSRs, which are a powerful tool for molecular breeding and genetics applications in this herb. PMID:24956277
Analysis of the transcriptome of Erigeron breviscapus uncovers putative scutellarin and chlorogenic acids biosynthetic genes and genetic markers.

PubMed

Jiang, Ni-Hao; Zhang, Guang-Hui; Zhang, Jia-Jin; Shu, Li-Ping; Zhang, Wei; Long, Guang-Qiang; Liu, Tao; Meng, Zheng-Gui; Chen, Jun-Wen; Yang, Sheng-Chao

2014-01-01

Erigeron breviscapus (Vant.) Hand-Mazz. is a famous medicinal plant. Scutellarin and chlorogenic acids are the primary active components in this herb. However, the mechanisms of biosynthesis and regulation for scutellarin and chlorogenic acids in E. breviscapus are largely unknown. In addition, genomic information for this herb is unavailable. Using Illumina sequencing on the GAIIx platform, a total of 64,605,972 raw sequencing reads were generated and assembled into 73,092 non-redundant unigenes. Among them, 44,855 unigenes (61.37%) were annotated in the public databases Nr, Swiss-Prot, KEGG, and COG. The transcripts encoding the known enzymes involved in flavonoid and chlorogenic acid biosynthesis were discovered in the Illumina dataset. Three candidate cytochrome P450 genes were discovered which might encode flavone 6-hydroxylase converting apigenin to scutellarein. Furthermore, 4 unigenes encoding the homologues of maize P1 (R2R3-MYB transcription factors) were identified, which might regulate the biosynthesis of scutellarin. Additionally, a total of 11,077 simple sequence repeats (SSRs) were identified from 9,255 unigenes. Among the SSRs, tri-nucleotide motifs were the most abundant. Thirty-six primer pairs for SSRs were randomly selected for validation of the amplification and polymorphism. The results revealed that 34 (94.40%) primer pairs amplified successfully and 19 (52.78%) primer pairs exhibited polymorphisms. Using next generation sequencing (NGS) technology, this study provides the first abundant genomic data for E. breviscapus. The candidate genes involved in the biosynthesis and transcriptional regulation of scutellarin and chlorogenic acids were obtained in this study. Additionally, a large number of genetic markers were generated by identification of SSRs, which are a powerful tool for molecular breeding and genetics applications in this herb.
Treatment of Fragile X Syndrome with a Neuroactive Steroid

DTIC Science & Technology

2014-08-01

Figure 1) and GABA agonists (Figures 2 and 3). Currently, there are animal models of FXS that include the Fmr1-KO mouse and the Drosophila melanogaster ... the Drosophila (fruit fly) model of FXS that the GABAA system including multiple receptors is dramatically down-regulated. Ganaxolone is a drug that...810 males.14 The expansion of the trinucleotide sequence results in lowered FMRP levels. The premutation expansion results in a two- to eightfold
Microsatellite analysis and marker development in garlic: distribution in EST sequence, genetic diversity analysis, and marker transferability across Alliaceae.

PubMed

Barboza, Karina; Beretta, Vanesa; Kozub, Perla C; Salinas, Cecilia; Morgenfeld, Mauro M; Galmarini, Claudio R; Cavagnaro, Pablo F

2018-04-28

Allium vegetables, such as garlic and onion, have understudied genomes and limited molecular resources, hindering advances in genetic research and breeding of these species. In this study, we characterized and compared the simple sequence repeat (SSR) landscape in the transcriptomes of garlic and related Allium (A. cepa, A. fistulosum, and A. tuberosum) and non-Allium monocot species. In addition, 110 SSR markers were developed from garlic ESTs, and they were characterized, along with 112 previously developed SSRs, at various levels, including transferability across Alliaceae species and their usefulness for genetic diversity analysis. Among the Allium species analyzed, garlic ESTs had the highest overall SSR density, the lowest frequency of trinucleotides, and the highest frequency of di- and tetranucleotides. When compared to more distantly related monocots, outside the Asparagales order, it was evident that ESTs of Allium species shared major commonalities with regard to SSR density, frequency distribution, sequence motifs, and GC content. A significant fraction of the SSR markers were successfully transferred across Allium species, including crops for which no SSR markers have been developed yet, such as leek, shallot, chives, and elephant garlic. Diversity analysis of garlic cultivars with selected SSRs revealed 36 alleles, with 2-5 alleles/locus, and PIC = 0.38. Cluster analysis grouped the accessions according to their flowering behavior, botanical variety, and ecophysiological characteristics. Results from this study contribute to the characterization of Allium transcriptomes. The new SSR markers developed, along with the data from the polymorphism and transferability analyses, will aid genetic research and breeding in garlic and other Allium species.
Instability of expanded CAG/CAA repeats in spinocerebellar ataxia type 17.

PubMed

Gao, Rui; Matsuura, Tohru; Coolbaugh, Mary; Zühlke, Christine; Nakamura, Koichiro; Rasmussen, Astrid; Siciliano, Michael J; Ashizawa, Tetsuo; Lin, Xi

2008-02-01

Trinucleotide repeat expansions are dynamic mutations causing many neurological disorders, and their instability is influenced by multiple factors. Repeat configuration seems particularly important, and pure repeats are thought to be more unstable than interrupted repeats, but direct evidence is still lacking. Here, we present strong support for this hypothesis from our studies on spinocerebellar ataxia type 17 (SCA17). SCA17 is a typical polyglutamine disease caused by CAG repeat expansion in TBP (TATA binding protein), and is unique in that the pure expanded polyglutamine tract is coded by either a simple configuration with long stretches of pure CAGs or a complex configuration containing CAA interruptions. By small pool PCR (SP-PCR) analysis of blood DNA from SCA17 patients of distinct racial backgrounds, we quantitatively assessed the instability of these two types of expanded alleles coding similar lengths of polyglutamine expansion. The mutation frequency in patients harboring pure CAG repeats is two to three times that in patients with CAA interruptions. Interestingly, the pure CAG repeats showed both expansion and deletion, while the interrupted repeats exhibited mostly deletion at a significantly lower frequency. These data strongly suggest that repeat configuration is a critical determinant of instability, and that CAA interruptions might serve as a limiting element for further expansion of CAG repeats in the SCA17 locus, suggesting a molecular basis for the lack of anticipation in SCA17 families with interrupted CAG expansion.
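The distinction this abstract draws between pure CAG tracts and tracts with CAA interruptions (both codons encode glutamine) can be made concrete with a small sketch. The function name and the example tracts are hypothetical illustrations, not patient alleles from the study.

```python
def describe_polyq_tract(tract):
    """Summarise a CAG/CAA tract read codon by codon in frame."""
    tract = tract.upper()
    if len(tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]
    if any(c not in ("CAG", "CAA") for c in codons):
        raise ValueError("tract must consist of CAG and CAA codons only")

    # Track the longest uninterrupted run of CAG codons.
    longest = run = 0
    for c in codons:
        run = run + 1 if c == "CAG" else 0
        longest = max(longest, run)

    interruptions = codons.count("CAA")
    return {
        "glutamines": len(codons),
        "caa_interruptions": interruptions,
        "longest_pure_cag_run": longest,
        "configuration": "pure" if interruptions == 0 else "interrupted",
    }


if __name__ == "__main__":
    # Two made-up tracts encoding the same number of glutamines.
    print(describe_polyq_tract("CAG" * 45))
    print(describe_polyq_tract("CAG" * 20 + "CAA" + "CAG" * 24))
```

Both example tracts encode 45 glutamines, so they would be indistinguishable at the protein level; only the DNA configuration differs, which is the variable the study relates to instability.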
[DNA marker-assisted selection of medicinal plants (Ⅰ) .Breeding research of disease-resistant cultivars of Panax notoginseng].

PubMed

Li, Qing; Li, Biao; Guo, Shun-Xing

2017-01-01

SSR is one of the most important molecular markers used in molecular identification and genetic diversity research of Dendrobium nobile. In order to enrich the library of SSRs and establish a method for rapid identification of D. nobile, the SSR information in the transcriptome of D. nobile was analyzed. A total of 32 709 SSRs were obtained from the transcriptome of D. nobile, distributed in 26 742 unigenes with a distribution frequency of 12.90%. SSR loci occurred every 3 748 bp. Mono-nucleotide repeats were the main type, accounting for as much as 72.18% of all SSRs, followed by di-nucleotide (15.97%) and tri-nucleotide (11.19%) repeats. Among all repeat types, A/T was the predominant one, followed by AG/CT. Finally, a total of 62 157 primer pairs were designed for marker development. Twenty pairs of primers were randomly selected for PCR amplification; 17 amplified clear and reproducible bands, an amplification rate of 85.0%. Thirteen pairs were polymorphic among the 3 Dendrobium plants. The results indicated that the unigenes generated from transcriptome sequencing in D. nobile can be used as an effective source to develop SSR markers. The SSR loci in the transcriptome of D. nobile are characterized by rich types, high density and high potential for polymorphism, and these characteristics might be applied in the study of molecular identification, genetic diversity and marker-assisted breeding of D. nobile and its closely related species. Copyright© by the Chinese Pharmaceutical Association.
[Copy number variation of trinucleotide repeat in dynamic mutation sites of autosomal dominant cerebellar ataxias related genes].

PubMed

Chen, Pu; Ma, Mingyi; Shang, Huifang; Su, Dan; Zhang, Sizhong; Yang, Yuan

2009-12-01

To standardize the experimental procedure of the gene test for autosomal dominant cerebellar ataxias (ADCA), and provide the basis for quantitative criteria of the dynamic mutation of spinocerebellar ataxia (SCA) genes in the Chinese population. Genotyping of the dynamic mutation loci of the SCA1, SCA2, SCA3, SCA6 and SCA7 genes was performed, using fluorescence PCR-capillary electrophoresis followed by DNA sequencing, to investigate the variation range of copy number of the CAG tandem repeat of the genes in 263 probands of ADCA pedigrees and 261 non-related normal controls. Based on the sequencing result, the bias of the CAG copy number estimation using capillary electrophoresis with different DNA controls was compared to analyze the technical details of the electrophoresis method in testing the dynamic mutation sites. PCR products containing dynamic mutation loci of the SCA genes showed significantly higher mobility than that of the molecular weight marker with relatively balanced GC content. This was particularly obvious in the SCA2, SCA6 and SCA7 genes, whereas the deviation of copy number could be corrected to +/-1 when known CAG copy number fragments were used as controls. The mobility of PCR products was primarily related to the copy number of the CAG repeat when the fragments contained a normal CAG repeat. In the 263 ADCA pedigrees, 6 (2.28%) carried the SCA1 gene mutation, 8 (3.04%) had the SCA2 mutation and 81 (30.80%) harbored the SCA3 mutation. The gene mutation of SCA6 and SCA7 was not found. The normal variation range of the CAG repeat was 17-36 copies in the SCA1 gene, 13-30 copies in SCA2, 14-39 copies in SCA3, 6-16 copies in SCA6 and 6-13 copies in SCA7. The heterozygosity was 76.1%, 17.7%, 74.4%, 72.1% and 41.3%, respectively. The mutation range of the CAG repeat was 49-56 copies in the SCA1 gene, 36-41 copies in SCA2, and 59-81 copies in SCA3. Neither homozygous mutation of an SCA gene nor double heterozygous mutation of the SCA genes was observed in the study. 
The copy number of the CAG repeat in SCA genes could be calculated accurately based on the result of fluorescence PCR-capillary electrophoresis when a limited number of controls with known repeat copy number were used. Our results supported the notion that SCA3 gene mutation was the most common cause of ADCA, and the obtained data would be helpful for establishing quantitative criteria of the dynamic mutation of the SCA genes in the Chinese population.
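The idea of correcting electrophoresis-based size estimates with controls of known CAG copy number can be sketched as a simple linear calibration. This is one common way to do such a correction and is an assumption here; the abstract does not describe the authors' exact procedure. The function names and all numeric values below are hypothetical.

```python
def fit_calibration(controls):
    """Least-squares line mapping apparent fragment size (bp) to CAG copies.

    controls: list of (apparent_size_bp, known_cag_copies) pairs,
    where the copy numbers were established by sequencing.
    """
    n = len(controls)
    mean_x = sum(x for x, _ in controls) / n
    mean_y = sum(y for _, y in controls) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in controls)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in controls)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_copies(apparent_size_bp, slope, intercept):
    """Round the calibrated estimate to the nearest whole repeat."""
    return round(slope * apparent_size_bp + intercept)


if __name__ == "__main__":
    # Made-up controls: 3 bp per repeat plus a 90 bp flanking region.
    controls = [(150, 20), (180, 30), (210, 40)]
    slope, intercept = fit_calibration(controls)
    print(estimate_copies(195, slope, intercept))
```

Because the calibration is fitted on repeat-containing fragments that migrate like the samples, it absorbs the systematic mobility shift that a GC-balanced size marker would not capture.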
Base-Pairing Energies of Protonated Nucleoside Base Pairs of dCyd and m5dCyd: Implications for the Stability of DNA i-Motif Conformations

NASA Astrophysics Data System (ADS)

Yang, Bo; Rodgers, M. T.

2015-08-01

Hypermethylation of cytosine in expanded (CCG)n•(CGG)n trinucleotide repeats results in Fragile X syndrome, the most common cause of inherited mental retardation. The (CCG)n•(CGG)n repeats adopt i-motif conformations that are preferentially stabilized by base-pairing interactions of protonated base pairs of cytosine. Here we investigate the effects of 5-methylation and the sugar moiety on the base-pairing energies (BPEs) of protonated cytosine base pairs by examining protonated nucleoside base pairs of 2'-deoxycytidine (dCyd) and 5-methyl-2'-deoxycytidine (m5dCyd) using threshold collision-induced dissociation techniques. 5-Methylation of a single or both cytosine residues leads to very small change in the BPE. However, the accumulated effect may be dramatic in diseased state trinucleotide repeats where many methylated base pairs may be present. The BPEs of the protonated nucleoside base pairs examined here significantly exceed those of Watson-Crick dGuo•dCyd and neutral dCyd•dCyd base pairs, such that these base-pairing interactions provide the major forces responsible for stabilization of DNA i-motif conformations. Compared with isolated protonated nucleobase pairs of cytosine and 1-methylcytosine, the 2'-deoxyribose sugar produces an effect similar to the 1-methyl substituent, and leads to a slight decrease in the BPE. These results suggest that the base-pairing interactions may be slightly weaker in nucleic acids, but that the extended backbone is likely to exert a relatively small effect on the total BPE. The proton affinity (PA) of m5dCyd is also determined by competitive analysis of the primary dissociation pathways that occur in parallel for the protonated (m5dCyd)H+(dCyd) nucleoside base pair and the absolute PA of dCyd previously reported.
New primer for specific amplification of the CAG repeat in Huntington disease alleles

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bond, C.E.; Hodes, M.E.

1994-09-01

Huntington disease is an autosomal dominant neurodegenerative disorder caused by an expansion of a CAG trinucleotide repeat near the 5′ end of the gene for Huntington disease (IT15). The CAG repeat is flanked by a variable-length CCG repeat that is included in the amplification product obtained with most currently used primer sets and PCR protocols. Inclusion of this adjacent CCG repeat complicates the accurate assessment of CAG repeat length and interferes with the genotype determination of those individuals carrying alleles in the intermediate range between normal and expanded sizes. Due to the GC-rich nature of this region, attempts at designing a protocol for amplification of only the CAG repeat have proved unreliable and difficult to execute. We report here the development of a compatible primer set and PCR protocol that yields consistent amplification of the CAG-repeat region. PCR products can be visualized in ethidium bromide-stained agarose gels for rapid screening or in 6% polyacrylamide gels for determination of exact repeat length. This assay produces bands that can be sized accurately, while eliminating most nonspecific products. Fifty-five specimens examined showed consistency with another well-known method, but one that amplifies the CCG repeats as well. The results we obtained also matched the known carrier status of the donors.
Oligonucleotides targeting TCF4 triplet repeat expansion inhibit RNA foci and mis-splicing in Fuchs' dystrophy.

PubMed

Hu, Jiaxin; Rong, Ziye; Gong, Xin; Zhou, Zhengyang; Sharma, Vivek K; Xing, Chao; Watts, Jonathan K; Corey, David R; Mootha, V Vinod

2018-03-15

Fuchs' endothelial corneal dystrophy (FECD) is the most common repeat expansion disorder. FECD impacts 4% of the U.S. population and is the leading indication for corneal transplantation. Most cases are caused by an expanded intronic CUG tract in the TCF4 gene that forms nuclear foci, sequesters splicing factors and impairs splicing. We investigated the sense and antisense RNA landscape at the FECD gene and found that the sense-expanded repeat transcript is the predominant species in patient corneas. In patient tissue, the number of sense foci was negatively correlated with age and showed no correlation with sex. Each endothelial cell has ∼2 sense foci and each focus is a single RNA molecule. We designed antisense oligonucleotides (ASOs) to target the mutant-repetitive RNA and demonstrated potent inhibition of foci in patient-derived cells. Ex vivo treatment of FECD human corneas effectively inhibits foci and reverses pathological changes in splicing. FECD has the potential to be a model for treating many trinucleotide repeat diseases, and targeting the TCF4 expansion with ASOs represents a promising therapeutic strategy to prevent and treat FECD.
Comparative Transcriptome Analysis of Male and Female Conelets and Development of Microsatellite Markers in Pinus bungeana, an Endemic Conifer in China

PubMed Central

Duan, Dong; Jia, Yun; Yang, Jie; Li, Zhong-Hu

2017-01-01

The sex determination in gymnosperms is still poorly characterized due to the lack of genomic/transcriptome resources and useful molecular genetic markers. To enhance our understanding of the molecular mechanisms of the determination of sexual recognition of reproductive structures in conifers, the transcriptome of male and female conelets were characterized in a Chinese endemic conifer species, Pinus bungeana Zucc. ex Endl. The 39.62 Gb high-throughput sequencing reads were obtained from two kinds of sexual conelets. After de novo assembly of the obtained reads, 85,305 unigenes were identified, 53,944 (63.23%) of which were annotated with public databases. A total of 12,073 differentially expressed genes were detected between the two types of sexes in P. bungeana, and 5766 (47.76%) of them were up-regulated in females. The Kyoto Encyclopedia of Genes and Genomes (KEGG) enriched analysis suggested that some of the genes were significantly associated with the sex determination process of P. bungeana, such as those involved in tryptophan metabolism, zeatin biosynthesis, and cysteine and methionine metabolism, and the phenylpropanoid biosynthesis pathways. Meanwhile, some important plant hormone pathways (e.g., the gibberellin (GA) pathway, carotenoid biosynthesis, and brassinosteroid biosynthesis (BR) pathway) that affected sexual determination were also induced in P. bungeana. In addition, 8791 expressed sequence tag-simple sequence repeats (EST-SSRs) from 7859 unigenes were detected in P. bungeana. The most abundant repeat types were dinucleotides (1926), followed by trinucleotides (1711). The dominant classes of the sequence repeat were A/T (4942) in mononucleotides and AT/AT (1283) in dinucleotides. Among these EST-SSRs, 84 pairs of primers were randomly selected for the characterization of potential molecular genetic markers. Finally, 19 polymorphic EST-SSR primers were characterized. 
We found low to moderate levels of genetic diversity (NA = 1.754; HO = 0.206; HE = 0.205) across natural populations of P. bungeana. The cluster analysis revealed two distinct genetic groups for the six populations that were sampled in this endemic species, which might be caused by the fragmentation of habitats and long-term geographic isolation among different populations. Taken together, this work provides important insights into the molecular mechanisms of sexual identity in the reproductive organs of P. bungeana. The molecular genetic resources that were identified in this study will also facilitate further studies in functional genomics and population genetics in the Pinus species. PMID:29257091
Glial response to polyglutamine-mediated stress

PubMed Central

Vig, Parminder J.S.; Shao, Qingmei; Lopez, Maripar E

2009-01-01

Neurodegenerative trinucleotide (CAG) repeat disorders are caused by the expansion of polyglutamine tracts within the disease proteins. Some of these proteins have an unknown function. How expanded polyglutamine causes target neurons to degenerate is not clear. Recent evidence suggests that intercellular miscommunication may contribute to polyglutamine pathogenesis in CAG repeat disorders. Polyglutamine-induced degeneration of the target neuron can be mediated via glia-neuron interactions. Here we hypothesize that, during the neurodegenerative process, the failure of cell-cell interactions has more severe consequences than alterations in intracellular neuron biology. We further believe that bidirectional communication between neurons and glia is a prerequisite for the normal development and function of either cell type. Understanding intercellular signaling mechanisms such as glial trophic factors and their receptors, cell adhesion or other well-defined signaling molecules provides opportunities for developing potential therapies. PMID:20046986
A genetic scale of reading frame coding.

PubMed

Michel, Christian J

2014-08-21

The reading frame coding (RFC) of codes (sets) of trinucleotides is a genetic concept which has been largely ignored during the last 50 years. A first objective is the definition of a new and simple statistical parameter PrRFC for analysing the probability (efficiency) of reading frame coding (RFC) of any trinucleotide code. A second objective is to reveal different classes and subclasses of trinucleotide codes involved in reading frame coding: the circular codes of 20 trinucleotides and the bijective genetic codes of 20 trinucleotides coding the 20 amino acids. This approach allows us to propose a genetic scale of reading frame coding which ranges from 1/3 with the random codes (RFC probability identical in the three frames) to 1 with the comma-free circular codes (RFC probability maximal in the reading frame and null in the two shifted frames). This genetic scale shows, in particular, the reading frame coding probabilities of the 12,964,440 circular codes (PrRFC=83.2% in average), the 216 C(3) self-complementary circular codes (PrRFC=84.1% in average) including the code X identified in eukaryotic and prokaryotic genes (PrRFC=81.3%) and the 339,738,624 bijective genetic codes (PrRFC=61.5% in average) including the 52 codes without permuted trinucleotides (PrRFC=66.0% in average). Otherwise, the reading frame coding probabilities of each trinucleotide code coding an amino acid with the universal genetic code are also determined. The four amino acids Gly, Lys, Phe and Pro are coded by codes (not circular) with RFC probabilities equal to 2/3, 1/2, 1/2 and 2/3, respectively. The amino acid Leu is coded by a circular code (not comma-free) with a RFC probability equal to 18/19. The 15 other amino acids are coded by comma-free circular codes, i.e. with RFC probabilities equal to 1. The identification of coding properties in some classes of trinucleotide codes studied here may bring new insights in the origin and evolution of the genetic code. 
Copyright © 2014 Elsevier Ltd. All rights reserved.
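The comma-free property at the top of the scale described in this abstract can be checked directly for small trinucleotide codes. The sketch below uses the standard definition (no word of the code may appear in either shifted frame of any two concatenated words) and the codon sets of the universal genetic code; the function name is hypothetical, and the PrRFC statistic itself is not reproduced because the abstract does not define it.

```python
from itertools import product


def is_comma_free(code):
    """True if no word of the code occurs in a shifted frame of any word pair."""
    code = set(code)
    for u, v in product(code, repeat=2):
        pair = u + v
        # Frames shifted by one and by two nucleotides.
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


if __name__ == "__main__":
    codes = {
        "Met": {"ATG"},
        "Asp": {"GAC", "GAT"},
        "Leu": {"CTA", "CTC", "CTG", "CTT", "TTA", "TTG"},
        "Gly": {"GGA", "GGC", "GGG", "GGT"},
    }
    for amino_acid, codons in codes.items():
        print(amino_acid, is_comma_free(codons))
```

For Leu, the pair CTC followed by TTA contains CTT in a shifted frame, which is why that code is not comma-free; for Gly, the periodic codon GGG already rules out circularity, consistent with the classification given in the abstract.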
The Role of the Immune System in Triplet Repeat Expansion Diseases

PubMed Central

Urbanek, Martyna O.; Krzyzosiak, Wlodzimierz J.

2015-01-01

Trinucleotide repeat expansion disorders (TREDs) are a group of dominantly inherited neurological diseases caused by the expansion of unstable repeats in specific regions of the associated genes. Expansion of CAG repeat tracts in translated regions of the respective genes results in polyglutamine- (polyQ-) rich proteins that form intracellular aggregates that affect numerous cellular activities. Recent evidence suggests the involvement of an RNA toxicity component in polyQ expansion disorders, thus increasing the complexity of the pathogenic processes. Neurodegeneration, accompanied by reactive gliosis and astrocytosis is the common feature of most TREDs, which may suggest involvement of inflammation in pathogenesis. Indeed, a number of immune response markers have been observed in the blood and CNS of patients and mouse models, and the activation of these markers was even observed in the premanifest stage of the disease. Although inflammation is not an initiating factor of TREDs, growing evidence indicates that inflammatory responses involving astrocytes, microglia, and the peripheral immune system may contribute to disease progression. Herein, we review the involvement of the immune system in the pathogenesis of triplet repeat expansion diseases, with particular emphasis on polyglutamine disorders. We also present various therapeutic approaches targeting the dysregulated inflammation pathways in these diseases. PMID:25873774
Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

PubMed Central

Perumal, Ramasamy; Nimmakayala, Padmavathi; Erattaimuthu, Saradha R; No, Eun-Gyu; Reddy, Umesh K; Prom, Louis K; Odvody, Gary N; Luster, Douglas G; Magill, Clint W

2008-01-01

Background A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%), and GA/AG or CT/TC types account for 15% of our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34 Peronosclerospora, Peronospora and Sclerospora spp. isolates studied. Cluster analysis by UPGMA as well as principal coordinate analysis (PCA) grouped the 34 isolates into three distinct groups (all 19 isolates of Peronosclerospora sorghi in cluster I, five isolates of P. maydis and three isolates of P. sacchari in cluster II and five isolates of Sclerospora graminicola in cluster III). Conclusion To our knowledge, this is the first attempt to extensively develop SSR markers from Peronosclerospora genomic DNA. The newly developed SSR markers can be readily used to distinguish isolates within several species of the oomycetes that cause downy mildew diseases. Also, microsatellite fragments likely include retrotransposon regions of DNA and these sequences can serve as useful genetic markers for strain identification, due to their degree of variability and their widespread occurrence among sorghum, maize, sugarcane, pearl millet and rose downy mildew isolates. PMID:19040756
Genetic modifiers of Huntington's disease.

PubMed

Gusella, James F; MacDonald, Marcy E; Lee, Jong-Min

2014-09-15

Huntington's disease (HD) is a devastating neurodegenerative disorder that directly affects more than 1 in 10,000 persons in Western societies but, as a family disorder with a long, costly, debilitating course, it has an indirect impact on a far greater proportion of the population. Although some palliative treatments are used, no effective treatment exists for preventing clinical onset of the disorder or for delaying its inevitable progression toward premature death, approximately 15 years after diagnosis. Huntington's disease involves a movement disorder characterized by chorea, as well as a variety of psychiatric disturbances and intellectual decline, with a gradual loss of independence. A dire need exists for effective HD therapies to alleviate the suffering and costs to the individual, family, and health care system. In past decades, genetics, the study of DNA sequence variation and its consequences, provided the tools to map the HD gene to chromosome 4 and ultimately to identify its mutation as an expanded CAG trinucleotide repeat in the coding sequence of a large protein, dubbed huntingtin. Now, advances in genetic technology offer an unbiased route to the identification of genetic factors that are disease-modifying agents in human patients. Such genetic modifiers are expected to highlight processes capable of altering the course of HD and therefore to provide new, human-validated targets for traditional drug development, with the goal of developing rational treatments to delay or prevent onset of HD clinical signs. © 2014 International Parkinson and Movement Disorder Society.
Tissue-specific mismatch repair protein expression: MSH3 is higher than MSH6 in multiple mouse tissues.

PubMed

Tomé, Stéphanie; Simard, Jodie P; Slean, Meghan M; Holt, Ian; Morris, Glenn E; Wojciechowicz, Kamila; te Riele, Hein; Pearson, Christopher E

2013-01-01

Mismatch repair (MMR) proteins have critical roles in the maintenance of genomic stability, both class-switch recombination and somatic hypermutation of immunoglobulin genes and disease-associated trinucleotide repeat expansions. In the genetic absence of MMR, certain tissues are predisposed to mutations and cancer. MMR proteins are involved in various functions including protection from replication-associated and non-mitotic mutations, as well as driving programmed and deleterious mutations, including disease-causing trinucleotide repeat expansions. Here we have assessed the levels of MSH2, MSH3, and MSH6 expression in a large number of murine tissues by transcript analysis and simultaneous Western blotting. We observed that MMR expression patterns varied widely between 14 different tissue types, but did not vary with age (13-84 weeks). MMR protein expression is highest in testis, thymus and spleen and lowest in pancreas, quadriceps and heart, with intermediate levels in liver, kidney, intestine, colon, cortex, striatum and cerebellum. By equalizing antibody signal intensity to represent levels found in mMutSα and mMutSβ purified proteins, we observed that mMSH3 protein levels are greater than mMSH6 levels in the multiple tissues analyzed, with more MSH6 in proliferating tissues. In the intestinal epithelium MSH3 and MSH6 are more highly expressed in the proliferative undifferentiated cells of the crypts than in the differentiated villi cells, as reported for MSH2. This finding correlates with the higher level of MMR expression in highly proliferative mouse tissues such as the spleen and thymus. The relative MMR protein expression levels may explain the functional and tissue-specific reliance upon the roles of each MMR protein. Crown Copyright © 2012. Published by Elsevier B.V. All rights reserved.
No CAG repeat expansion of polymerase gamma is associated with male infertility in Tamil Nadu, South India

PubMed Central

Poongothai, J.

2013-01-01

Mitochondria contain a single deoxyribonucleic acid (DNA) polymerase, polymerase gamma (POLG), mapped to the long arm of chromosome 15 (15q25) and responsible for replication and repair of mitochondrial DNA. Exon 1 of the human POLG gene contains a CAG trinucleotide repeat, which codes for polyglutamine. The allele with ten CAG repeats was found at a uniformly high frequency (0.88) in different ethnic groups and is considered the common allele, whereas the mutant genotype (not-10/not-10 CAG repeats) was found to be associated with oligospermia/oligoasthenospermia in male infertility. Recent data suggested the implication of POLG CAG repeat expansion in infertility, but this is debated. The aim of our study was to explore whether the not-10/not-10 variant is associated with spermatogenic failure. As few studies on Indian populations have been conducted so far to support this view, we investigated the distribution of the POLG CAG repeats in 61 infertile men and 60 normozoospermic control Indian men of Tamil Nadu, from the same ethnic background. This analysis revealed that the homozygous wild-type genotype (10/10) was common in both infertile men (77%, 47/61) and normozoospermic control men (71.7%, 43/60). Our study failed to confirm any influence of the POLG gene polymorphism on the efficiency of spermatogenesis. PMID:24339545
CasA mediates Cas3-catalyzed target degradation during CRISPR RNA-guided interference.

PubMed

Hochstrasser, Megan L; Taylor, David W; Bhat, Prashant; Guegler, Chantal K; Sternberg, Samuel H; Nogales, Eva; Doudna, Jennifer A

2014-05-06

In bacteria, the clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) DNA-targeting complex Cascade (CRISPR-associated complex for antiviral defense) uses CRISPR RNA (crRNA) guides to bind complementary DNA targets at sites adjacent to a trinucleotide signature sequence called the protospacer adjacent motif (PAM). The Cascade complex then recruits Cas3, a nuclease-helicase that catalyzes unwinding and cleavage of foreign double-stranded DNA (dsDNA) bearing a sequence matching that of the crRNA. Cascade comprises the CasA-E proteins and one crRNA, forming a structure that binds and unwinds dsDNA to form an R loop in which the target strand of the DNA base pairs with the 32-nt RNA guide sequence. Single-particle electron microscopy reconstructions of dsDNA-bound Cascade with and without Cas3 reveal that Cascade positions the PAM-proximal end of the DNA duplex at the CasA subunit and near the site of Cas3 association. The finding that the DNA target and Cas3 colocalize with CasA implicates this subunit in a key target-validation step during DNA interference. We show biochemically that base pairing of the PAM region is unnecessary for target binding but critical for Cas3-mediated degradation. In addition, the L1 loop of CasA, previously implicated in PAM recognition, is essential for Cas3 activation following target binding by Cascade. Together, these data show that the CasA subunit of Cascade functions as an essential partner of Cas3 by recognizing DNA target sites and positioning Cas3 adjacent to the PAM to ensure cleavage.

The Wilson films--Huntington's chorea.

PubMed

Klein, Christine

2011-12-01

Wilson's Queen Square Case 9 with Huntington's chorea shows a 68-year-old man with mild to moderate generalized chorea, impaired fixation, and probable cognitive decline in keeping with a diagnosis of Huntington's disease (HD). An age of onset in the late sixties and a negative family history suggest a relatively small expanded trinucleotide repeat in the HTT gene in the patient and reduced penetrance of an even shorter repeat allele in one of his parents. A highly sensitive and specific gene test has been offered worldwide for diagnostic testing of HD for almost two decades. This test, obviously unavailable in Wilson's time, became the historic frontrunner for guidelines of symptomatic, presymptomatic, and prenatal testing for an adult-onset neurodegenerative disorder. Regarding treatment of HD, however, we are still awaiting the successful translation of research results into the development of effective cause-directed, neuropreventive and neurorestorative therapies. Copyright © 2011 Movement Disorder Society.
Psychiatric and autistic comorbidity in fragile X syndrome across ages.

PubMed

Gabis, Lidia V; Baruch, Yael Kesner; Jokel, Ariela; Raz, Raanan

2011-08-01

Fragile X syndrome is caused by CGG trinucleotide repeat expansion within the fragile X mental retardation 1 gene, when repeat number exceeds 200. The typical psychiatric profile of fragile X syndrome patients includes cognitive and behavioral deficits, psychiatric comorbidity, and autistic characteristics. Specific psychiatric features have not yet been clarified, specifically in relationship to age and genetic characteristics. The objective of this study was to characterize psychiatric comorbidities in subjects with fragile X syndrome at different ages. Subjects with fragile X syndrome and their unaffected siblings were recruited and their parents filled out functional-behavioral and psychiatric comorbidities questionnaires. Adolescents with fragile X syndrome showed decreased prevalence of functional-behavioral deficits. Incidence and severity of most psychiatric comorbidities were lower in older subjects. Incidence of generalized anxiety disorder increased with age in the fragile X syndrome group. The typical profile of patients with fragile X syndrome changes with age. Unaffected siblings exhibit anxiety and motor tics.
Somatic frameshift mutations in the Bloom syndrome BLM gene are frequent in sporadic gastric carcinomas with microsatellite mutator phenotype

PubMed Central

Calin, George; Ranzani, Guglielmina N; Amadori, Dino; Herlea, Vlad; Matei, Irina; Barbanti-Brodano, Giuseppe; Negrini, Massimo

2001-01-01

Background Genomic instability has been reported at microsatellite tracts in few coding sequences. We have shown that the Bloom syndrome BLM gene may be a target of microsatellite instability (MSI) in a short poly-adenine repeat located in its coding region. To further characterize the involvement of BLM in tumorigenesis, we have investigated mutations in nine genes containing coding microsatellites in microsatellite mutator phenotype (MMP) positive and negative gastric carcinomas (GCs). Methods We analyzed 50 gastric carcinomas (GCs) for mutations in the BLM poly(A) tract as well as in the coding microsatellites of the TGFβ1-RII, IGFIIR, hMSH3, hMSH6, BAX, WRN, RECQL and CBL genes. Results BLM mutations were found in 27% of MMP+ GCs (4/15 cases) but not in any of the MMP negative GCs (0/35 cases). The frequency of mutations in the other eight coding-region microsatellites was as follows: TGFβ1-RII (60%), BAX (27%), hMSH6 (20%), hMSH3 (13%), CBL (13%), IGFIIR (7%), RECQL (0%) and WRN (0%). Mutations in BLM appear to be more frequently associated with frameshifts in BAX and in hMSH6 and/or hMSH3. Tumors with BLM alterations present a higher frequency of unstable mono- and trinucleotide repeats located in coding regions as compared with mutator phenotype tumors without BLM frameshifts. Conclusions BLM frameshifts are frequent alterations in GCs specifically associated with MMP+ tumors. We suggest that BLM loss of function by MSI may increase the genetic instability of a pre-existent unstable genotype in gastric tumors. PMID:11532193
The heart and cardiac pacing in Steinert disease

PubMed Central

NIGRO, GERARDO; PAPA, ANDREA ANTONIO; POLITANO, LUISA

2012-01-01

Myotonic dystrophy (Dystrophia Myotonica, DM) is the most frequently inherited neuromuscular disease of adult life. It is a multisystemic disease with major cardiac involvement. Core features of myotonic dystrophy are myotonia, muscle weakness, cataract, respiratory failure and cardiac conduction abnormalities. Classical DM, first described by Steinert and called Steinert's disease or DM1 (Dystrophia Myotonica type 1) has been identified as an autosomal dominant disorder associated with the presence of an abnormal expansion of a CTG trinucleotide repeat in the 3' untranslated region of DMPK gene on chromosome 19. This review will mainly focus on the various aspects of cardiac involvement in DM1 patients and the current role of cardiac pacing in their treatment. PMID:23097601
Learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in Saccharomyces cerevisiae and Homo sapiens.

PubMed

Reynolds, Sheila M; Bilmes, Jeff A; Noble, William Stafford

2010-07-08

DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model.
Msh3 is a limiting factor in the formation of intergenerational CTG expansions in DM1 transgenic mice.

PubMed

Foiry, Laurent; Dong, Li; Savouret, Cédric; Hubert, Laurence; te Riele, Hein; Junien, Claudine; Gourdon, Geneviève

2006-06-01

The CTG repeat involved in myotonic dystrophy is one of the most unstable trinucleotide repeats. However, the molecular mechanisms underlying this particular form of genetic instability, which is biased towards expansions, have not yet been completely elucidated. We previously showed, with highly unstable CTG repeat arrays in DM1 transgenic mice, that Msh2 is required for the formation of intergenerational and somatic expansions. To identify the partners of Msh2 in the formation of intergenerational CTG repeat expansions, we investigated the involvement of Msh3 and Msh6, partners of Msh2 in mismatch repair. Transgenic mice with CTG expansions were crossed with Msh3- or Msh6-deficient mice and CTG repeats were analysed after maternal and paternal transmissions. We demonstrated that Msh3, but not Msh6, also plays a key role in the formation of expansions over successive generations. Furthermore, the absence of one Msh3 allele was sufficient to decrease the formation of expansions, indicating that Msh3 is rate-limiting in this process. In the absence of Msh6, the frequency of expansions decreased only in maternal transmissions. However, the significantly lower levels of Msh2 and Msh3 proteins in Msh6 -/- ovaries suggest that the absence of Msh6 may have an indirect effect.
A South African family with oculopharyngeal muscular dystrophy: Clinical and molecular genetic characteristics.

PubMed

Schutte, Clara Maria; Dorfling, Cecelia M; van Coller, Riaan; Honey, Engela M; van Rensburg, Elizabeth Jansen

2015-09-21

Autosomal dominantly inherited oculopharyngeal muscular dystrophy (OPMD) is caused by a trinucleotide repeat expansion in exon 1 of the polyadenylate binding protein nuclear 1 (PABPN1) gene on chromosome 14q. A large family with OPMD was recently identified in Pretoria, South Africa (SA). Molecular studies revealed a (GCG)11(GCA)3GCG or (GCN)15 mutant allele. The (GCN)15 mutation detected in this family has been described previously in families from Uruguay and Mexico as a founder effect. To our knowledge, this is the first report of an SA Afrikaner family with molecularly confirmed OPMD. The proband, a 64-year-old woman, presented to the neurology outpatient department at Steve Biko Academic Hospital, Pretoria. A sibship of 18 individuals was identified, of whom eight had OPMD. Four patients were interviewed and examined clinically, and electromyographic studies were performed. Molecular analysis of the PABPN1 gene was performed by polymerase chain reaction amplification and direct sequencing of exon 1 in three of the patients. Patients presented with ptosis, external ophthalmoplegia, dysphagia, dysarthria and mild proximal weakness. High foot arches and absent ankle reflexes raised the possibility of peripheral neuropathy, but electromyography showed only mildly low sensory amplitudes, and myopathic units in two patients.
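The equivalence stated above between (GCG)11(GCA)3GCG and (GCN)15 is simple arithmetic (11 + 3 + 1 = 15 codons of the form GCN, each encoding alanine). A small sketch that expands the notation and counts the codons is given below; the helper names `expand_repeat_notation` and `count_gcn_codons` are hypothetical and not taken from the paper.

```python
import re


def expand_repeat_notation(notation):
    """Expand a string such as '(GCG)11(GCA)3GCG' into plain nucleotides."""
    parts = []
    # Each match is either a parenthesised unit with a count, or a bare run.
    for unit, count, bare in re.findall(r"\(([ACGT]+)\)(\d+)|([ACGT]+)", notation):
        parts.append(unit * int(count) if unit else bare)
    return "".join(parts)


def count_gcn_codons(sequence):
    """Count in-frame codons of the form GCN (all four encode alanine)."""
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return sum(1 for codon in codons if re.fullmatch(r"GC[ACGT]", codon))


allele = expand_repeat_notation("(GCG)11(GCA)3GCG")
n_gcn = count_gcn_codons(allele)  # 15 GCN codons in a 45-nt tract
```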
Interaction of Cu(+) with cytosine and formation of i-motif-like C-M(+)-C complexes: alkali versus coinage metals.

PubMed

Gao, Juehan; Berden, Giel; Rodgers, M T; Oomens, Jos

2016-03-14

The Watson-Crick structure of DNA is among the most well-known molecular structures of our time. However, alternative base-pairing motifs are also known to occur, often depending on base sequence, pH, or the presence of cations. Pairing of cytosine (C) bases induced by the sharing of a single proton (C-H(+)-C) may give rise to the so-called i-motif, which occurs primarily in expanded trinucleotide repeats and the telomeric region of DNA, particularly at low pH. At physiological pH, silver cations were recently found to stabilize C dimers in a C-Ag(+)-C structure analogous to the hemiprotonated C-dimer. Here we use infrared ion spectroscopy in combination with density functional theory calculations at the B3LYP/6-311G+(2df,2p) level to show that copper in the 1+ oxidation state induces an analogous formation of C-Cu(+)-C structures. In contrast to protons and these transition metal ions, alkali metal ions induce a different dimer structure, where each ligand coordinates the alkali metal ion in a bidentate fashion in which the N3 and O2 atoms of both cytosine ligands coordinate to the metal ion, sacrificing hydrogen-bonding interactions between the ligands for improved chelation of the metal cation.
Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

PubMed

Panwar, Bharat; Raghava, Gajendra P S

2015-04-01

RNA-protein interactions play diverse roles in cells, so identifying the RNA-protein interface is essential for understanding their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made to identify protein-interacting nucleotides in RNAs. In order to discriminate protein-interacting and non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest and SVM(light)) for prediction model development using various features, and achieved a maximum of 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and 0.62 Matthews correlation coefficient with SVM(light)-based models. We observed that certain tri-nucleotides such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU are preferred in protein interaction. All the models have been developed using a non-redundant dataset and are evaluated using a five-fold cross-validation technique. A web server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/). Copyright © 2015 Elsevier Inc. All rights reserved.
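To illustrate what a tri-nucleotide composition profile might look like, here is a minimal sketch that computes the relative frequency of each overlapping tri-nucleotide in an RNA sequence. This is an assumption about the general idea only: the abstract does not specify the windowing or normalisation used by RNApin, and the function name `trinucleotide_composition` is hypothetical.

```python
from collections import Counter


def trinucleotide_composition(rna):
    """Return the relative frequency of each overlapping tri-nucleotide.

    DNA input is accepted and converted to the RNA alphabet (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    windows = [rna[i:i + 3] for i in range(len(rna) - 2)]
    total = len(windows)
    if total == 0:
        return {}
    return {tri: n / total for tri, n in Counter(windows).items()}


profile = trinucleotide_composition("ACACA")  # windows: ACA, CAC, ACA
```

A vector of such frequencies, ordered over all 64 tri-nucleotides, is one plausible input feature for the classifiers listed in the abstract.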
Somatic instability of the expanded allele of IT-15 from patients with Huntington disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stine, O.C.; Pleasant, N.; Ross, C.A.

1994-09-01

Huntington's disease (HD) is an inherited neurodegenerative disorder caused by an expanded trinucleotide repeat in the gene IT-15. Although the expanded allele of IT-15 is unstable during gametogenesis, particularly spermatogenesis, it is not clear whether there is somatic stability. There are two reports of stability and one of instability. In order to test whether somatic instability occurs in the expansions found in HD, we have compared amplified genomic DNA isolated from either blood or distinct regions of autopsied brains of persons with Huntington disease. We find that somatic variation occurs in at least two ways. First, in cases with longer repeats (n > 47), the cerebellum often (8 of 9 cases) has a smaller number of repeats (2 to 10 less) than other tested regions of the brain. The larger the expanded allele, the larger the reduction in size of the repeat in the cerebellum (r=0.94, p<0.0001, df=12). Second, regardless of the repeat size, the number of amplification products from genomic DNA isolated from the cerebellum is smaller than that from genomic DNA from other forebrain regions such as the dorsal parietal cortex. As the length of the expanded allele increases, the number of amplification products increases in either tissue (r=0.86, p<0.001, df=12). Therefore, our data demonstrate somatic instability, especially for longer repeats.
Investigating the relationship between FMR1 allele length and cognitive ability in children: a subtle effect of the normal allele range on the normal ability range?

PubMed

Loat, C S; Craig, G; Plomin, R; Craig, I W

2006-09-01

The FMR1 gene contains a trinucleotide repeat tract which can expand from a normal size of around 30 repeats to over 200 repeats, causing mental retardation (Fragile X Syndrome). Evidence suggests that premutation males (55-200 repeats) are susceptible to a late-onset tremor/ataxia syndrome and females to premature ovarian failure, and that intermediate alleles (approximately 41-55 repeats) and premutations may be in excess in samples with special educational needs. We explored the relationship between FMR1 allele length and cognitive ability in 621 low ability and control children assessed at 4 and 7 years, as well as 122 students with high IQ. The low and high ability and control samples showed no between-group differences in incidence of longer alleles. In males there was a significant negative correlation between allele length and non-verbal ability at 4 years (p = 0.048), academic achievement in maths (p = 0.003) and English (p = 0.011) at 7 years, and IQ in the high ability group (p = 0.018). There was a significant negative correlation between allele length and a standardised score for IQ and general cognitive ability at age 7 in the entire male sample (p = 0.002). This suggests that, within the normal spectrum of allele length, increased repeat numbers may have a limiting influence on cognitive performance.
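The allele categories quoted in this abstract can be written as a small lookup. The sketch below is illustrative only: the abstract gives overlapping ranges (intermediate "approximately 41-55", premutation "55-200"), so assigning exactly 55 repeats to the premutation class is an assumption made here, and the function name `classify_fmr1_allele` is hypothetical. It is not a diagnostic rule.

```python
def classify_fmr1_allele(cgg_repeats):
    """Map a CGG repeat count to the categories named in the abstract.

    Boundary handling is an assumption: 55 repeats is treated as a
    premutation, and anything below 41 as normal.
    """
    if cgg_repeats > 200:
        return "full mutation"
    if cgg_repeats >= 55:
        return "premutation"
    if cgg_repeats >= 41:
        return "intermediate"
    return "normal"


labels = [classify_fmr1_allele(n) for n in (30, 45, 55, 200, 201)]
```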
Genetics and child psychiatry: I Advances in quantitative and molecular genetics.

PubMed

Rutter, M; Silberg, J; O'Connor, T; Simonoff, E

1999-01-01

Advances in quantitative psychiatric genetics as a whole are reviewed with respect to conceptual and methodological issues in relation to statistical model fitting, new genetic designs, twin and adoptee studies, definition of the phenotype, pervasiveness of genetic influences, pervasiveness of environmental influences, shared and nonshared environmental effects, and nature-nurture interplay. Advances in molecular genetics are discussed in relation to the shifts in research strategies to investigate multifactorial disorders (affected relative linkage designs, association strategies, and quantitative trait loci studies); new techniques and identified genetic mechanisms (expansion of trinucleotide repeats, genomic imprinting, mitochondrial DNA, fluorescent in-situ hybridisation, behavioural phenotypes, and animal models); and the successful localisation of genes.
The heart and cardiac pacing in Steinert disease.

PubMed

Nigro, Gerardo; Papa, Andrea Antonio; Politano, Luisa

2012-10-01

Myotonic dystrophy (Dystrophia Myotonica, DM) is the most frequently inherited neuromuscular disease of adult life. It is a multisystemic disease with major cardiac involvement. Core features of myotonic dystrophy are myotonia, muscle weakness, cataract, respiratory failure and cardiac conduction abnormalities. Classical DM, first described by Steinert and called Steinert's disease or DM1 (Dystrophia Myotonica type 1) has been identified as an autosomal dominant disorder associated with the presence of an abnormal expansion of a CTG trinucleotide repeat in the 3' untranslated region of DMPK gene on chromosome 19. This review will mainly focus on the various aspects of cardiac involvement in DM1 patients and the current role of cardiac pacing in their treatment.
Human pluripotent stem cell models of Fragile X syndrome.

PubMed

Bhattacharyya, Anita; Zhao, Xinyu

2016-06-01

Fragile X syndrome (FXS) is the most common inherited cause of intellectual disability and autism. The causal mutation in FXS is a trinucleotide CGG repeat expansion in the FMR1 gene that leads to human specific epigenetic silencing and loss of Fragile X Mental Retardation Protein (FMRP) expression. Human pluripotent stem cells (PSCs), including human embryonic stem cells (ESCs) and particularly induced PSCs (iPSCs), offer a model system to reveal cellular and molecular events underlying human neuronal development and function in FXS. Human FXS PSCs have been established and have provided insight into the epigenetic silencing of the FMR1 gene as well as aspects of neuronal development. Copyright © 2015 Elsevier Inc. All rights reserved.
Structure and Dynamics of RNA Repeat Expansions That Cause Huntington's Disease and Myotonic Dystrophy Type 1.

PubMed

Chen, Jonathan L; VanEtten, Damian M; Fountain, Matthew A; Yildirim, Ilyas; Disney, Matthew D

2017-07-11

RNA repeat expansions cause a host of incurable, genetically defined diseases. The most common class of RNA repeats consists of trinucleotide repeats. These long, repeating transcripts fold into hairpins containing 1 × 1 internal loops that can mediate disease via a variety of mechanism(s) in which RNA is the central player. Two of these disorders are Huntington's disease and myotonic dystrophy type 1, which are caused by r(CAG) and r(CUG) repeats, respectively. We report the structures of two RNA constructs containing three copies of a r(CAG) [r(3×CAG)] or r(CUG) [r(3×CUG)] motif that were modeled with nuclear magnetic resonance spectroscopy and simulated annealing with restrained molecular dynamics. The 1 × 1 internal loops of r(3×CAG) are stabilized by one-hydrogen bond (cis Watson-Crick/Watson-Crick) AA pairs, while those of r(3×CUG) prefer one- or two-hydrogen bond (cis Watson-Crick/Watson-Crick) UU pairs. Assigned chemical shifts for the residues depended on the identity of neighbors or next nearest neighbors. Additional insights into the dynamics of these RNA constructs were gained by molecular dynamics simulations and a discrete path sampling method. Results indicate that the global structures of the RNA are A-form and that the loop regions are dynamic. The results will be useful for understanding the dynamic trajectory of these RNA repeats but also may aid in the development of therapeutics.
Anticipation in familial leukemia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Horwitz, M.; Jarvik, G.P.; Goode, E.L.

Anticipation refers to worsening severity or earlier age at onset with each generation for an inherited disease and primarily has been described for neurodegenerative illnesses resulting from expansion of trinucleotide repeats. We have tested for evidence of anticipation in familial leukemia. Of 49 affected individuals in nine families transmitting autosomal dominant acute myelogenous leukemia (AML), the mean age at onset is 57 years in the grandparental generation, 32 years in the parental generation, and 13 years in the youngest generation (P < .001). Of 21 parent-child pairs with AML, 19 show younger ages at onset in the child and demonstrate a mean decline in age at onset of 28 years (P < .001). Of 18 affected individuals from seven pedigrees with autosomal dominant chronic lymphocytic leukemia (CLL), the mean age at onset in the parental generation is 66 years versus 51 years in the youngest generation (P = .008). Of nine parent-child pairs with CLL, eight show younger ages at onset in the child and reveal a mean decline in age at onset of 21 years (P = .001). Inspection of rare pedigrees transmitting acute lymphocytic leukemia, chronic myelogenous leukemia, multiple types of leukemia, and lymphoma is also compatible with anticipation. Sampling bias is unlikely to explain these findings. This suggests that dynamic mutation of unstable DNA sequence repeats could be a common mechanism of inherited hematopoietic malignancy with implications for the role of somatic mutation in the more frequent sporadic cases. We speculate on three possible candidate genes for familial leukemia with anticipation: a locus on 21q22.1-22.2, CBL2 on 11q23.3, and CBFB or a nearby gene on 16q22. 55 refs., 4 figs.
Spinocerebellar Ataxia Type 6: Molecular Mechanisms and Calcium Channel Genetics.

PubMed

Du, Xiaofei; Gomez, Christopher Manuel

2018-01-01

Spinocerebellar ataxia (SCA) type 6 is an autosomal dominant disease that causes cerebellar degeneration. Clinically, it is characterized by pure cerebellar dysfunction, slowly progressive unsteadiness of gait and stance, slurred speech, and abnormal eye movements with late onset. Pathological findings of SCA6 include a diffuse loss of Purkinje cells, predominantly in the cerebellar vermis. Genetically, SCA6 is caused by expansion of a trinucleotide CAG repeat in the last exon of the longest isoform of the CACNA1A gene on chromosome 19p13.1-p13.2. Normal alleles have 4-18 repeats, while alleles causing disease contain 19-33 repeats. Due to the presence of a novel internal ribosomal entry site (IRES) within the mRNA, CACNA1A encodes two structurally unrelated proteins with distinct functions within an overlapping open reading frame (ORF) of the same mRNA: (1) the α1A subunit of the P/Q-type voltage-gated calcium channel; (2) α1ACT, a newly recognized transcription factor with a polyglutamine repeat at the C-terminal end. Understanding the function of α1ACT in physiological and pathological conditions may elucidate the pathogenesis of SCA6. More importantly, the IRES, as the translational control element of α1ACT, provides a potential therapeutic target for the treatment of SCA6.
A Simple, High-Throughput Assay for Fragile X Expanded Alleles Using Triple Repeat Primed PCR and Capillary Electrophoresis

PubMed Central

Lyon, Elaine; Laver, Thomas; Yu, Ping; Jama, Mohamed; Young, Keith; Zoccoli, Michael; Marlowe, Natalia

2010-01-01

Population screening has been proposed for Fragile X syndrome to identify premutation carrier females and affected newborns. We developed a PCR-based assay capable of quickly detecting the presence or absence of an expanded FMR1 allele with high sensitivity and specificity. This assay combines a triplet repeat primed PCR with high-throughput automated capillary electrophoresis. We evaluated assay performance using archived samples sent for Fragile X diagnostic testing representing a range of Fragile X CGG-repeat expansions. Two hundred five previously genotyped samples were tested with the new assay. Data were analyzed for the presence of a trinucleotide “ladder” extending beyond 55 repeats, which was set as a cut-off to identify expanded FMR1 alleles. We identified expanded FMR1 alleles in 132 samples (59 premutation, 71 full mutation, 2 mosaics) and normal FMR1 alleles in 73 samples. We found 100% concordance with previous results from PCR and Southern blot analyses. In addition, we show feasibility of using this assay with DNA extracted from dried-blood spots. Using a single PCR combined with high-throughput fragment analysis on the automated capillary electrophoresis instrument, we developed a rapid and reproducible PCR-based laboratory assay that meets many of the requirements for a first-tier test for population screening. PMID:20431035
Lack of expansion of triplet repeats in the FMR1, FRAXE, and FRAXF loci in male multiplex families with autism and pervasive developmental disorders

DOE Office of Scientific and Technical Information (OSTI.GOV)

Holden, J.J.A.; Julien-Inalsingh, C.; Wing, M.

Sib, twin, and family studies have shown that a genetic cause exists in many cases of autism, with a portion of cases associated with a fragile X chromosome. Three folate-sensitive fragile sites in the Xq27→Xq28 region have been cloned and found to have polymorphic trinucleotide repeats at the respective sites; these repeats are amplified and methylated in individuals who are positive for the different fragile sites. We have tested affected boys and their mothers from 19 families with two autistic/PDD boys for amplification and/or instability of the triplet repeats at these loci and concordance of inheritance of alleles by affected brothers. In all cases, the triplet repeat numbers were within the normal range, with no individuals having expanded or premutation-size alleles. For each locus, there was no evidence for an increased frequency of concordance, indicating that mutations within these genes are unlikely to be responsible for the autistic/PDD phenotypes in the affected boys. Thus, we think it is important to retest those autistic individuals who were cytogenetically positive for a fragile X chromosome, particularly cases where there is no family history of the fragile X syndrome, using the more accurate DNA-based testing procedures. 29 refs., 1 fig., 1 tab.
Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens

PubMed Central

Reynolds, Sheila M.; Bilmes, Jeff A.; Noble, William Stafford

2010-01-01

DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model. PMID:20628623

Phenotypic characterization of individuals with 30-40 CAG repeats in the Huntington disease (HD) gene reveals HD cases with 36 repeats and apparently normal elderly individuals with 36-39 repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rubinsztein, D.C.; Leggo, J.; Whittaker, J.L.

1996-07-01

Abnormal CAG expansions in the IT-15 gene are associated with Huntington disease (HD). In the diagnostic setting it is necessary to define the limits of the CAG size ranges on normal and HD-associated chromosomes. Most large analyses that defined the limits of the normal and pathological size ranges employed PCR assays, which included the CAG repeats and a CCG repeat tract that was thought to be invariant. Many of these experiments found an overlap between the normal and disease size ranges. Subsequent findings that the CCG repeats vary by 9 trinucleotide lengths suggested that the limits of the normal and disease size ranges should be reevaluated with assays that exclude the CCG polymorphism. Since patients with between 30 and 40 repeats are rare, a consortium was assembled to collect such individuals. All 178 samples were reanalyzed in Cambridge by using assays specific for the CAG repeats. We have optimized methods for reliable sizing of CAG repeats and show cases that demonstrate the dangers of using PCR assays that include both the CAG and CCG polymorphisms. Seven HD patients had 36 repeats, which confirms that this allele is associated with disease. Individuals without apparent symptoms or signs of HD were found at 36 repeats (aged 74, 78, 79, and 87 years), 37 repeats (aged 69 years), 38 repeats (aged 69 and 90 years), and 39 repeats (aged 67, 90, and 95 years). The detailed case history of an exceptional case from this series will be presented: a 95-year-old man with 39 repeats who did not have classical features of HD. The apparently healthy survival into old age of some individuals with 36-39 repeats suggests that the HD mutation may not always be fully penetrant. 26 refs., 3 figs., 1 tab.
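As a rough computational analogue of CAG-specific sizing (the study itself used PCR assays, not sequence parsing), the sketch below counts the longest uninterrupted CAG tract in a sequence string, so that an adjacent CCG tract does not inflate the count. The function name and the example sequence are hypothetical and serve only as an illustration.

```python
import re

def longest_cag_run(seq: str) -> int:
    """Return the number of units in the longest uninterrupted CAG tract.

    Hypothetical helper for illustration; only exact, uninterrupted CAG
    units are counted, so a neighbouring CCG tract is excluded.
    """
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

# Made-up example: 36 CAG units followed by a 7-unit CCG tract.
example = "ATG" + "CAG" * 36 + "CCG" * 7
print(longest_cag_run(example))
```

Counting only the CAG units mirrors the abstract's point that assays spanning both tracts can misplace an allele relative to the 30 to 40 repeat range.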
The complete plastome sequence of Rubus takesimensis endemic to Ulleung Island, Korea: Insights into molecular evolution of anagenetically derived species in Rubus (Rosaceae).

PubMed

Yang, Ji Young; Pak, Jae-Hong; Kim, Seung-Chul

2018-08-20

Previous phylogenetic studies have suggested that Rubus takesimensis (Rosaceae), which is endemic to Ulleung Island, Korea, is closely related to R. crataegifolius, which is broadly distributed across East Asia. A recent phylogeographic study also suggested the possible polyphyletic origins of R. takesimensis from multiple source populations of its continental progenitor R. crataegifolius in China, Japan, Korea, and the Russian Far East. However, even though the progenitor-derivative relationship between R. crataegifolius and R. takesimensis has been established, little is known about the chloroplast genome (i.e., plastome) evolution of anagenetically derived species on oceanic islands and their continental progenitor species. In the present study, we characterized the complete plastome of R. takesimensis and compared it to those of R. crataegifolius and four other Rubus species. The R. takesimensis plastome was 155,760 base pairs (bp) long, a total of 46 bp longer than the plastome of R. crataegifolius (28 from LSC and 18 from SSC). No structural or content rearrangements were found between the species pairs. Four highly variable intergenic regions (rpl32/trnL, rps4/trnT, trnT/trnL, and psbZ/trnG) were identified between R. takesimensis and R. crataegifolius. Compared to the plastomes of other congeneric species (R. corchorifolius, R. fockeanus, and R. niveus), six highly variable intergenic regions (ndhC/psaC, rps16/trnQ, trnK/rps16, trnL/trnF, trnM/atpE, and trnQ/psbK) were also identified. A total of 116 simple sequence repeats (SSRs), including 48 mononucleotide, 64 dinucleotide, and four trinucleotide repeat motifs were characterized in R. takesimensis. The plastome resources generated by the present study will help to elucidate plastome evolution within the genus and to resolve phylogenetic relationships within highly complex and reticulated lineages. Phylogenetic analysis supported both the monophyly of Rubus and the sister relationship between R. crataegifolius and R. takesimensis. Copyright © 2018. Published by Elsevier B.V.
Comparative analysis of microsatellites in five different antagonistic Trichoderma species for diversity assessment.

PubMed

Rai, Shalini; Kashyap, Prem Lal; Kumar, Sudheer; Srivastava, Alok Kumar; Ramteke, Pramod W

2016-01-01

Microsatellites provide an ideal molecular marker system to screen, characterize and evaluate the genetic diversity of several fungal species. Currently, there is very limited information on the genetic diversity of antagonistic Trichoderma species as determined using a range of molecular markers. In this study, expressed and whole genome sequences available in public databases were used to investigate the occurrence, relative abundance and relative density of SSRs in five different antagonistic Trichoderma species: Trichoderma atroviride, T. harzianum, T. reesei, T. virens and T. asperellum. Fifteen SSR loci were used to evaluate the genetic diversity of twenty isolates of Trichoderma spp. from different geographical regions of India. Results indicated that relative abundance and relative density of SSRs were higher in T. asperellum followed by T. reesei and T. atroviride. Tri-nucleotide repeats (80.2%) were invariably the most abundant in all species. The abundance and relative density of SSRs were not influenced by genome size or GC content. Out of eighteen primer sets, only 15 primer pairs showed successful amplification in all the test species. A total of 24 alleles were detected, and five loci were highly informative, with polymorphism information content values greater than 0.40. These markers provide useful information on genetic diversity and population genetic structure, which, in turn, can be exploited for establishing a conservation strategy for antagonistic Trichoderma isolates.
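To make the terms "relative abundance" and "relative density" concrete, here is a minimal sketch of an SSR scan. It assumes the commonly used definitions (SSR count per Mb and SSR base pairs per Mb of sequence analysed) and hypothetical minimum repeat thresholds; neither the thresholds nor the function names come from the study, which does not state its search parameters in the abstract.

```python
import re

# Hypothetical thresholds: minimum number of units per motif length.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5}

def find_ssrs(seq, min_repeats=None):
    """Return (start, unit, n_units) for perfect mono-, di- and tri-nucleotide SSRs."""
    min_repeats = min_repeats or DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are repeats of a shorter motif (e.g. "AA").
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

def ssr_summary(seq):
    """Count SSRs and normalise by sequence length in megabases."""
    if not seq:
        raise ValueError("empty sequence")
    hits = find_ssrs(seq)
    mb = len(seq) / 1e6
    ssr_bp = sum(len(unit) * n for _, unit, n in hits)
    return {
        "count": len(hits),
        "relative_abundance": len(hits) / mb,  # SSRs per Mb
        "relative_density": ssr_bp / mb,       # SSR bp per Mb
    }

# Made-up toy sequence with one mono-, one di- and one tri-nucleotide SSR.
toy = "A" * 12 + "GC" + "AT" * 6 + "CC" + "CAG" * 5 + "T"
print(find_ssrs(toy))
```

Normalising by sequence length is what allows genomes of different sizes to be compared, which is the comparison the abstract reports across the five species.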
Microsatellite primers in the white proteas (Protea section Exsertae, Proteaceae), a rapidly radiating lineage.

PubMed

Prunier, Rachel; Latimer, Andrew

2010-01-01

Microsatellite primers were developed in the South African sclerophyllous shrub Protea punctata to investigate the degree of population differentiation within and between P. punctata and closely related species. Ten primer pairs were identified from three individuals of Protea punctata. The primers amplified di- and tri-nucleotide repeats. Across all P. punctata samples, the loci have 8-49 alleles. All primers also amplified in Protea section Exsertae (P. aurea, P. aurea subsp. potbergensis, P. mundii, P. venusta, P. lacticolor, and P. subvestita). The loci had 14-69 alleles across the subgenus. These results show the broad utility of microsatellite loci for future studies of population genetics in the white proteas and their potential utility across the entire genus.
"iSS-Hyb-mRMR": Identification of splicing sites using hybrid space of pseudo trinucleotide and pseudo tetranucleotide composition.

PubMed

Iqbal, Muhammad; Hayat, Maqsood

2016-05-01

Gene splicing is a vital source of protein diversity. Precise removal of introns and joining of exons is a central task in eukaryotic gene expression, as exons are usually interrupted by introns. Identification of splicing sites through experimental techniques is a complicated and time-consuming task. With the avalanche of genome sequences generated in the post-genomic age, it remains a complicated and challenging task to develop an automatic, robust and reliable computational method for fast and effective identification of splicing sites. In this study, a hybrid model, "iSS-Hyb-mRMR", is proposed for quick and accurate identification of splicing sites. Two sample representation methods, namely pseudo trinucleotide composition (PseTNC) and pseudo tetranucleotide composition (PseTetraNC), were used to extract numerical descriptors from DNA sequences. The hybrid model was developed by concatenating PseTNC and PseTetraNC. In order to select highly discriminative features, the minimum redundancy maximum relevance algorithm was applied to the hybrid feature space. The performance of these feature representation methods was tested using various classification algorithms including K-nearest neighbor, probabilistic neural network, general regression neural network, and fitting network. The jackknife test was used to evaluate performance on two benchmark datasets, S1 and S2. The predictor proposed in the current study achieved an accuracy of 93.26%, sensitivity of 88.77%, and specificity of 97.78% for S1, and an accuracy of 94.12%, sensitivity of 87.14%, and specificity of 98.64% for S2. The performance of the proposed model is higher than that of existing methods in the literature so far, and it should be useful for studying the mechanism of RNA splicing and in other areas of research. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
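The feature-extraction idea can be sketched with plain trinucleotide composition, i.e. the 64 normalised 3-mer frequencies of a sequence. This is only the basic composition component, offered as an assumption-laden illustration: the "pseudo" terms of PseTNC add sequence-order information whose exact definition is not given in the abstract, so they are not reproduced here. All names are hypothetical.

```python
from itertools import product

# All 64 trinucleotides in a fixed (lexicographic) order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]

def trinucleotide_composition(seq):
    """Return a 64-dimensional vector of overlapping 3-mer frequencies.

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in TRINUCLEOTIDES]

vec = trinucleotide_composition("ACGTACGT")
print(len(vec), vec[TRINUCLEOTIDES.index("ACG")])
```

A tetranucleotide analogue would use `repeat=4` and 4-base windows (256 features); concatenating the two vectors gives a hybrid feature space of the kind the abstract describes before feature selection.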
Characterization of conservative somatic instability of the CAG repeat region in Huntington's disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schaefer, F.V.; Calikoglu, A.S.; Whetsell, L.H.

1994-09-01

Instability and enlargement of a CAG repeat region at the beginning of the huntingtin gene (IT-15) has been linked with Huntington's disease. The CAG repeat size shows a highly significant correlation with age of onset of clinical features in individuals with 40 or more repeats who have Huntington disease. The clinical status of nonsymptomatic individuals with 30 to 39 CAG repeats is considered ambiguous. In order to define more carefully the nature of the HD expansion instability, we examined patients in our HD population using a discriminating fluorescence-based PCR approach. The degree of somatic mutation increases with both earlier age of onset and the size of the inherited allele. A single prominent band one repeat larger than the index peak was typical in individuals with 40-41 CAG repeats. Three to four larger bands are typically discerned in individuals with 50 or more repeats. In an extreme example, an individual with approximately 95 repeats had at least 8 prominent bands. Plotting the degree of somatic mutation relative to the size of the HD allele shows that somatic mutation activity increases with size. By this approach, the primary allele represents 40-60% of the alleles at a 40-41 CAG repeat HD locus. In contrast, the primary allele represents a relatively minor proportion of the total alleles for expansions greater than 50 CAG repeats (10-20%). The limited range of somatic mutation suggests that the instability is restricted to very early stages of embryogenesis before tissue development diverges or that persistent somatic instability occurs at a slow rate. Therefore, somatic instability in Huntington's disease has properties both in common with and different from those found in other trinucleotide repeat expansion diseases such as myotonic muscular dystrophy and fragile X syndrome.
The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nylund, Stian; Karlsen, Marius; Nylund, Are

2008-03-30

The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae.
CAG repeat expansion in Huntington disease determines age at onset in a fully dominant fashion

PubMed Central

Lee, J.-M.; Ramos, E.M.; Lee, J.-H.; Gillis, T.; Mysore, J.S.; Hayden, M.R.; Warby, S.C.; Morrison, P.; Nance, M.; Ross, C.A.; Margolis, R.L.; Squitieri, F.; Orobello, S.; Di Donato, S.; Gomez-Tortosa, E.; Ayuso, C.; Suchowersky, O.; Trent, R.J.A.; McCusker, E.; Novelletto, A.; Frontali, M.; Jones, R.; Ashizawa, T.; Frank, S.; Saint-Hilaire, M.H.; Hersch, S.M.; Rosas, H.D.; Lucente, D.; Harrison, M.B.; Zanko, A.; Abramson, R.K.; Marder, K.; Sequeiros, J.; Paulsen, J.S.; Landwehrmeyer, G.B.; Myers, R.H.; MacDonald, M.E.; Durr, Alexandra; Rosenblatt, Adam; Frati, Luigi; Perlman, Susan; Conneally, Patrick M.; Klimek, Mary Lou; Diggin, Melissa; Hadzi, Tiffany; Duckett, Ayana; Ahmed, Anwar; Allen, Paul; Ames, David; Anderson, Christine; Anderson, Karla; Anderson, Karen; Andrews, Thomasin; Ashburner, John; Axelson, Eric; Aylward, Elizabeth; Barker, Roger A.; Barth, Katrin; Barton, Stacey; Baynes, Kathleen; Bea, Alexandra; Beall, Erik; Beg, Mirza Faisal; Beglinger, Leigh J.; Biglan, Kevin; Bjork, Kristine; Blanchard, Steve; Bockholt, Jeremy; Bommu, Sudharshan Reddy; Brossman, Bradley; Burrows, Maggie; Calhoun, Vince; Carlozzi, Noelle; Chesire, Amy; Chiu, Edmond; Chua, Phyllis; Connell, R.J.; Connor, Carmela; Corey-Bloom, Jody; Craufurd, David; Cross, Stephen; Cysique, Lucette; Santos, Rachelle Dar; Davis, Jennifer; Decolongon, Joji; DiPietro, Anna; Doucette, Nicholas; Downing, Nancy; Dudler, Ann; Dunn, Steve; Ecker, Daniel; Epping, Eric A.; Erickson, Diane; Erwin, Cheryl; Evans, Ken; Factor, Stewart A.; Farias, Sarah; Fatas, Marta; Fiedorowicz, Jess; Fullam, Ruth; Furtado, Sarah; Garde, Monica Bascunana; Gehl, Carissa; Geschwind, Michael D.; Goh, Anita; Gooblar, Jon; Goodman, Anna; Griffith, Jane; Groves, Mark; Guttman, Mark; Hamilton, Joanne; Harrington, Deborah; Harris, Greg; Heaton, Robert K.; Helmer, Karl; Henneberry, Machelle; Hershey, Tamara; Herwig, Kelly; Howard, Elizabeth; Hunter, Christine; Jankovic, Joseph; Johnson, Hans; Johnson, Arik; Jones, Kathy; Juhl, 
Andrew; Kim, Eun Young; Kimble, Mycah; King, Pamela; Klimek, Mary Lou; Klöppel, Stefan; Koenig, Katherine; Komiti, Angela; Kumar, Rajeev; Langbehn, Douglas; Leavitt, Blair; Leserman, Anne; Lim, Kelvin; Lipe, Hillary; Lowe, Mark; Magnotta, Vincent A.; Mallonee, William M.; Mans, Nicole; Marietta, Jacquie; Marshall, Frederick; Martin, Wayne; Mason, Sarah; Matheson, Kirsty; Matson, Wayne; Mazzoni, Pietro; McDowell, William; Miedzybrodzka, Zosia; Miller, Michael; Mills, James; Miracle, Dawn; Montross, Kelsey; Moore, David; Mori, Sasumu; Moser, David J.; Moskowitz, Carol; Newman, Emily; Nopoulos, Peg; Novak, Marianne; O'Rourke, Justin; Oakes, David; Ondo, William; Orth, Michael; Panegyres, Peter; Pease, Karen; Perlman, Susan; Perlmutter, Joel; Peterson, Asa; Phillips, Michael; Pierson, Ron; Potkin, Steve; Preston, Joy; Quaid, Kimberly; Radtke, Dawn; Rae, Daniela; Rao, Stephen; Raymond, Lynn; Reading, Sarah; Ready, Rebecca; Reece, Christine; Reilmann, Ralf; Reynolds, Norm; Richardson, Kylie; Rickards, Hugh; Ro, Eunyoe; Robinson, Robert; Rodnitzky, Robert; Rogers, Ben; Rosenblatt, Adam; Rosser, Elisabeth; Rosser, Anne; Price, Kathy; Price, Kathy; Ryan, Pat; Salmon, David; Samii, Ali; Schumacher, Jamy; Schumacher, Jessica; Sendon, Jose Luis Lópenz; Shear, Paula; Sheinberg, Alanna; Shpritz, Barnett; Siedlecki, Karen; Simpson, Sheila A.; Singer, Adam; Smith, Jim; Smith, Megan; Smith, Glenn; Snyder, Pete; Song, Allen; Sran, Satwinder; Stephan, Klaas; Stober, Janice; Sü?muth, Sigurd; Suter, Greg; Tabrizi, Sarah; Tempkin, Terry; Testa, Claudia; Thompson, Sean; Thomsen, Teri; Thumma, Kelli; Toga, Arthur; Trautmann, Sonja; Tremont, Geoff; Turner, Jessica; Uc, Ergun; Vaccarino, Anthony; van Duijn, Eric; Van Walsem, Marleen; Vik, Stacie; Vonsattel, Jean Paul; Vuletich, Elizabeth; Warner, Tom; Wasserman, Paula; Wassink, Thomas; Waterman, Elijah; Weaver, Kurt; Weir, David; Welsh, Claire; Werling-Witkoske, Chris; Wesson, Melissa; Westervelt, Holly; Weydt, Patrick; Wheelock, Vicki; 
Williams, Kent; Williams, Janet; Wodarski, Mary; Wojcieszek, Joanne; Wood, Jessica; Wood-Siverio, Cathy; Wu, Shuhua; Yastrubetskaya, Olga; de Yebenes, Justo Garcia; Zhao, Yong Qiang; Zimbelman, Janice; Zschiegner, Roland; Aaserud, Olaf; Abbruzzese, Giovanni; Andrews, Thomasin; Andrich, Jurgin; Antczak, Jakub; Arran, Natalie; Artiga, Maria J. Saiz; Bachoud-Lévi, Anne-Catherine; Banaszkiewicz, Krysztof; di Poggio, Monica Bandettini; Bandmann, Oliver; Barbera, Miguel A.; Barker, Roger A.; Barrero, Francisco; Barth, Katrin; Bas, Jordi; Beister, Antoine; Bentivoglio, Anna Rita; Bertini, Elisabetta; Biunno, Ida; Bjørgo, Kathrine; Bjørnevoll, Inga; Bohlen, Stefan; Bonelli, Raphael M.; Bos, Reineke; Bourne, Colin; Bradbury, Alyson; Brockie, Peter; Brown, Felicity; Bruno, Stefania; Bryl, Anna; Buck, Andrea; Burg, Sabrina; Burgunder, Jean-Marc; Burns, Peter; Burrows, Liz; Busquets, Nuria; Busse, Monica; Calopa, Matilde; Carruesco, Gemma T.; Casado, Ana Gonzalez; Catena, Judit López; Chu, Carol; Ciesielska, Anna; Clapton, Jackie; Clayton, Carole; Clenaghan, Catherine; Coelho, Miguel; Connemann, Julia; Craufurd, David; Crooks, Jenny; Cubillo, Patricia Trigo; Cubo, Esther; Curtis, Adrienne; De Michele, Giuseppe; De Nicola, A.; de Souza, Jenny; de Weert, A. Marit; de Yébenes, Justo Garcia; Dekker, M.; Descals, A. 
Martínez; Di Maio, Luigi; Di Pietro, Anna; Dipple, Heather; Dose, Matthias; Dumas, Eve M.; Dunnett, Stephen; Ecker, Daniel; Elifani, F.; Ellison-Rose, Lynda; Elorza, Marina D.; Eschenbach, Carolin; Evans, Carole; Fairtlough, Helen; Fannemel, Madelein; Fasano, Alfonso; Fenollar, Maria; Ferrandes, Giovanna; Ferreira, Jaoquim J.; Fillingham, Kay; Finisterra, Ana Maria; Fisher, K.; Fletcher, Amy; Foster, Jillian; Foustanos, Isabella; Frech, Fernando A.; Fullam, Robert; Fullham, Ruth; Gago, Miguel; García, RocioGarcía-Ramos; García, Socorro S.; Garrett, Carolina; Gellera, Cinzia; Gill, Paul; Ginestroni, Andrea; Golding, Charlotte; Goodman, Anna; Gørvell, Per; Grant, Janet; Griguoli, A.; Gross, Diana; Guedes, Leonor; BascuñanaGuerra, Monica; Guerra, Maria Rosalia; Guerrero, Rosa; Guia, Dolores B.; Guidubaldi, Arianna; Hallam, Caroline; Hamer, Stephanie; Hammer, Kathrin; Handley, Olivia J.; Harding, Alison; Hasholt, Lis; Hedge, Reikha; Heiberg, Arvid; Heinicke, Walburgis; Held, Christine; Hernanz, Laura Casas; Herranhof, Briggitte; Herrera, Carmen Durán; Hidding, Ute; Hiivola, Heli; Hill, Susan; Hjermind, Lena. 
E.; Hobson, Emma; Hoffmann, Rainer; Holl, Anna Hödl; Howard, Liz; Hunt, Sarah; Huson, Susan; Ialongo, Tamara; Idiago, Jesus Miguel R.; Illmann, Torsten; Jachinska, Katarzyna; Jacopini, Gioia; Jakobsen, Oda; Jamieson, Stuart; Jamrozik, Zygmunt; Janik, Piotr; Johns, Nicola; Jones, Lesley; Jones, Una; Jurgens, Caroline K.; Kaelin, Alain; Kalbarczyk, Anna; Kershaw, Ann; Khalil, Hanan; Kieni, Janina; Klimberg, Aneta; Koivisto, Susana P.; Koppers, Kerstin; Kosinski, Christoph Michael; Krawczyk, Malgorzata; Kremer, Berry; Krysa, Wioletta; Kwiecinski, Hubert; Lahiri, Nayana; Lambeck, Johann; Lange, Herwig; Laver, Fiona; Leenders, K.L.; Levey, Jamie; Leythaeuser, Gabriele; Lezius, Franziska; Llesoy, Joan Roig; Löhle, Matthias; López, Cristobal Diez-Aja; Lorenza, Fortuna; Loria, Giovanna; Magnet, Markus; Mandich, Paola; Marchese, Roberta; Marcinkowski, Jerzy; Mariotti, Caterina; Mariscal, Natividad; Markova, Ivana; Marquard, Ralf; Martikainen, Kirsti; Martínez, Isabel Haro; Martínez-Descals, Asuncion; Martino, T.; Mason, Sarah; McKenzie, Sue; Mechi, Claudia; Mendes, Tiago; Mestre, Tiago; Middleton, Julia; Milkereit, Eva; Miller, Joanne; Miller, Julie; Minster, Sara; Möller, Jens Carsten; Monza, Daniela; Morales, Blas; Moreau, Laura V.; Moreno, Jose L. 
López-Sendón; Münchau, Alexander; Murch, Ann; Nielsen, Jørgen E.; Niess, Anke; Nørremølle, Anne; Novak, Marianne; O'Donovan, Kristy; Orth, Michael; Otti, Daniela; Owen, Michael; Padieu, Helene; Paganini, Marco; Painold, Annamaria; Päivärinta, Markku; Partington-Jones, Lucy; Paterski, Laurent; Paterson, Nicole; Patino, Dawn; Patton, Michael; Peinemann, Alexander; Peppa, Nadia; Perea, Maria Fuensanta Noguera; Peterson, Maria; Piacentini, Silvia; Piano, Carla; Càrdenas, Regina Pons i; Prehn, Christian; Price, Kathleen; Probst, Daniela; Quarrell, Oliver; Quiroga, Purificacion Pin; Raab, Tina; Rakowicz, Maryla; Raman, Ashok; Raymond, Lucy; Reilmann, Ralf; Reinante, Gema; Reisinger, Karin; Retterstol, Lars; Ribaï, Pascale; Riballo, Antonio V.; Ribas, Guillermo G.; Richter, Sven; Rickards, Hugh; Rinaldi, Carlo; Rissling, Ida; Ritchie, Stuart; Rivera, Susana Vázquez; Robert, Misericordia Floriach; Roca, Elvira; Romano, Silvia; Romoli, Anna Maria; Roos, Raymond A.C.; Røren, Niini; Rose, Sarah; Rosser, Elisabeth; Rosser, Anne; Rossi, Fabiana; Rothery, Jean; Rudzinska, Monika; Ruíz, Pedro J. 
García; Ruíz, Belan Garzon; Russo, Cinzia Valeria; Ryglewicz, Danuta; Saft, Carston; Salvatore, Elena; Sánchez, Vicenta; Sando, Sigrid Botne; Šašinková, Pavla; Sass, Christian; Scheibl, Monika; Schiefer, Johannes; Schlangen, Christiane; Schmidt, Simone; Schöggl, Helmut; Schrenk, Caroline; Schüpbach, Michael; Schuierer, Michele; Sebastián, Ana Rojo; Selimbegovic-Turkovic, Amina; Sempolowicz, Justyna; Silva, Mark; Sitek, Emilia; Slawek, Jaroslaw; Snowden, Julie; Soleti, Francesco; Soliveri, Paola; Sollom, Andrea; Soltan, Witold; Sorbi, Sandro; Sorensen, Sven Asger; Spadaro, Maria; Städtler, Michael; Stamm, Christiane; Steiner, Tanja; Stokholm, Jette; Stokke, Bodil; Stopford, Cheryl; Storch, Alexander; Straßburger, Katrin; Stubbe, Lars; Sulek, Anna; Szczudlik, Andrzej; Tabrizi, Sarah; Taylor, Rachel; Terol, Santiago Duran-Sindreu; Thomas, Gareth; Thompson, Jennifer; Thomson, Aileen; Tidswell, Katherine; Torres, Maria M. Antequera; Toscano, Jean; Townhill, Jenny; Trautmann, Sonja; Tucci, Tecla; Tuuha, Katri; Uhrova, Tereza; Valadas, Anabela; van Hout, Monique S.E.; van Oostrom, J.C.H.; van Vugt, Jeroen P.P.; vanm, Walsem Marleen R.; Vandenberghe, Wim; Verellen-Dumoulin, Christine; Vergara, Mar Ruiz; Verstappen, C.C.P.; Verstraelen, Nichola; Viladrich, Celia Mareca; Villanueva, Clara; Wahlström, Jan; Warner, Thomas; Wehus, Raghild; Weindl, Adolf; Werner, Cornelius J.; Westmoreland, Leann; Weydt, Patrick; Wiedemann, Alexandra; Wild, Edward; Wild, Sue; Witjes-Ané, Marie-Noelle; Witkowski, Grzegorz; Wójcik, Magdalena; Wolz, Martin; Wolz, Annett; Wright, Jan; Yardumian, Pam; Yates, Shona; Yudina, Elizaveta; Zaremba, Jacek; Zaugg, Sabine W.; Zdzienicka, Elzbieta; Zielonka, Daniel; Zielonka, Euginiusz; Zinzi, Paola; Zittel, Simone; Zucker, Birgrit; Adams, John; Agarwal, Pinky; Antonijevic, Irina; Beck, Christopher; Chiu, Edmond; Churchyard, Andrew; Colcher, Amy; Corey-Bloom, Jody; Dorsey, Ray; Drazinic, Carolyn; Dubinsky, Richard; Duff, Kevin; Factor, Stewart; Foroud, 
Tatiana; Furtado, Sarah; Giuliano, Joe; Greenamyre, Timothy; Higgins, Don; Jankovic, Joseph; Jennings, Dana; Kang, Un Jung; Kostyk, Sandra; Kumar, Rajeev; Leavitt, Blair; LeDoux, Mark; Mallonee, William; Marshall, Frederick; Mohlo, Eric; Morgan, John; Oakes, David; Panegyres, Peter; Panisset, Michel; Perlman, Susan; Perlmutter, Joel; Quaid, Kimberly; Raymond, Lynn; Revilla, Fredy; Robertson, Suzanne; Robottom, Bradley; Sanchez-Ramos, Juan; Scott, Burton; Shannon, Kathleen; Shoulson, Ira; Singer, Carlos; Tabbal, Samer; Testa, Claudia; van, Kammen Dan; Vetter, Louise; Walker, Francis; Warner, John; Weiner, illiam; Wheelock, Vicki; Yastrubetskaya, Olga; Barton, Stacey; Broyles, Janice; Clouse, Ronda; Coleman, Allison; Davis, Robert; Decolongon, Joji; DeLaRosa, Jeanene; Deuel, Lisa; Dietrich, Susan; Dubinsky, Hilary; Eaton, Ken; Erickson, Diane; Fitzpatrick, Mary Jane; Frucht, Steven; Gartner, Maureen; Goldstein, Jody; Griffith, Jane; Hickey, Charlyne; Hunt, Victoria; Jaglin, Jeana; Klimek, Mary Lou; Lindsay, Pat; Louis, Elan; Loy, Clemet; Lucarelli, Nancy; Malarick, Keith; Martin, Amanda; McInnis, Robert; Moskowitz, Carol; Muratori, Lisa; Nucifora, Frederick; O'Neill, Christine; Palao, Alicia; Peavy, Guerry; Quesada, Monica; Schmidt, Amy; Segro, Vicki; Sperin, Elaine; Suter, Greg; Tanev, Kalo; Tempkin, Teresa; Thiede, Curtis; Wasserman, Paula; Welsh, Claire; Wesson, Melissa; Zauber, Elizabeth

2012-01-01

Objective: Age at onset of diagnostic motor manifestations in Huntington disease (HD) is strongly correlated with an expanded CAG trinucleotide repeat. The length of the normal CAG repeat allele has been reported also to influence age at onset, in interaction with the expanded allele. Due to profound implications for disease mechanism and modification, we tested whether the normal allele, interaction between the expanded and normal alleles, or presence of a second expanded allele affects age at onset of HD motor signs. Methods: We modeled natural log-transformed age at onset as a function of CAG repeat lengths of expanded and normal alleles and their interaction by linear regression. Results: An apparently significant effect of interaction on age at motor onset among 4,068 subjects was dependent on a single outlier data point. A rigorous statistical analysis with a well-behaved dataset that conformed to the fundamental assumptions of linear regression (e.g., constant variance and normally distributed error) revealed significance only for the expanded CAG repeat, with no effect of the normal CAG repeat. Ten subjects with 2 expanded alleles showed an age at motor onset consistent with the length of the larger expanded allele. Conclusions: Normal allele CAG length, interaction between expanded and normal alleles, and presence of a second expanded allele do not influence age at onset of motor manifestations, indicating that the rate of HD pathogenesis leading to motor diagnosis is determined by a completely dominant action of the longest expanded allele and as yet unidentified genetic or environmental factors. Neurology® 2012;78:690–695 PMID:22323755
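The regression described in the Methods can be written out as follows. This is a reading of the abstract, not the authors' published notation: the symbols AO (age at motor onset), E (expanded allele CAG length), N (normal allele CAG length) and the coefficient names are assumptions introduced here.

```latex
\ln(\mathrm{AO}_i) = \beta_0 + \beta_1 E_i + \beta_2 N_i + \beta_3 \,(E_i \times N_i) + \varepsilon_i ,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
```

In these terms, the reported result is that only the expanded-allele coefficient $\beta_1$ remained significant once the outlier was addressed, while the normal-allele term $\beta_2$ and the interaction term $\beta_3$ showed no effect.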
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

PubMed

Murray, Vincent; Chen, Jon K; Tanaka, Mark M

2016-07-01

The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
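To make the consensus 5'-RTGT*AY concrete, the following minimal sketch locates candidate hexanucleotides in a sequence string. The function name `find_rtgtay_sites` and the example sequence are hypothetical; the sketch scans only the strand as given and says nothing about actual cleavage intensity.

```python
import re

# IUPAC codes used in the abstract: R = purine (A or G), Y = pyrimidine (C or T).
# A lookahead is used so that overlapping matches would also be reported.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")

def find_rtgtay_sites(seq):
    """Return (motif_start, marked_t_index, hexamer) for each 5'-RTGT*AY match.

    Indices are 0-based on the strand as given. marked_t_index is the position
    of the T written immediately before the asterisk in 5'-RTGT*AY.
    """
    seq = seq.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        start = match.start(1)
        hits.append((start, start + 3, match.group(1)))
    return hits

# Hypothetical example sequence containing two matches.
sites = find_rtgtay_sites("CCATGTACGGGTGTATTT")
```

A fuller analysis would also scan the reverse complement, which this sketch leaves out for brevity.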
Generation of induced pluripotent stem cells from a patient with spinocerebellar ataxia type 3.

PubMed

Soong, Bing-Wen; Syu, Shih-Han; Wen, Cheng-Hao; Ko, Hui-Wen; Wu, Mei-Ling; Hsieh, Patrick C H; Hwang, Shiaw-Min; Lu, Huai-En

2017-01-01

Spinocerebellar ataxia type 3 (SCA3) is a dominantly inherited neurodegenerative disease caused by a trinucleotide repeat (CAG) expansion in the coding region of the ATXN3 gene, resulting in production of ataxin-3 with an elongated polyglutamine tract. Here, we generated induced pluripotent stem cells (iPSCs) from the peripheral blood mononuclear cells of a male patient with SCA3 by using the Sendai-virus delivery system. The resulting iPSCs had a normal karyotype, retained the disease-causing ATXN3 mutation, expressed pluripotent markers and could differentiate into the three germ layers. Potentially, the iPSCs could be a useful tool for the investigation of disease mechanisms of SCA3. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Microsatellite primers for a species of South African everlasting daisy (Helichrysum odoratissimum; Gnaphalieae, Asteraceae).

PubMed

Glennon, Kelsey L; Cron, Glynis V

2016-05-01

Microsatellites were developed for the widespread Helichrysum odoratissimum (Asteraceae) to estimate gene flow across diploid populations and to test if gene flow occurs among other closely related lineages within this genus. Ten primer pairs were developed and tested using populations across South Africa; however, only seven primer pairs were polymorphic for the target species. The seven polymorphic primers amplified di- and trinucleotide repeats with up to 16 alleles per locus among 125 diploid individuals used for analyses. These markers can be used to estimate gene flow among populations of known ploidy level of H. odoratissimum to test evolutionary hypotheses. Furthermore, these markers amplify successfully in other Helichrysum species, including the other three taxonomic Group 4 species, and therefore can be used to inform taxonomic work on these species.
Microsatellite markers for Vellozia gigantea (Velloziaceae), a narrowly endemic species to the Brazilian campos rupestres.

PubMed

Martins, Ana Paula V; Proite, Karina; Kalapothakis, Evanguedes; Santos, Fabrício R; Chaves, Anderson V; Borba, Eduardo L

2012-07-01

Microsatellite primers were developed for the first time in Velloziaceae, in the endangered species Vellozia gigantea. Using two different protocols, seven primer sets were characterized in three populations of V. gigantea. The primers amplified di- and trinucleotide repeats with six to 12 alleles per locus. These revealed high levels of genetic variation, presenting an average observed heterozygosity of 0.508 in V. gigantea. The seven primers were tested for cross-amplification in three Vellozia species. All primers successfully amplified in V. auriculata. Six primers amplified in V. compacta and three in V. hirsuta. The new marker set described here will be useful for studies of population genetics of V. gigantea. The cross-amplification results indicate the utility of primers for studies in other Vellozia species.
Expression levels of DNA replication and repair genes predict regional somatic repeat instability in the brain but are not altered by polyglutamine disease protein expression or age.

PubMed

Mason, Amanda G; Tomé, Stephanie; Simard, Jodie P; Libby, Randell T; Bammler, Theodor K; Beyer, Richard P; Morton, A Jennifer; Pearson, Christopher E; La Spada, Albert R

2014-03-15

Expansion of CAG/CTG trinucleotide repeats causes numerous inherited neurological disorders, including Huntington's disease (HD), several spinocerebellar ataxias and myotonic dystrophy type 1. Expanded repeats are genetically unstable with a propensity to further expand when transmitted from parents to offspring. For many alleles with expanded repeats, extensive somatic mosaicism has been documented. For CAG repeat diseases, dramatic instability has been documented in the striatum, with larger expansions noted with advancing age. In contrast, only modest instability occurs in the cerebellum. Using microarray expression analysis, we sought to identify the genetic basis of these regional instability differences by comparing gene expression in the striatum and cerebellum of aged wild-type C57BL/6J mice. We identified eight candidate genes enriched in cerebellum, and validated four--Pcna, Rpa1, Msh6 and Fen1--along with a highly associated interactor, Lig1. We also explored whether expression levels of mismatch repair (MMR) proteins are altered in a line of HD transgenic mice, R6/2, that is known to show pronounced regional repeat instability. Compared with wild-type littermates, MMR expression levels were not significantly altered in R6/2 mice regardless of age. Interestingly, expression levels of these candidates were significantly increased in the cerebellum of control and HD human samples in comparison to striatum. Together, our data suggest that elevated expression levels of DNA replication and repair proteins in cerebellum may act as a safeguard against repeat instability, and may account for the dramatically reduced somatic instability present in this brain region, compared with the marked instability observed in the striatum.
Bijective transformation circular codes and nucleotide exchanging RNA transcription.

PubMed

Michel, Christian J; Seligmann, Hervé

2014-04-01

The C(3) self-complementary circular code X identified in genes of prokaryotes and eukaryotes is a set of 20 trinucleotides enabling reading frame retrieval and maintenance, i.e. a framing code (Arquès and Michel, 1996; Michel, 2012, 2013). Some mitochondrial RNAs correspond to DNA sequences when RNA transcription systematically exchanges between nucleotides (Seligmann, 2013a,b). We study here the 23 bijective transformation codes ΠX of X which may code nucleotide exchanging RNA transcription as suggested by this mitochondrial observation. The 23 bijective transformation codes ΠX are C(3) trinucleotide circular codes, seven of them are also self-complementary. Furthermore, several correlations are observed between the Reading Frame Retrieval (RFR) probability of bijective transformation codes ΠX and the different biological properties of ΠX related to their numbers of RNAs in GenBank's EST database, their polymerization rate, their number of amino acids and the chirality of amino acids they code. Results suggest that the circular code X with the functions of reading frame retrieval and maintenance in regular RNA transcription, may also have, through its bijective transformation codes ΠX, the same functions in nucleotide exchanging RNA transcription. Associations with properties such as amino acid chirality suggest that the RFR of X and its bijective transformations molded the origins of the genetic code's machinery. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Personalized gene silencing therapeutics for Huntington disease.

PubMed

Kay, C; Skotte, N H; Southwell, A L; Hayden, M R

2014-07-01

Gene silencing offers a novel therapeutic strategy for dominant genetic disorders. In specific diseases, selective silencing of only one copy of a gene may be advantageous over non-selective silencing of both copies. Huntington disease (HD) is an autosomal dominant disorder caused by an expanded CAG trinucleotide repeat in the Huntingtin gene (HTT). Silencing both expanded and normal copies of HTT may be therapeutically beneficial, but preservation of normal HTT expression is preferred. Allele-specific methods can selectively silence the mutant HTT transcript by targeting either the expanded CAG repeat or single nucleotide polymorphisms (SNPs) in linkage disequilibrium with the expansion. Both approaches require personalized treatment strategies based on patient genotypes. We compare the prospect of safe treatment of HD by CAG- and SNP-specific silencing approaches and review HD population genetics used to guide target identification in the patient population. Clinical implementation of allele-specific HTT silencing faces challenges common to personalized genetic medicine, requiring novel solutions from clinical scientists and regulatory authorities. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Large-scale oscillation of structure-related DNA sequence features in human chromosome 21

NASA Astrophysics Data System (ADS)

Li, Wentian; Miramontes, Pedro

2006-08-01

Human chromosome 21 is the only chromosome in the human genome that exhibits oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (500 kb near the right telomere). We aim at establishing the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same 500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the double helix of DNA molecules, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.
The Maximal C³ Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses.

PubMed

Michel, Christian J

2017-04-18

In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C³ self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes.
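The idea of comparing trinucleotide occurrence in the reading frame with the two shifted frames can be illustrated with a small counting routine. This is a sketch only: `frame_counts`, the one-word toy set and the toy sequence are hypothetical, and the code does not reproduce the authors' statistical method or the actual set X.

```python
def frame_counts(gene, trinucleotides):
    """Count how often members of `trinucleotides` occur in frames 0, 1 and 2.

    Frame 0 is the reading frame (starting at the first base); frames 1 and 2
    are the same sequence read after shifting by one and two bases.
    """
    gene = gene.upper()
    counts = [0, 0, 0]
    for frame in range(3):
        # Step through the sequence three bases at a time from the frame offset.
        for i in range(frame, len(gene) - 2, 3):
            if gene[i:i + 3] in trinucleotides:
                counts[frame] += 1
    return counts

# Toy illustration (hypothetical data): a repeated trinucleotide read in three frames.
toy_counts = frame_counts("GAGGAGGAG", {"GAG"})
```

For a set to show the preference described in the abstract, the frame 0 count would on average exceed the other two when accumulated over many genes.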
Trinucleotide's quadruplet symmetries and natural symmetry law of DNA creation ensuing Chargaff's second parity rule.

PubMed

Rosandić, Marija; Vlahović, Ines; Glunčić, Matko; Paar, Vladimir

2016-07-01

For almost 50 years the conclusive explanation of Chargaff's second parity rule (CSPR), the equality of frequencies of nucleotides A=T and C=G or the equality of direct and reverse complement trinucleotides in the same DNA strand, has not been determined yet. Here, we relate CSPR to the interstrand mirror symmetry in 20 symbolic quadruplets of trinucleotides (direct, reverse complement, complement, and reverse) mapped to double-stranded genome. The symmetries of Q-box corresponding to quadruplets can be obtained as a consequence of Watson-Crick base pairing and CSPR together. Alternatively, assuming Natural symmetry law for DNA creation that each trinucleotide in one strand of DNA must simultaneously appear also in the opposite strand automatically leads to Q-box direct-reverse mirror symmetry which in conjunction with Watson-Crick base pairing generates CSPR. We demonstrate quadruplet's symmetries in chromosomes of wide range of organisms, from Escherichia coli to Neanderthal and human genomes, introducing novel quadruplet-frequency histograms and 3D-diagrams with combined interstrand frequencies. These "landscapes" are mutually similar in all mammals, including extinct Neanderthals, and somewhat different in most of older species. In human chromosomes 1-12, and X, Y the "landscapes" are almost identical and slightly different in the remaining smaller and telocentric chromosomes. Quadruplet frequencies could provide a new robust tool for characterization and classification of genomes and their evolutionary trajectories.
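The grouping of trinucleotides into quadruplets (direct, reverse complement, complement, reverse) can be checked by direct enumeration. The sketch below uses hypothetical helper names (`quadruplet`, `all_quadruplets`) and only illustrates the combinatorics; it does not implement the authors' Q-box or frequency analysis.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def quadruplet(tri):
    """Return the set {direct, reverse, complement, reverse complement} of a trinucleotide."""
    comp = "".join(COMPLEMENT[base] for base in tri)
    # comp[::-1] is the reverse complement; tri[::-1] is the plain reverse.
    return frozenset({tri, tri[::-1], comp, comp[::-1]})

def all_quadruplets():
    """Group all 64 trinucleotides into their quadruplets."""
    return {quadruplet("".join(p)) for p in product("ACGT", repeat=3)}

groups = all_quadruplets()
```

Enumerating this way yields 20 groups, matching the 20 quadruplets named in the abstract. Groups built from a trinucleotide that equals its own reverse (such as ACA) contain only two distinct members, because the direct and reverse forms coincide.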
Mechanistic Insights into the Binding of Class IIa HDAC Inhibitors toward Spinocerebellar Ataxia Type-2: A 3D-QSAR and Pharmacophore Modeling Approach

PubMed Central

Sinha, Siddharth; Goyal, Sukriti; Somvanshi, Pallavi; Grover, Abhinav

2017-01-01

Spinocerebellar ataxia (SCA-2) type-2 is a rare neurological disorder among the nine polyglutamine disorders, mainly caused by polyQ (CAG) trinucleotide repeats expansion within gene coding ataxin-2 protein. The expanded trinucleotide repeats within the ataxin-2 protein sequesters transcriptional cofactors i.e., CREB-binding protein (CBP), Ataxin-2 binding protein 1 (A2BP1) leading to a state of hypo-acetylation and transcriptional repression. Histone de-acetylases inhibitors (HDACi) have been reported to restore transcriptional balance through inhibition of class IIa HDAC's, that leads to an increased acetylation and transcription as demonstrated through in-vivo studies on mouse models of Huntington's. In this study, 61 di-aryl cyclo-propanehydroxamic acid derivatives were used for developing three dimensional (3D) QSAR and pharmacophore models. These models were then employed for screening and selection of anti-ataxia compounds. The chosen QSAR model was observed to be statistically robust with correlation coefficient (r2) value of 0.6774, cross-validated correlation coefficient (q2) of 0.6157 and co-relation coefficient for external test set (pred_r2) of 0.7570. A high F-test value of 77.7093 signified the robustness of the model. Two potential drug leads ZINC 00608101 (SEI) and ZINC 00329110 (ACI) were selected after a coalesce procedure of pharmacophore based screening using the pharmacophore model ADDRR.20 and structural analysis using molecular docking and dynamics simulations. The pharmacophore and the 3D-QSAR model generated were further validated for their screening and prediction ability using the enrichment factor (EF), goodness of hit (GH), and receiver operating characteristics (ROC) curve analysis. The compounds SEI and ACI exhibited a docking score of −10.097 and −9.182 kcal/mol, respectively. 
An evaluation of binding conformation of ligand-bound protein complexes was performed with MD simulations for a time period of 30 ns along with free energy binding calculations using the g_mmpbsa technique. Prediction of inhibitory activities of the two lead compounds SEI (7.53) and ACI (6.84) using the 3D-QSAR model reaffirmed their inhibitory characteristics as potential anti-ataxia compounds. PMID:28119557

Anthropometric and craniofacial patterns in mentally retarded males with emphasis on the fragile X syndrome.

PubMed

Butler, M G; Pratesi, R; Watson, M S; Breg, W R; Singh, D N

1993-09-01

Anthropometric and craniofacial profile patterns indicating the percent difference from the overall mean were developed on 34 physical parameters with 31 white, mentally retarded males (23 adults and 8 children) with the fra(X) syndrome matched for age with 31 white, mentally retarded males without a known cause of their retardation. The fra(X) syndrome males consistently showed larger dimensions for all anthropometric variables, with significant differences for height, sitting height, arm span, hand length, middle finger length, hand breadth, foot length, foot breadth, and testicular volume. A craniofacial pattern did emerge between the two groups of mentally retarded males, but with overlap of several variables. Significant differences were noted for head circumference, head breadth, lower face height, bizygomatic diameter, inner canthal distance, ear length and ear width, with the fra(X) syndrome males having larger head dimensions (head circumference, head breadth, head length, face height and lower face height), but smaller measurements for minimal frontal diameter, bizygomatic diameter, bigonial diameter, and inner canthal distance. Several significant correlations were found with the variables for both mentally retarded males with and without the fra(X) syndrome. In a combined anthropometric and craniofacial profile of 19 variables comparing 26 white fra(X) syndrome males (13 with high expression (> 30%) and 13 with low expression (< 30%), but matched for age), a relatively flat profile was observed with no significant differences for any of the variables. Generally, fra(X) syndrome males with increased fragile X chromosome expression have larger amplifications of the CGG trinucleotide repeat of the FMR-1 gene. No physical differences were detectable in our study between fra(X) males with high expression and apparently larger amplifications of the CGG trinucleotide repeats compared with those patients with low expression. 
Our research illustrates the use of anthropometry in identifying differences between mentally retarded males with or without the fra(X) syndrome and offers a comprehensive approach for screening males for the fra(X) syndrome and selecting those individuals for cytogenetic and/or molecular genetic testing.
Anthropometric and craniofacial patterns in mentally retarded males with emphasis on the fragile X syndrome

PubMed Central

Butler, Merlin G.; Pratesi, Riccardo; Watson, Michael S.; Breg, W. Roy; Singh, Dharmdeo N.

2017-01-01

Anthropometric and craniofacial profile patterns indicating the percent difference from the overall mean were developed on 34 physical parameters with 31 white, mentally retarded males (23 adults and 8 children) with the fra(X) syndrome matched for age with 31 white, mentally retarded males without a known cause of their retardation. The fra(X) syndrome males consistently showed larger dimensions for all anthropometric variables, with significant differences for height, sitting height, arm span, hand length, middle finger length, hand breadth, foot length, foot breadth, and testicular volume. A craniofacial pattern did emerge between the two groups of mentally retarded males, but with overlap of several variables. Significant differences were noted for head circumference, head breadth, lower face height, bizygomatic diameter, inner canthal distance, ear length and ear width, with the fra(X) syndrome males having larger head dimensions (head circumference, head breadth, head length, face height and lower face height), but smaller measurements for minimal frontal diameter, bizygomatic diameter, bigonial diameter, and inner canthal distance. Several significant correlations were found with the variables for both mentally retarded males with and without the fra(X) syndrome. In a combined anthropometric and craniofacial profile of 19 variables comparing 26 white fra(X) syndrome males (13 with high expression (>30%) and 13 with low expression (< 30%), but matched for age), a relatively flat profile was observed with no significant differences for any of the variables. Generally, fra(X) syndrome males with increased fragile X chromosome expression have larger amplifications of the CGG trinucleotide repeat of the FMR-1 gene. No physical differences were detectable in our study between fra(X) males with high expression and apparently larger amplifications of the CGG trinucleotide repeats compared with those patients with low expression. 
Our research illustrates the use of anthropometry in identifying differences between mentally retarded males with or without the fra(X) syndrome and offers a comprehensive approach for screening males for the fra(X) syndrome and selecting those individuals for cytogenetic and/or molecular genetic testing. PMID:8275570
The Saccharomyces cerevisiae Mre11-Rad50-Xrs2 complex promotes trinucleotide repeat expansions independently of homologous recombination.

PubMed

Ye, Yanfang; Kirkham-McCarthy, Lucy; Lahue, Robert S

2016-07-01

Trinucleotide repeats (TNRs) are tandem arrays of three nucleotides that can expand in length to cause at least 17 inherited human diseases. Somatic expansions in patients can occur in differentiated tissues where DNA replication is limited and cannot be a primary source of somatic mutation. Instead, mouse models of TNR diseases have shown that both inherited and somatic expansions can be suppressed by the loss of certain DNA repair factors. It is generally believed that these repair factors cause misprocessing of TNRs, leading to expansions. Here we extend this idea to show that the Mre11-Rad50-Xrs2 (MRX) complex of Saccharomyces cerevisiae is a causative factor in expansions of short TNRs. Mutations that eliminate MRX subunits led to significant suppression of expansions whereas mutations that inactivate Rad51 had only a minor effect. Coupled with previous evidence, this suggests that MRX drives expansions of short TNRs through a process distinct from homologous recombination. The nuclease function of Mre11 was dispensable for expansions, suggesting that expansions do not occur by Mre11-dependent nucleolytic processing of the TNR. Epistasis between MRX and post-replication repair (PRR) was tested. PRR protects against expansions, so a rad5 mutant gave a high expansion rate. In contrast, the mre11 rad5 double mutant gave a suppressed expansion rate, indistinguishable from the mre11 single mutant. This suggests that MRX creates a TNR substrate for PRR. Protein acetylation was also tested as a mechanism regulating MRX activity in expansions. Six acetylation sites were identified in Rad50. Mutation of all six lysine residues to arginine gave partial bypass of a sin3 HDAC mutant, suggesting that Rad50 acetylation is functionally important for Sin3-mediated expansions. Overall we conclude that yeast MRX helps drive expansions of short TNRs by a mechanism distinct from its role in homologous recombination and independent of the nuclease function of Mre11. 
Copyright © 2016 Elsevier B.V. All rights reserved.
Mechanistic Insights into the Binding of Class IIa HDAC Inhibitors toward Spinocerebellar Ataxia Type-2: A 3D-QSAR and Pharmacophore Modeling Approach.

PubMed

Sinha, Siddharth; Goyal, Sukriti; Somvanshi, Pallavi; Grover, Abhinav

2016-01-01

Spinocerebellar ataxia type-2 (SCA-2) is a rare neurological disorder among the nine polyglutamine disorders, mainly caused by polyQ (CAG) trinucleotide repeat expansion within the gene coding for the ataxin-2 protein. The expanded trinucleotide repeats within the ataxin-2 protein sequester transcriptional cofactors, i.e., CREB-binding protein (CBP) and Ataxin-2 binding protein 1 (A2BP1), leading to a state of hypo-acetylation and transcriptional repression. Histone de-acetylase inhibitors (HDACi) have been reported to restore transcriptional balance through inhibition of class IIa HDACs, which leads to increased acetylation and transcription, as demonstrated through in-vivo studies on mouse models of Huntington's. In this study, 61 di-aryl cyclo-propanehydroxamic acid derivatives were used for developing three dimensional (3D) QSAR and pharmacophore models. These models were then employed for screening and selection of anti-ataxia compounds. The chosen QSAR model was observed to be statistically robust with a correlation coefficient (r²) value of 0.6774, cross-validated correlation coefficient (q²) of 0.6157 and correlation coefficient for the external test set (pred_r²) of 0.7570. A high F-test value of 77.7093 signified the robustness of the model. Two potential drug leads, ZINC 00608101 (SEI) and ZINC 00329110 (ACI), were selected after a coalesce procedure of pharmacophore based screening using the pharmacophore model ADDRR.20 and structural analysis using molecular docking and dynamics simulations. The pharmacophore and the 3D-QSAR model generated were further validated for their screening and prediction ability using the enrichment factor (EF), goodness of hit (GH), and receiver operating characteristics (ROC) curve analysis. The compounds SEI and ACI exhibited a docking score of -10.097 and -9.182 kcal/mol, respectively. 
An evaluation of binding conformation of ligand-bound protein complexes was performed with MD simulations for a time period of 30 ns along with free energy binding calculations using the g_mmpbsa technique. Prediction of inhibitory activities of the two lead compounds SEI (7.53) and ACI (6.84) using the 3D-QSAR model reaffirmed their inhibitory characteristics as potential anti-ataxia compounds.
Screening for fragile X syndrome.

PubMed

Murray, J; Cuckle, H; Taylor, G; Hewison, J

1997-01-01

BACKGROUND AND AIM OF REVIEW. In 1991, the gene responsible for fragile X syndrome, a common cause of learning disability, was discovered. As a result, diagnosis of the disorder has improved and its molecular genetics are now understood. This report seems to provide the information needed to decide whether to use DNA testing to screen for the disorder. HOW THE RESEARCH WAS CONDUCTED. A literature search of electronic reference databases of published and 'grey' literature was undertaken together with hand searching of the most recent publications. RESEARCH FINDINGS. NATURAL HISTORY. Physical characteristics of fragile X syndrome include facial atypia, joint laxity and, in boys, macro-orchidism. Most affected males have moderate-to-severe learning disabilities with IQs under 50 whereas most females have borderline IQs of 70-85. Behavioural problems are similar to those seen with autism and attention-deficit disorders. Although fragile X syndrome is not curable there are a number of medical, educational, psychological and social interventions that can improve the symptoms. About 6% of those with learning disabilities tested in institutions have fragile X syndrome. Population prevalence figures are 1 in 4000 in males and 1 in 8000 in females. GENETICS. The disorder is caused by a mutation in a gene on the X chromosome which includes a trinucleotide repeat sequence. The mutation is characterized by hyper-expansion of the repeat sequence leading to down-regulation of the gene. In males an allele with repeat size in excess of 200, termed a full mutation (FM), is always associated with the affected phenotype, whereas in females only half are affected. Individuals with alleles having repeat size in the range 55-199 are unaffected but in females the sequence is heritably unstable so that it is at high risk of expansion to an FM in her offspring. This allele is known as a pre-mutation (PM) to contrast it with the FM found in the affected individual. 
No spontaneous expansions directly from a normal allele to an FM have been observed. SCREENING STRATEGIES. The principal aim of screening for fragile X syndrome is to reduce the birth prevalence of the disorder, by prenatal diagnosis and selective termination of pregnancy, or by reducing the number of pregnancies in women who have the FM or PM alleles. Possible screening strategies are: routine antenatal testing of apparently low risk pregnancies, preconceptual testing of young women, and systematic testing in affected families ('cascade' screening). A secondary aim is to bring forward the diagnosis of affected individuals so that they might benefit from early treatment. Active paediatric screening and neonatal screening could achieve this but there is no direct evidence of any great benefit from early diagnosis. SCREENING TESTS. Cytogenetic methods are unsuitable for screening purposes. Southern blotting of genomic DNA can be used but is inaccurate in measuring the size of small PMs, there is a long laboratory turnaround time, and it is relatively expensive. The best protocol is to amplify the DNA using polymerase chain reaction on all samples, and when there is a possible failure to amplify, a Southern blot.(ABSTRACT TRUNCATED)
Molecular characterisation of Atlantic salmon paramyxovirus (ASPV): A novel paramyxovirus associated with proliferative gill inflammation

USGS Publications Warehouse

Falk, K.; Batts, W.N.; Kvellestad, A.; Kurath, G.; Wiik-Nielsen, J.; Winton, J.R.

2008-01-01

Atlantic salmon paramyxovirus (ASPV) was isolated in 1995 from gills of farmed Atlantic salmon suffering from proliferative gill inflammation. The complete genome sequence of ASPV was determined, revealing a genome 16,968 nucleotides in length consisting of six non-overlapping genes coding for the nucleo- (N), phospho- (P), matrix- (M), fusion- (F), haemagglutinin-neuraminidase- (HN) and large polymerase (L) proteins in the order 3'-N-P-M-F-HN-L-5'. The various conserved features related to virus replication found in most paramyxoviruses were also found in ASPV. These include: conserved and complementary leader and trailer sequences, tri-nucleotide intergenic regions and highly conserved transcription start and stop signal sequences. The P gene expression strategy of ASPV was like that of the respiro-, morbilli- and henipaviruses, which express the P and C proteins from the primary transcript and edit a portion of the mRNA to encode V and W proteins. Sequence similarities among various features related to virus replication, pairwise comparisons of all deduced ASPV protein sequences with homologous regions from other members of the family Paramyxoviridae, and phylogenetic analyses of these amino acid sequences suggested that ASPV was a novel member of the sub-family Paramyxovirinae, most closely related to the respiroviruses. © 2008 Elsevier B.V. All rights reserved.
Spinocerebellar ataxia 17: full phenotype in a 41 CAG/CAA repeats carrier.

PubMed

Origone, Paola; Gotta, Fabio; Lamp, Merit; Trevisan, Lucia; Geroldi, Alessandro; Massucco, Davide; Grazzini, Matteo; Massa, Federico; Ticconi, Flavia; Bauckneht, Matteo; Marchese, Roberta; Abbruzzese, Giovanni; Bellone, Emilia; Mandich, Paola

2018-01-01

Spinocerebellar ataxia 17 (SCA17) is one of the most heterogeneous forms of autosomal dominant cerebellar ataxias with a large clinical spectrum which can mimic other movement disorders such as Huntington disease (HD), dystonia and parkinsonism. SCA17 is caused by an expansion of a CAG/CAA repeat in the TATA-binding protein (TBP) gene. Normal alleles contain 25 to 40 CAG/CAA repeats; alleles with 50 or more CAG/CAA repeats are pathological with full penetrance. Alleles with 43 to 49 CAG/CAA repeats were also reported and their penetrance is estimated between 50 and 80%. Recently, a few symptomatic individuals having 41 and 42 repeats were reported but it is still unclear whether CAG/CAA repeats of 41 or 42 are low penetrance disease-causing alleles. Thus, phenotypic variability, such as the disease course in subjects with SCA17 locus restricted expansions, remains to be fully understood. The patient was a 63-year-old woman who, at 54 years, showed personality changes and increased frequency of falls. At 55 years of age neuropsychological tests showed executive attention and visuospatial deficits. At the age of 59 the patient developed dysarthria and a progressive cognitive deficit. The neurological examination showed moderate gait ataxia, dysdiadochokinesia and dysmetria, dysphagia, dysarthria and abnormal saccadic pursuit, severe axial asynergy during postural changes, and choreiform dyskinesias. Molecular analysis of the TBP gene demonstrated an allele with 41 repeats, suggesting that 41 CAG/CAA TBP repeats could be an allele associated with the full clinical spectrum of SCA17. The described case, together with similar cases described in the literature, suggests that 41 CAG/CAA trinucleotides should be considered as a critical threshold in SCA17. We suggest that SCA17 diagnosis should be suspected in patients presenting with movement disorders associated with other neurodegenerative signs and symptoms.
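The repeat-length ranges quoted in this abstract can be written down as a small lookup. The Python sketch below is illustrative only: the function name `classify_tbp_allele` and the category labels are hypothetical, the cut-offs are taken from the abstract, and counts below 25 (which the abstract does not discuss) are simply flagged as outside the reported range. It is not a diagnostic tool.

```python
def classify_tbp_allele(repeats):
    """Map a TBP CAG/CAA repeat count to the range named in the abstract.

    Labels are hypothetical; cut-offs follow the abstract's wording.
    """
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats < 25:
        return "below reported normal range"  # not covered by the abstract
    if repeats <= 40:
        return "normal"
    if repeats <= 42:
        # The abstract calls the status of 41-42 repeats unclear.
        return "uncertain (possible low penetrance)"
    if repeats <= 49:
        return "reduced penetrance (estimated 50-80%)"
    return "full penetrance"


for n in (36, 41, 45, 52):
    print(n, classify_tbp_allele(n))
```

Under these assumptions, the 41-repeat allele described in the case report falls into the "uncertain" band, which is the range the authors argue should be treated as a critical threshold.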
Bifunctional Anti-Huntingtin Proteasome-Directed Intrabodies Mediate Efficient Degradation of Mutant Huntingtin Exon 1 Protein Fragments

PubMed Central

Butler, David C.; Messer, Anne

2011-01-01

Huntington's disease (HD) is a fatal autosomal dominant neurodegenerative disorder caused by a trinucleotide (CAG)n repeat expansion in the coding sequence of the huntingtin gene, and an expanded polyglutamine (>37Q) tract in the protein. This results in misfolding and accumulation of huntingtin protein (htt), formation of neuronal intranuclear and cytoplasmic inclusions, and neuronal dysfunction/degeneration. Single-chain Fv antibodies (scFvs), expressed as intrabodies that bind htt and prevent aggregation, show promise as immunotherapeutics for HD. Intrastriatal delivery of anti-N-terminal htt scFv-C4 using an adeno-associated virus vector (AAV2/1) significantly reduces the size and number of aggregates in HDR6/1 transgenic mice; however, this protective effect diminishes with age and time after injection. We therefore explored enhancing intrabody efficacy via fusions to heterologous functional domains. Proteins containing a PEST motif are often targeted for proteasomal degradation and generally have a short half-life. In ST14A cells, fusion of the C-terminal PEST region of mouse ornithine decarboxylase (mODC) to scFv-C4 reduces htt exon 1 protein fragments with 72 glutamine repeats (httex1-72Q) by ∼80–90% when compared to scFv-C4 alone. Proteasomal targeting was verified by either scrambling the mODC-PEST motif, or via proteasomal inhibition with epoxomicin. For these constructs, the proteasomal degradation of the scFv intrabody proteins themselves was reduced <25% by the addition of the mODC-PEST motif, with or without antigens. The remaining intrabody levels were amply sufficient to target N-terminal httex1-72Q protein fragment turnover. Critically, scFv-C4-PEST prevents aggregation and toxicity of httex1-72Q fragments at significantly lower doses than scFv-C4. Fusion of the mODC-PEST motif to intrabodies is a valuable general approach to specifically target toxic antigens to the proteasome for degradation. PMID:22216210
Microsatellite primers in Oenothera harringtonii (Onagraceae), an annual endemic to the shortgrass prairie of Colorado.

PubMed

Skogen, Krissa A; Hilpman, Evan T; Todd, Sadie L; Fant, Jeremie B

2012-08-01

Microsatellite markers were developed in the annual herb, Oenothera harringtonii, to investigate patterns of genetic diversity, gene flow, and parentage within and among populations of this Colorado endemic. Ten polymorphic loci were identified in O. harringtonii and tested in four populations sampled across the range of the species. These loci contained trinucleotide repeats with 7-29 alleles per locus. Nine of the 10 loci also amplified in O. caespitosa subsp. macroglottis, O. caespitosa subsp. marginata, and O. caespitosa subsp. navajoensis. In addition, we optimized three markers developed for O. biennis and provide reports of their effectiveness in all four taxa. These results indicate the utility of these markers in O. harringtonii for future studies of genetic structure, gene flow, and parentage as well as their applicability in other members of the O. caespitosa species complex.
[Myotonic dystrophy - a new insight into a well-known disease].

PubMed

Lusakowska, Anna; Sułek-Piatkowska, Anna

2010-01-01

Myotonic dystrophy (DM), the most common dystrophy in adults, is an autosomal dominant disease characterized by a variety of multisystemic features. Two genetically distinct forms of DM are identified - type 1 (DM1), the classic form first described by Steinert, and type 2 (DM2), identified by Ricker. DM1 is caused by trinucleotide expansion of CTG in the myotonic dystrophy protein kinase gene, whereas in DM2 the expansion of tetranucleotide repeats (CCTG) in the zinc finger protein 9 gene was identified. Both mutations are dynamic and are located in non-coding parts of the genes. Phenotype variability of DM1 and DM2 is caused by a molecular mechanism due to mutated RNA toxicity. This paper reviews the clinical features of both types of myotonic dystrophies and summarizes current views on pathogenesis of myotonic dystrophy.
The Maximal C3 Self-Complementary Trinucleotide Circular Code X in Genes of Bacteria, Archaea, Eukaryotes, Plasmids and Viruses

PubMed Central

Michel, Christian J.

2017-01-01

In 1996, a set X of 20 trinucleotides was identified in genes of both prokaryotes and eukaryotes which has on average the highest occurrence in reading frame compared to its two shifted frames. Furthermore, this set X has an interesting mathematical property as X is a maximal C3 self-complementary trinucleotide circular code. In 2015, by quantifying the inspection approach used in 1996, the circular code X was confirmed in the genes of bacteria and eukaryotes and was also identified in the genes of plasmids and viruses. The method was based on the preferential occurrence of trinucleotides among the three frames at the gene population level. We extend here this definition at the gene level. This new statistical approach considers all the genes, i.e., of large and small lengths, with the same weight for searching the circular code X. As a consequence, the concept of circular code, in particular the reading frame retrieval, is directly associated to each gene. At the gene level, the circular code X is strengthened in the genes of bacteria, eukaryotes, plasmids, and viruses, and is now also identified in the genes of archaea. The genes of mitochondria and chloroplasts contain a subset of the circular code X. Finally, by studying viral genes, the circular code X was found in DNA genomes, RNA genomes, double-stranded genomes, and single-stranded genomes. PMID:28420220
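The frame-preference idea in this abstract can be illustrated by counting, for each of the three frames of a sequence, how many trinucleotides belong to X. The Python sketch below is an assumption-laden illustration: the abstract does not list the members of X, so the 20 trinucleotides are reproduced here from the circular-code literature and should be checked against the paper; `frame_counts` and `reverse_complement` are hypothetical helper names, and the simple per-frame tally is not the statistical method of the study.

```python
# The 20 trinucleotides of X as commonly listed in the circular-code
# literature (assumption: not given in the abstract; verify before use).
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(_COMPLEMENT)[::-1]


def frame_counts(seq):
    """Count trinucleotides of X in frames 0, 1 and 2 of seq."""
    seq = seq.upper()
    counts = []
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(1 for codon in codons if codon in X))
    return counts


# Self-complementarity: every reverse complement is again in X.
print(all(reverse_complement(t) in X for t in X))
# A toy sequence built from members of X read in frame 0.
print(frame_counts("GAAGATGTC"))
```

On a real gene the three counts would all be non-zero; the abstract's claim is that, on average, the reading frame shows the highest occurrence of X.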
An extended sequence specificity for UV-induced DNA damage.

PubMed

Chung, Long H; Murray, Vincent

2018-01-01

The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
Sequence repeats and protein structure

NASA Astrophysics Data System (ADS)

Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos

2012-11-01

Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
Elevation of RNA-binding protein CUGBP1 is an early event in an inducible heart-specific mouse model of myotonic dystrophy

PubMed Central

Wang, Guey-Shin; Kearney, Debra L.; De Biasi, Mariella; Taffet, George; Cooper, Thomas A.

2007-01-01

Myotonic dystrophy type 1 (DM1) is caused by a CTG trinucleotide expansion in the 3′ untranslated region (3′ UTR) of DM protein kinase (DMPK). The key feature of DM1 pathogenesis is nuclear accumulation of RNA, which causes aberrant alternative splicing of specific pre-mRNAs by altering the functions of CUG-binding proteins (CUGBPs). Cardiac involvement occurs in more than 80% of individuals with DM1 and is responsible for up to 30% of disease-related deaths. We have generated an inducible and heart-specific DM1 mouse model expressing expanded CUG RNA in the context of DMPK 3′ UTR that recapitulated pathological and molecular features of DM1 including dilated cardiomyopathy, arrhythmias, systolic and diastolic dysfunction, and misregulated alternative splicing. Combined in situ hybridization and immunofluorescent staining for CUGBP1 and CUGBP2, the 2 CUGBP1 and ETR-3 like factor (CELF) proteins expressed in heart, demonstrated elevated protein levels specifically in nuclei containing foci of CUG repeat RNA. A time-course study demonstrated that colocalization of MBNL1 with RNA foci and increased CUGBP1 occurred within hours of induced expression of CUG repeat RNA and coincided with reversion to embryonic splicing patterns. These results indicate that CUGBP1 upregulation is an early and primary response to expression of CUG repeat RNA. PMID:17823658
Genome-Wide Development and Use of Microsatellite Markers for Large-Scale Genotyping Applications in Foxtail Millet [Setaria italica (L.)]

PubMed Central

Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

2013-01-01

The availability of well-validated informative co-dominant microsatellite markers and a saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of foxtail millet genome. The trinucleotide repeats (∼48%) were prevalent when compared with dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet) signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results, thus, demonstrate the immense applicability of developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage map for gene/quantitative trait loci discovery, comparative mapping in foxtail millet, including other millets and bioenergy grass species. PMID:23382459
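A genome-wide microsatellite survey of the kind summarized in this abstract starts by locating perfect tandem repeats of short motifs. The Python sketch below shows one simple way to do that with a regular-expression backreference. It is not the pipeline used in the study: `find_ssrs`, its parameters, and the minimum repeat count of 4 in the example are all hypothetical choices for illustration.

```python
import re


def find_ssrs(seq, unit_len, min_repeats):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    unit_len is the motif length (2 for di-, 3 for trinucleotides);
    runs of a single base (e.g. AAAAAA) are skipped.
    """
    pattern = re.compile(
        r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer, not a di-/trinucleotide repeat
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


# Toy sequence: (AC)5 followed by (CAG)5, separated by short flanks.
toy = "TTG" + "AC" * 5 + "TT" + "CAG" * 5 + "TT"
print(find_ssrs(toy, 2, 4))  # dinucleotide repeats
print(find_ssrs(toy, 3, 4))  # trinucleotide repeats
```

The motif reported is whichever rotation the left-to-right scan meets first, so a (CAG)n tract preceded by a G would be reported as (GCA)n; grouping rotations and reverse complements into one motif class is left out of this sketch.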
Genome-wide development and use of microsatellite markers for large-scale genotyping applications in foxtail millet [Setaria italica (L.)].

PubMed

Pandey, Garima; Misra, Gopal; Kumari, Kajal; Gupta, Sarika; Parida, Swarup Kumar; Chattopadhyay, Debasis; Prasad, Manoj

2013-04-01

The availability of well-validated informative co-dominant microsatellite markers and a saturated genetic linkage map has been limited in foxtail millet (Setaria italica L.). In view of this, we conducted a genome-wide analysis and identified 28 342 microsatellite repeat-motifs spanning 405.3 Mb of foxtail millet genome. The trinucleotide repeats (∼48%) were prevalent when compared with dinucleotide repeats (∼46%). Of the 28 342 microsatellites, 21 294 (∼75%) primer pairs were successfully designed, and a total of 15 573 markers were physically mapped on 9 chromosomes of foxtail millet. About 159 markers were validated successfully in 8 accessions of Setaria sp. with ∼67% polymorphic potential. The high percentage (89.3%) of cross-genera transferability across millet and non-millet species with higher transferability percentage in bioenergy grasses (∼79%, Switchgrass and ∼93%, Pearl millet) signifies their importance in studying the bioenergy grasses. In silico comparative mapping of 15 573 foxtail millet microsatellite markers against the mapping data of sorghum (16.9%), maize (14.5%) and rice (6.4%) indicated syntenic relationships among the chromosomes of foxtail millet and target species. The results, thus, demonstrate the immense applicability of developed microsatellite markers in germplasm characterization, phylogenetics, construction of genetic linkage map for gene/quantitative trait loci discovery, comparative mapping in foxtail millet, including other millets and bioenergy grass species.
MSH3 polymorphisms and protein levels affect CAG repeat instability in Huntington's disease mice.

PubMed

Tomé, Stéphanie; Manley, Kevin; Simard, Jodie P; Clark, Greg W; Slean, Meghan M; Swami, Meera; Shelbourne, Peggy F; Tillier, Elisabeth R M; Monckton, Darren G; Messer, Anne; Pearson, Christopher E

2013-01-01

Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression through an affected individual's life with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)∼100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant for the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. 
Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases.
MSH3 Polymorphisms and Protein Levels Affect CAG Repeat Instability in Huntington's Disease Mice

PubMed Central

Tomé, Stéphanie; Manley, Kevin; Simard, Jodie P.; Clark, Greg W.; Slean, Meghan M.; Swami, Meera; Shelbourne, Peggy F.; Tillier, Elisabeth R. M.; Monckton, Darren G.; Messer, Anne; Pearson, Christopher E.

2013-01-01

Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression through an affected individual's life with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)∼100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant for the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. 
Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases. PMID:23468640
Targeting DMPK with Antisense Oligonucleotide Improves Muscle Strength in Myotonic Dystrophy Type 1 Mice.

PubMed

Jauvin, Dominic; Chrétien, Jessina; Pandey, Sanjay K; Martineau, Laurie; Revillod, Lucille; Bassez, Guillaume; Lachon, Aline; MacLeod, A Robert; Gourdon, Geneviève; Wheeler, Thurman M; Thornton, Charles A; Bennett, C Frank; Puymirat, Jack

2017-06-16

Myotonic dystrophy type 1 (DM1), a dominant hereditary muscular dystrophy, is caused by an abnormal expansion of a (CTG)n trinucleotide repeat in the 3' UTR of the human dystrophia myotonica protein kinase (DMPK) gene. As a consequence, mutant transcripts containing expanded CUG repeats are retained in nuclear foci and alter the function of splicing regulatory factors, members of the MBNL and CELF families, resulting in alternative splicing misregulation of specific transcripts in affected DM1 tissues. In the present study, we treated DMSXL mice systemically with a 2'-4'-constrained, ethyl-modified (ISIS 486178) antisense oligonucleotide (ASO) targeted to the 3' UTR of the DMPK gene, which led to a 70% reduction in CUGexp RNA abundance and foci in different skeletal muscles and a 30% reduction in the heart. Furthermore, treatment with ISIS 486178 ASO improved body weight, muscle strength, and muscle histology, whereas no overt toxicity was detected. This is evidence that the reduction of CUGexp RNA improves muscle strength in DM1, suggesting that muscle weakness in DM1 patients may be improved following elimination of toxic RNAs. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Altered structural brain connectome in young adult fragile X premutation carriers.

PubMed

Leow, Alex; Harvey, Danielle; Goodrich-Hunsaker, Naomi J; Gadelkarim, Johnson; Kumar, Anand; Zhan, Liang; Rivera, Susan M; Simon, Tony J

2014-09-01

Fragile X premutation carriers (fXPC) are characterized by 55-200 CGG trinucleotide repeats in the 5' untranslated region on the Xq27.3 site of the X chromosome. Clinically, they are associated with the fragile X-Associated Tremor/Ataxia Syndrome, a late-onset neurodegenerative disorder with diffuse white matter neuropathology. Here, we conducted first-ever graph theoretical network analyses in fXPCs using 30-direction diffusion-weighted magnetic resonance images acquired from 42 healthy controls aged 18-44 years (HC; 22 male and 20 female) and 46 fXPCs (16 male and 30 female). Globally, we found no differences between the fXPCs and HCs within each gender for all global graph theoretical measures. In male fXPCs, global efficiency was significantly negatively associated with the number of CGG repeats. For nodal measures, significant group differences were found between male fXPCs and male HCs in the right fusiform and the right ventral diencephalon (for nodal efficiency), and in the left hippocampus [for nodal clustering coefficient (CC)]. In female fXPCs, CC in the left superior parietal cortex correlated with counting performance in an enumeration task. Copyright © 2014 Wiley Periodicals, Inc.

Effects of the enlargement of poly-glutamine segments on the structure and folding of ataxin-2 and ataxin-3 proteins

PubMed Central

Wen, Jingran; Scoles, Daniel R.; Facelli, Julio C.

2017-01-01

Spinocerebellar ataxia type 2 (SCA2) and type 3 (SCA3) are two common autosomal-dominant inherited ataxia syndromes, both of which are related to the unstable expansion of tri-nucleotide CAG repeats in the coding region of the related ATXN2 and ATXN3 genes, respectively. The poly-glutamine (poly-Q) tract encoded by the CAG repeats has long been recognized as an important factor in disease pathogenesis and progression. In this study, using the I-TASSER method for 3D structure prediction, we investigated the effect of poly-Q tract enlargement on the structure and folding of ataxin-2 and ataxin-3 proteins. Our results show good agreement with the known experimental structures of the Josephin and UIM domains, providing credence to the simulation results presented here, which show that the enlargement of the poly-Q region not only affects the local structure of these regions but also affects the structures of functional domains as well as the whole protein. The changes observed in the predicted models of the UIM domains in ataxin-3 when the poly-Q tract is enlarged provide new insights on possible pathogenic mechanisms. PMID:26861241
De novo transcriptome analysis of rose-scented geranium provides insights into the metabolic specificity of terpene and tartaric acid biosynthesis.

PubMed

Narnoliya, Lokesh K; Kaushal, Girija; Singh, Sudhir P; Sangwan, Rajender S

2017-01-13

Rose-scented geranium (Pelargonium sp.) is a perennial herb that produces a high value essential oil of fragrant significance due to the characteristic compositional blend of rose-oxide and acyclic monoterpenoids in foliage. Recently, the plant has also been shown to produce tartaric acid in leaf tissues. Rose-scented geranium represents a top-tier cash crop in terms of economic returns and significance of the plant and plant products. However, there has hardly been any study on its metabolism and functional genomics, nor is any genomic expression dataset resource available in the public domain. Therefore, to begin the gains in molecular understanding of specialized metabolic pathways of the plant, de novo sequencing of rose-scented geranium leaf transcriptome, transcript assembly, annotation, expression profiling as well as their validation were carried out. De novo transcriptome analysis resulted in a total of 78,943 unique contigs (average length: 623 bp, and N50 length: 752 bp) from 15.44 million high quality raw reads. In silico functional annotation led to the identification of several putative genes representing terpene, ascorbic acid and tartaric acid biosynthetic pathways, hormone metabolism, and transcription factors. Additionally, a total of 6,040 simple sequence repeat (SSR) motifs were identified in 6.8% of the expressed transcripts. The highest frequency of SSR was of tri-nucleotides (50%). Further, transcriptome assembly was validated for randomly selected putative genes by standard PCR-based approach. In silico expression profiles of assembled contigs were validated by real-time PCR analysis of selected transcripts. Being the first report on transcriptome analysis of rose-scented geranium, the data sets and the leads and directions reflected in this investigation will serve as a foundation for pursuing and understanding molecular aspects of its biology, and specialized metabolic pathways, metabolic engineering, genetic diversity as well as molecular breeding.
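The abstract reports an N50 length alongside the average contig length. For readers unfamiliar with the statistic, the Python sketch below computes N50 under its usual definition: the length of the contig at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are hypothetical, and assemblers may differ slightly in how they handle ties.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the running sum
        # to at least half of the total assembled length.
        if 2 * running >= total:
            return length


contigs = [2, 3, 4, 5, 6]  # toy contig lengths in bp
print(n50(contigs), sum(contigs) / len(contigs))
```

Because N50 weights long contigs more heavily than the mean does, an N50 above the average length (752 bp versus 623 bp in the abstract) is the typical pattern.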
Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

PubMed Central

Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

2010-01-01

Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10–20% nucleotide deviation from the canonical ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966
Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements.

PubMed

Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B

2010-04-01

Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: <10% and 10-20% nucleotide deviation from the canonical ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.
Microsatellites for Oenothera gayleana and O. hartwegii subsp. filifolia (Onagraceae), and their utility in section Calylophus.

PubMed

Lewis, Emily M; Fant, Jeremie B; Moore, Michael J; Hastings, Amy P; Larson, Erica L; Agrawal, Anurag A; Skogen, Krissa A

2016-02-01

Eleven nuclear and four plastid microsatellite markers were screened for two gypsum endemic species, Oenothera gayleana and O. hartwegii subsp. filifolia, and tested for cross-amplification in the remaining 11 taxa within Oenothera sect. Calylophus (Onagraceae). Microsatellite markers were tested in two to three populations spanning the ranges of both O. gayleana and O. hartwegii subsp. filifolia. The nuclear microsatellite loci consisted of both di- and trinucleotide repeats with one to 17 alleles per population. Several loci showed significant deviation from Hardy-Weinberg equilibrium, which may be evidence of chromosomal rings. The plastid microsatellite markers identified one to seven haplotypes per population. The transferability of these markers was confirmed in all 11 taxa within Oenothera sect. Calylophus. The microsatellite loci characterized here are the first developed and tested in Oenothera sect. Calylophus. These markers will be used to assess whether pollinator foraging distance influences population genetic parameters in predictable ways.
What has been learned from mouse models of the Fragile X Premutation and Fragile X-associated tremor/ataxia syndrome?

PubMed

Foote, Molly M; Careaga, Milo; Berman, Robert F

2016-08-01

This review describes how research using mouse models developed to study the Fragile X premutation (PM) and Fragile X-associated tremor/ataxia syndrome (FXTAS) has contributed to understanding these disorders. PM carriers bear an expanded CGG trinucleotide repeat in the Fragile X Mental Retardation 1 (FMR1) gene and are at risk for developing the late-onset neurodegenerative disorder FXTAS. Much has been learned about these genetic disorders from the development and study of mouse models. This includes new insights into the early cellular and molecular events that occur in PM carriers and in FXTAS, the presence of multiorgan pathology beyond the CNS, immunological dysregulation, unexpected synthesis of a potentially toxic peptide in FXTAS (i.e., FMRpolyG), and evidence that the disease process may be halted or reversed by appropriate molecular therapies given early in the course of disease.
Microsatellite primers for a species of South African everlasting daisy (Helichrysum odoratissimum; Gnaphalieae, Asteraceae)

PubMed Central

Glennon, Kelsey L.; Cron, Glynis V.

2016-01-01

Premise of the study: Microsatellites were developed for the widespread Helichrysum odoratissimum (Asteraceae) to estimate gene flow across diploid populations and to test if gene flow occurs among other closely related lineages within this genus. Methods and Results: Ten primer pairs were developed and tested using populations across South Africa; however, only seven primer pairs were polymorphic for the target species. The seven polymorphic primers amplified di- and trinucleotide repeats with up to 16 alleles per locus among 125 diploid individuals used for analyses. Conclusions: These markers can be used to estimate gene flow among populations of known ploidy level of H. odoratissimum to test evolutionary hypotheses. Furthermore, these markers amplify successfully in other Helichrysum species, including the other three taxonomic Group 4 species, and therefore can be used to inform taxonomic work on these species. PMID:27213125
Characterization of 21 microsatellite markers from cogongrass, Imperata cylindrica (Poaceae), a weed species distributed worldwide.

PubMed

Chiang, Yu-Chung; Tsai, Chi-Chu; Hsu, Tsai-Wen; Chou, Chang-Hung

2012-11-01

Microsatellite loci were developed from Imperata cylindrica, a traditional medicinal herb in Asia and among the top 10 worst invasive weeds in the world, to aid in the identification of the limits of asexual clonal individuals. A total of 21 microsatellite markers, including 18 polymorphic and three monomorphic loci, were developed from I. cylindrica using a magnetic bead enrichment protocol. The primers amplified dinucleotide, trinucleotide, and complex repeats. The number of alleles ranged from one to 19 per locus, with an observed heterozygosity ranging from 0.09 to 1.00. Several loci deviated significantly from the within-population Hardy-Weinberg equilibrium as a result of asexual clonal reproduction. These polymorphic markers should be useful tools in further studies on the identification of the range of clonal reproduction units and the selection and classification of the medicinal cultivar.
Molecular mechanisms of fragile X syndrome: a twenty-year perspective.

PubMed

Santoro, Michael R; Bray, Steven M; Warren, Stephen T

2012-01-01

Fragile X syndrome (FXS) is a common form of inherited intellectual disability and is one of the leading known causes of autism. The mutation responsible for FXS is a large expansion of the trinucleotide CGG repeat in the 5' untranslated region of the X-linked gene FMR1. This expansion leads to DNA methylation of FMR1 and to transcriptional silencing, which results in the absence of the gene product, FMRP, a selective messenger RNA (mRNA)-binding protein that regulates the translation of a subset of dendritic mRNAs. FMRP is critical for mGluR (metabotropic glutamate receptor)-dependent long-term depression, as well as for other forms of synaptic plasticity; its absence causes excessive and persistent protein synthesis in postsynaptic dendrites and dysregulated synaptic function. Studies continue to refine our understanding of FMRP's role in synaptic plasticity and to uncover new functions of this protein, which have illuminated therapeutic approaches for FXS.
PABPN1 gene therapy for oculopharyngeal muscular dystrophy

PubMed Central

Malerba, A.; Klein, P.; Bachtarzi, H.; Jarmin, S. A.; Cordova, G.; Ferry, A.; Strings, V.; Espinoza, M. Polay; Mamchaoui, K.; Blumen, S. C.; St Guily, J. Lacau; Mouly, V.; Graham, M.; Butler-Browne, G.; Suhy, D. A.; Trollet, C.; Dickson, G.

2017-01-01

Oculopharyngeal muscular dystrophy (OPMD) is an autosomal dominant, late-onset muscle disorder characterized by ptosis, swallowing difficulties, proximal limb weakness and nuclear aggregates in skeletal muscles. OPMD is caused by a trinucleotide repeat expansion in the PABPN1 gene that results in an N-terminal expanded polyalanine tract in polyA-binding protein nuclear 1 (PABPN1). Here we show that the treatment of a mouse model of OPMD with an adeno-associated virus-based gene therapy combining complete knockdown of endogenous PABPN1 and its replacement by a wild-type PABPN1 substantially reduces the amount of insoluble aggregates, decreases muscle fibrosis, reverts muscle strength to the level of healthy muscles and normalizes the muscle transcriptome. The efficacy of the combined treatment is further confirmed in cells derived from OPMD patients. These results pave the way towards a gene replacement approach for OPMD treatment. PMID:28361972
Is There Convincing Evidence that Intermediate Repeats in the HTT Gene Cause Huntington's Disease?

PubMed

Oosterloo, Mayke; Van Belzen, Martine J; Bijlsma, Emilia K; Roos, Raymund A C

2015-01-01

Huntington's disease (HD) is a neurodegenerative disease associated with a CAG repeat expansion in the Huntingtin (HTT) gene. A trinucleotide repeat size between 27 and 35 is considered 'intermediate' and is not thought to cause symptoms and signs of HD. Some articles claim otherwise; however, publishing only the cases that have an HD phenotype introduces a significant publication bias. Our objective is to determine whether there is convincing evidence that intermediate alleles (IAs) cause HD. Previously published case reports on HTT intermediate repeat sizes and all cases from the Netherlands with an IA were reviewed for clinical symptoms and signs. Of ten cases reported in the literature, four patients had a clinical presentation of Huntington's disease and an IA. Between 2001 and 2012, 1,690 patients were tested for HD in the Netherlands. One case out of 60 with an IA had a phenotype resembling HD, but had already been published in a case report. Given the high background frequency of intermediate alleles in several populations, the possibility of developing HD would have huge implications for 1-7% of the normal population. It is possible that IAs present as an endophenotype with the potential of subsequent clinical manifestations. However, given the scarcity of convincing cases, the lack of convincing biological evidence for pathogenicity of intermediate alleles, and many genes still to be discovered for HD mimics, we find that it is premature to claim that IAs can cause HD. We recommend systematic follow-up of this group of individuals and, if possible, brain pathology for confirmation or exclusion of HD.
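The 27 to 35 repeat window quoted in this abstract can be made concrete with a short sketch. It assumes the repeat size is simply the longest uninterrupted run of CAG units in a sequence, which ignores interruptions and is not how clinical fragment sizing works. The function names `longest_cag_run` and `classify_htt_allele` are hypothetical, and only the intermediate range comes from the abstract.

```python
import re


def longest_cag_run(seq):
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_htt_allele(n_repeats):
    """Label a repeat count relative to the 27-35 'intermediate' window."""
    if 27 <= n_repeats <= 35:
        return "intermediate"
    if n_repeats < 27:
        return "below intermediate range"
    return "above intermediate range"


# Hypothetical allele: 30 CAG units followed by an interrupting CAA.
example = "TT" + "CAG" * 30 + "CAACAG"
label = classify_htt_allele(longest_cag_run(example))
```

Labels outside the window are deliberately neutral, because the abstract does not state the thresholds for other allele classes.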
Identification, validation and cross-species transferability of novel Lavandula EST-SSRs.

PubMed

Adal, Ayelign M; Demissie, Zerihun A; Mahmoud, Soheil S

2015-04-01

We identified and characterized EST-SSRs with strong discrimination power against Lavandula angustifolia and Lavandula x intermedia. The markers also showed a considerable cross-species transferability rate into six related Lavandula species. Lavenders (Lavandula) are economically important crops grown around the globe for essential oil production. In an attempt to develop genetic markers for these plants, we analyzed over 13,000 unigenes developed from L. angustifolia and L. x intermedia EST databases, and identified 3,459 simple sequence repeats (SSRs), which were dominated by trinucleotides (41.2%) and dinucleotides (31.45%). Approximately 19% of the unigenes contained at least one SSR marker, over 60% of which were localized in the UTRs. Only 252 EST-SSRs were 18 bp or longer, from which 31 loci were validated, and 24 amplified discrete fragments with 85% polymorphism in L. x intermedia and L. angustifolia. The average numbers of alleles in L. x intermedia and L. angustifolia were 3.42 and 3.71 per marker, with average PIC values of 0.47 and 0.52, respectively. These values suggest a moderate to strong level of informativeness for the markers, with some loci producing unique fingerprints. The cross-species transferability rate of the markers ranged from 50 to 100% across eight species. The utility of these markers was assessed in eight Lavandula species and 15 L. angustifolia and L. x intermedia cultivars, and the dendrogram deduced from their similarity indexes successfully delineated the species into their respective sections and the cultivars into their respective species. These markers have potential for application in fingerprinting, diversity studies and marker-assisted breeding of Lavandula.
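The PIC values reported above summarize how informative a marker is. As a sketch, the following assumes the standard polymorphism information content formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not say which formula was used, and the function name `pic` and the example frequencies are hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies summing to 1."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings in which both parents carry the same heterozygous pair.
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1 - homozygosity - cross_term


# Hypothetical marker with two equally frequent alleles.
two_allele_pic = pic([0.5, 0.5])
```

A monomorphic locus gives 0, and a locus with two equally frequent alleles gives 0.375, so averages near 0.5 require more than two alleles at many loci.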
Altered Ca2+ signaling in skeletal muscle fibers of the R6/2 mouse, a model of Huntington’s disease

PubMed Central

Braubach, Peter; Orynbayev, Murat; Andronache, Zoita; Hering, Tanja; Landwehrmeyer, Georg Bernhard; Lindenberg, Katrin S.

2014-01-01

Huntington’s disease (HD) is caused by an expanded CAG trinucleotide repeat within the gene encoding the protein huntingtin. The resulting elongated glutamine (poly-Q) sequence of mutant huntingtin (mhtt) affects both central neurons and skeletal muscle. Recent reports suggest that ryanodine receptor–based Ca2+ signaling, which is crucial for skeletal muscle excitation–contraction coupling (ECC), is changed by mhtt in HD neurons. Consequently, we searched for alterations of ECC in muscle fibers of the R6/2 mouse, a mouse model of HD. We performed fluorometric recordings of action potentials (APs) and cellular Ca2+ transients on intact isolated toe muscle fibers (musculi interossei), and measured L-type Ca2+ inward currents on internally dialyzed fibers under voltage-clamp conditions. Both APs and AP-triggered Ca2+ transients showed slower kinetics in R6/2 fibers than in fibers from wild-type mice. Ca2+ removal from the myoplasm and Ca2+ release flux from the sarcoplasmic reticulum were characterized using a Ca2+ binding and transport model, which indicated a significant reduction in slow Ca2+ removal activity and Ca2+ release flux both after APs and under voltage-clamp conditions. In addition, the voltage-clamp experiments showed a highly significant decrease in L-type Ca2+ channel conductance. These results indicate profound changes of Ca2+ turnover in skeletal muscle of R6/2 mice and suggest that these changes may be associated with muscle pathology in HD. PMID:25348412
Structural Studies of the Tandem Tudor Domains of Fragile X Mental Retardation Related Proteins FXR1 and FXR2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Adams-Cioaba, Melanie A.; Guo, Yahong; Bian, ChuanBing

Expansion of the CGG trinucleotide repeat in the 5'-untranslated region of the fragile X mental retardation 1 (FMR1) gene results in suppression of protein expression for this gene and is the underlying cause of Fragile X syndrome. In unaffected individuals, the FMRP protein, together with two additional paralogues (Fragile X Mental Retardation Syndrome-related Protein 1 and 2), associates with mRNA to form a ribonucleoprotein complex in the nucleus that is transported to dendrites and spines of neuronal cells. It is thought that the fragile X family of proteins contributes to the regulation of protein synthesis at sites where mRNAs are locally translated in response to stimuli. Here, we report the X-ray crystal structures of the non-canonical nuclear localization signals of the FXR1 and FXR2 autosomal paralogues of FMRP, which were determined at 2.50 and 1.92 Å, respectively. The nuclear localization signals of FXR1 and FXR2 comprise tandem Tudor domain architectures, closely resembling that of UHRF1, which is proposed to bind methylated histone H3K9. The FMRP, FXR1 and FXR2 proteins comprise a small family of highly conserved proteins that appear to be important in translational regulation, particularly in neuronal cells. The crystal structures of the N-terminal tandem Tudor domains of FXR1 and FXR2 revealed a conserved architecture with that of FMRP. Biochemical analysis of the tandem Tudor domains reveals their ability to preferentially recognize trimethylated peptides in a sequence-specific manner.
[Mutation Analysis of 19 STR Loci in 20 723 Cases of Paternity Testing].

PubMed

Bi, J; Chang, J J; Li, M X; Yu, C Y

2017-06-01

To observe and analyze the confirmed cases of paternity testing, and to explore the mutation rules of STR loci. The mutant STR loci were screened from 20 723 confirmed cases of paternity testing by the Goldeneye 20A system. The mutation rates, and the sources, fragment length, steps and increased or decreased repeat sequences of mutant alleles were counted for the analysis of the characteristics of mutation-related factors. A total of 548 mutations were found on 19 STR loci, and 557 mutation events were observed. The loci mutation rate was 0.07‰-2.23‰. The ratio of paternal to maternal mutant events was 3.06:1. One-step mutation was the main mutation type, and the number of increased repeat sequences was almost the same as the number of decreased repeat sequences. The repeat sequences were more likely to decrease in mutations of two steps and above. Mutation mainly occurred in medium alleles, in which the number of increased repeat sequences was almost the same as the number of decreased repeat sequences. In long allele mutations, decreased repeat sequences were significantly more frequent than increased repeat sequences. The number of increased repeat sequences was almost the same as the number of decreased repeat sequences in paternal mutations, while decreased repeat sequences outnumbered increased ones in maternal mutations. There are significant differences in the mutation rate of each locus. When one or two loci do not conform to the genetic law, other detection systems should be added, and the PI value should be calculated in combination with the information on the mutant STR loci in order to further clarify the identification opinions. Copyright© by the Editorial Department of Journal of Forensic Medicine
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Sequences characterization of microsatellite DNA sequences in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
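The "minimum of five repeats" criterion used in this abstract can be sketched as a simple scan for perfect tandem repeats. This is an illustration only: it assumes unit lengths of 2 to 6 bp, handles perfect repeats but not the imperfect or compound classes, and the function name `find_microsatellites` and its parameters are hypothetical rather than taken from the study.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-bp unit followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif,
            # e.g. "CACA" (already reported as "CA") or "AA".
            if any(unit == unit[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical clone insert containing a (CA)8 array.
example_hits = find_microsatellites("GGT" + "CA" * 8 + "TTG")
```

Counting copies per hit allows a length distribution (for example, the share of arrays under 20 repeats) to be tabulated directly from the returned tuples.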
Phylogenetic tree construction using trinucleotide usage profile (TUP).

PubMed

Chen, Si; Deng, Lih-Yuan; Bowman, Dale; Shiau, Jyh-Jen Horng; Wong, Tit-Yee; Madahian, Behrouz; Lu, Henry Horng-Shing

2016-10-06

It has been a challenging task to build a genome-wide phylogenetic tree for a large group of species containing a large number of genes with long nucleotide sequences. The most popular method, called feature frequency profile (FFP-k), finds the frequency distribution for all words of a certain length k over the whole genome sequence using (overlapping) windows of the same length. For a satisfactory result, the recommended word length (k) ranges from 6 to 15 and it may not be a multiple of 3 (codon length). The total number of possible words needed for FFP-k can range from 4^6 = 4096 to 4^15. We propose a simple improvement over the popular FFP method using only a typical word length of 3. A new method, called Trinucleotide Usage Profile (TUP), is proposed based only on the (relative) frequency distribution using non-overlapping windows of length 3. The total number of possible words needed for TUP is 4^3 = 64, which is much less than the total count for the recommended optimal "resolution" for FFP. To build a phylogenetic tree, we propose first representing each of the species by a TUP vector and then using an appropriate distance measure between pairs of the TUP vectors for the tree construction. In particular, we propose summarizing a DNA sequence by a matrix of three rows corresponding to three reading frames, recording the frequency distribution of the non-overlapping words of length 3 in each reading frame. We also provide a numerical measure for comparing trees constructed with various methods. Compared to the FFP method, our empirical study showed that the proposed TUP method is more capable of building phylogenetic trees with stronger biological support. We further provide some justification for this from the information theory viewpoint. Unlike the FFP method, the TUP method takes advantage of the fact that the start of the first reading frame is (usually) known. 
Without this information, the FFP method could only rely on the frequency distribution of overlapping words, which is the average (or mixture) of the frequency distributions of three possible reading frames. Consequently, we show (from the entropy viewpoint) that the FFP procedure could dilute important gene information and therefore provides less accurate classification.
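The three-row matrix described in this abstract can be sketched as follows. This is a minimal illustration, assuming the sequence starts at the first position of reading frame 0. The names `tup_matrix` and `manhattan_distance` are hypothetical, and the Manhattan distance is only a stand-in for the "appropriate distance measure", which the abstract does not specify.

```python
from collections import Counter
from itertools import product

# All 4^3 = 64 trinucleotides in a fixed order (column order of the matrix).
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]


def tup_matrix(seq):
    """Return a 3 x 64 matrix of relative trinucleotide frequencies per frame."""
    seq = seq.upper()
    matrix = []
    for frame in range(3):
        # Non-overlapping words of length 3, starting at offset `frame`.
        words = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        counts = Counter(w for w in words if set(w) <= set("ACGT"))
        total = sum(counts.values())
        matrix.append([counts[c] / total if total else 0.0 for c in CODONS])
    return matrix


def manhattan_distance(a, b):
    """Sum of absolute differences between two TUP matrices (assumed metric)."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


profile = tup_matrix("ATGATGATG")
```

A pairwise distance matrix built this way could then be passed to any standard tree-building routine. Keeping the three frames as separate rows is what distinguishes this profile from a single overlapping-window frequency vector.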
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

PubMed

Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

2013-01-30

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

PubMed Central

2013-01-01

Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705

Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

PubMed Central

Glunčić, Matko; Paar, Vladimir

2013-01-01

The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of a complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is particularly convenient in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). PMID:22977183
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. 
The numbers of potential functional sites varied pronouncedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Pathogenic Role of Low Range Repeats in SCA17.

PubMed

Shin, Jung Hwan; Park, Hyeyoung; Ehm, Gwan Hee; Lee, Woong Woo; Yun, Ji Young; Kim, Young Eun; Lee, Jee-Young; Kim, Han-Joon; Kim, Jong-Min; Jeon, Beom Seok; Park, Sung-Sup

2015-01-01

SCA17 is an autosomal dominant cerebellar ataxia with expansion of the CAG/CAA trinucleotide repeats in the TATA-binding protein (TBP) gene. SCA17 can have various clinical presentations including parkinsonism, ataxia, chorea and dystonia. SCA17 is diagnosed by detecting the expanded CAG repeats in the TBP gene; however, in the literature, pathologic repeat numbers as low as 41 overlap with normal repeat numbers. The subjects in this study included patients with involuntary movement disorders such as cerebellar ataxia, parkinsonism, chorea and dystonia who visited Seoul National University Hospital between Jan. 2006 and Apr. 2014 and were screened for SCA17. Those who were diagnosed with other genetic diseases or nondegenerative diseases were excluded. DNA from healthy subjects who did not have a family history of parkinsonism, ataxia, psychiatric symptoms, chorea or dystonia served as the control. In total, 5242 chromosomes from 2099 patients and 522 normal controls were analyzed. The total number of patients included in the analysis was 2099 (parkinsonism, 1706; ataxia, 345; chorea, 37; and dystonia, 11). In the normal controls, up to 44 repeats were found. In the 44-repeat group, there were 7 (0.3%) patients and 1 (0.2%) normal control. In the 43-repeat group, there were 8 (0.4%) patients and 2 (0.4%) normal controls. In the 42-repeat group, there were 16 (0.8%) patients and 3 (0.6%) normal controls. In the 41-repeat group, there were 48 (2.3%) patients and 8 (1.5%) normal controls. Considering the overlaps and non-significant differences in allelic frequencies between the patients and the normal controls with low expansions, we could not determine a definitive cutoff value for the pathologic CAG repeat number of SCA17. 
Because the statistical analysis between the normal controls and patients with low range expansions failed to show any differences so far, we must consider that clinical cases with low range expansions could be idiopathic movement disorders showing coincidental CAG/CAA expansions. Thus, we need to reconsider the pathologic role of low range expansions (41-42). Long term follow up and comprehensive investigations using autopsy and imaging studies in patients and controls with low range expansions are necessary to determine the cutoff value for the pathologic CAG repeat number of SCA17.
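The carrier frequencies quoted in this abstract can be recomputed from the reported counts. The sketch below is only an illustration, not the authors' analysis: it assumes the counts are per individual (2099 patients, 522 controls), and it applies a two-sided Fisher exact test as a stand-in, since the abstract does not name the statistical test that was used. The function and variable names are hypothetical.

```python
from math import comb

# Counts taken from the abstract: repeat size -> (patients, controls) carrying it.
PATIENTS_TOTAL = 2099
CONTROLS_TOTAL = 522
CARRIERS = {44: (7, 1), 43: (8, 2), 42: (16, 3), 41: (48, 8)}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def summarize():
    """Return (repeats, % of patients, % of controls, p-value) per repeat size."""
    rows = []
    for repeats, (pat, con) in sorted(CARRIERS.items(), reverse=True):
        p = fisher_exact_two_sided(pat, PATIENTS_TOTAL - pat, con, CONTROLS_TOTAL - con)
        rows.append((repeats, 100 * pat / PATIENTS_TOTAL, 100 * con / CONTROLS_TOTAL, p))
    return rows


if __name__ == "__main__":
    for repeats, pct_pat, pct_con, p in summarize():
        print(f"{repeats} repeats: patients {pct_pat:.1f}%, controls {pct_con:.1f}%, p = {p:.2f}")
```

Under these assumptions the percentages round to the values given in the abstract, and none of the four comparisons reaches p < 0.05, which is consistent with the reported lack of a significant difference.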
Specific inhibition of protein synthesis in the rabbit reticulocyte lysate by two types of oligoribonucleotides

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wagner, T.; Gross, M.; Sigler, P.B.

1986-05-01

The oligonucleotides AUG, AUGG and AUGA, i.e. homologues of the initiation codon, are recognized as initiation sites by the protein synthetic machinery of the reticulocyte lysate. They induce the accumulation of inactive initiation complexes (80S x AUG x Met-tRNA₁ᴹᵉᵗ) and thereby deprive the system of active ribosomes. The only dinucleotide that inhibits protein synthesis is CA. CA and all trinucleotides of the form XCA and CAX (where X = U, C, A or G) block chain elongation at a level of 10⁻⁵ M. Interestingly, inhibition by XCA is transient, while that by CAX becomes progressively greater with time. This phenomenon can be explained by a 3' exonucleolytic activity in the lysate. Upon 3'-terminal cleavage the XCA trinucleotides will lose the inhibitory CA, whereas CAX trinucleotides will simply be converted to CA, the specific inhibitor. This has been confirmed experimentally, since CC[³H]A is completely hydrolyzed to CpC and p[³H]A after 15 minutes of incubation. The mode of action of CA, while unclear, may be mediated by its similarity to the 3'-terminus of tRNA.
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M.incognita, M.hapla and M.arenaria, respectively. Nucleotide sequences of the M.incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M.javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M.javanica and the different host races of M.incognita, M.hapla and M.arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
Variation, Repetition, And Choice

PubMed Central

Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
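The two terminal-link contingencies described in this abstract can be written as simple rules. The following is a minimal sketch, not the authors' procedure: the response labels "L" and "R", the function names, and the choice to store every emitted sequence in the comparison window (reinforced or not) are assumptions made for illustration.

```python
from collections import deque


def make_repeat_contingency(target):
    """REPEAT: only the single target sequence of responses is reinforced."""
    target = tuple(target)

    def is_reinforced(sequence):
        return tuple(sequence) == target

    return is_reinforced


def make_vary_contingency(lag):
    """VARY: a sequence is reinforced only if it differs from the `lag` previous ones."""
    recent = deque(maxlen=lag)  # sliding window holding the last `lag` sequences

    def is_reinforced(sequence):
        sequence = tuple(sequence)
        meets_criterion = sequence not in recent
        recent.append(sequence)  # assumption: every sequence enters the window
        return meets_criterion

    return is_reinforced


if __name__ == "__main__":
    vary = make_vary_contingency(lag=2)
    for seq in ["LLRR", "LLRR", "RRLL", "LRLR", "LLRR"]:
        print(seq, vary(seq))
```

With a lag of 2, repeating "LLRR" immediately fails the criterion, but the same sequence meets it again once two different sequences have intervened, which shows why larger lag values demand more sequence variation.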
Huntington's disease accelerates epigenetic aging of human brain and disrupts DNA methylation levels.

PubMed

Horvath, Steve; Langfelder, Peter; Kwak, Seung; Aaronson, Jeff; Rosinski, Jim; Vogt, Thomas F; Eszes, Marika; Faull, Richard L M; Curtis, Maurice A; Waldvogel, Henry J; Choi, Oi-Wa; Tung, Spencer; Vinters, Harry V; Coppola, Giovanni; Yang, X William

2016-07-01

Age of Huntington's disease (HD) motoric onset is strongly related to the number of CAG trinucleotide repeats in the huntingtin gene, suggesting that biological tissue age plays an important role in disease etiology. Recently, a DNA methylation based biomarker of tissue age has been advanced as an epigenetic aging clock. We sought to inquire if HD is associated with an accelerated epigenetic age. DNA methylation data was generated for 475 brain samples from various brain regions of 26 HD cases and 39 controls. Overall, brain regions from HD cases exhibit a significant epigenetic age acceleration effect (p=0.0012). A multivariate model analysis suggests that HD status increases biological age by 3.2 years. Accelerated epigenetic age can be observed in specific brain regions (frontal lobe, parietal lobe, and cingulate gyrus). After excluding controls, we observe a negative correlation (r=-0.41, p=5.5×10⁻⁸) between HD gene CAG repeat length and the epigenetic age of HD brain samples. Using correlation network analysis, we identify 11 co-methylation modules with a significant association with HD status across 3 broad cortical regions. In conclusion, HD is associated with an accelerated epigenetic age of specific brain regions and more broadly with substantial changes in brain methylation levels.
EMQN/CMGS best practice guidelines for the molecular genetic testing of Huntington disease.

PubMed

Losekoot, Monique; van Belzen, Martine J; Seneca, Sara; Bauer, Peter; Stenhouse, Susan A R; Barton, David E

2013-05-01

Huntington disease (HD) is caused by the expansion of an unstable polymorphic trinucleotide (CAG)n repeat in exon 1 of the HTT gene, which translates into an extended polyglutamine tract in the protein. Laboratory diagnosis of HD involves estimation of the number of CAG repeats. Molecular genetic testing for HD is offered in a wide range of laboratories both within and outside the European community. In order to measure the quality and raise the standard of molecular genetic testing in these laboratories, the European Molecular Genetics Quality Network has organized a yearly external quality assessment (EQA) scheme for molecular genetic testing of HD for over 10 years. EQA compares a laboratory's output with a fixed standard both for genotyping and reporting of the results to the referring physicians. In general, the standard of genotyping is very high but the clarity of interpretation and reporting of the test result varies more widely. This emphasizes the need for best practice guidelines for this disorder. We have therefore developed these best practice guidelines for genetic testing for HD to assist in testing and reporting of results. The analytical methods and the potential pitfalls of molecular genetic testing are highlighted and the implications of the different test outcomes for the consultand and his or her family members are discussed.
Huntington's disease accelerates epigenetic aging of human brain and disrupts DNA methylation levels

PubMed Central

Horvath, Steve; Langfelder, Peter; Kwak, Seung; Aaronson, Jeff; Rosinski, Jim; Vogt, Thomas F.; Eszes, Marika; Faull, Richard L.M.; Curtis, Maurice A.; Waldvogel, Henry J.; Choi, Oi-Wa; Tung, Spencer; Vinters, Harry V.; Coppola, Giovanni; Yang, X. William

2016-01-01

Age of Huntington's disease (HD) motoric onset is strongly related to the number of CAG trinucleotide repeats in the huntingtin gene, suggesting that biological tissue age plays an important role in disease etiology. Recently, a DNA methylation based biomarker of tissue age has been advanced as an epigenetic aging clock. We sought to inquire if HD is associated with an accelerated epigenetic age. DNA methylation data was generated for 475 brain samples from various brain regions of 26 HD cases and 39 controls. Overall, brain regions from HD cases exhibit a significant epigenetic age acceleration effect (p=0.0012). A multivariate model analysis suggests that HD status increases biological age by 3.2 years. Accelerated epigenetic age can be observed in specific brain regions (frontal lobe, parietal lobe, and cingulate gyrus). After excluding controls, we observe a negative correlation (r=−0.41, p=5.5×10⁻⁸) between HD gene CAG repeat length and the epigenetic age of HD brain samples. Using correlation network analysis, we identify 11 co-methylation modules with a significant association with HD status across 3 broad cortical regions. In conclusion, HD is associated with an accelerated epigenetic age of specific brain regions and more broadly with substantial changes in brain methylation levels. PMID:27479945
Fragile X Syndrome

PubMed Central

Tassone, Flora; González-Teshima, Laura Yuriko; Forero-Forero, Jose Vicente; Ayala-Zapata, Sebastián; Hagerman, Randi

2014-01-01

Fragile X Syndrome (FXS) is a genetic disease due to a CGG trinucleotide expansion, named full mutation (greater than 200 CGG repeats), in the fragile X mental retardation 1 gene at locus Xq27.3, which leads to a hypermethylated region in the gene promoter, thereby silencing it and lowering the expression levels of the fragile X mental retardation 1 protein, which is involved in synaptic plasticity and maturation. Individuals with FXS present with intellectual disability, autism, hyperactivity, long face, large or prominent ears and macroorchidism at puberty and thereafter. Most of the young children with FXS will present with language delay, sensory hyperarousal and anxiety. Girls are less affected than boys; only 25% have intellectual disability. Given the genomic features of the syndrome, there are patients with a number of triplet repeats between 55 and 200, known as premutation carriers. Most carriers have a normal IQ but some have developmental problems. The diagnosis of FXS has evolved from karyotype with special culture medium to molecular techniques that are more sensitive and specific, including PCR and Southern blot. During the last decade, advances in the knowledge of FXS have led to the development of investigations on pharmaceutical management or targeted treatments for FXS. Minocycline and sertraline have shown efficacy in children. PMID:25767309
Fork stalling and template switching as a mechanism for polyalanine tract expansion affecting the DYC mutant of HOXD13, a new murine model of synpolydactyly.

PubMed

Cocquempot, Olivier; Brault, Véronique; Babinet, Charles; Herault, Yann

2009-09-01

Polyalanine expansion diseases are proposed to result from unequal crossover of sister chromatids that increases the number of repeats. In this report we suggest an alternative mechanism we put forward while we investigated a new spontaneous mutant that we named "Dyc" for "Digit in Y and Carpe" phenotype. Phenotypic analysis revealed an abnormal limb patterning similar to that of the human inherited congenital disease synpolydactyly (SPD) and to the mouse mutant model Spdh. Both human SPD and mouse Spdh mutations affect the Hoxd13 gene within a 15-residue polyalanine-encoding repeat in the first exon of the gene, leading to a dominant negative HOXD13. Genetic analysis of the Dyc mutant revealed a trinucleotide expansion in the polyalanine-encoding region of the Hoxd13 gene resulting in a 7-alanine expansion. However, unlike the Spdh mutation, this expansion cannot result from a simple duplication of a short segment. Instead, we propose the fork stalling and template switching (FosTeS) described for generation of nonrecurrent genomic rearrangements as a possible mechanism for the Dyc polyalanine extension, as well as for other polyalanine expansions described in the literature and that could not be explained by unequal crossing over.
Fork Stalling and Template Switching As a Mechanism for Polyalanine Tract Expansion Affecting the DYC Mutant of HOXD13, a New Murine Model of Synpolydactyly

PubMed Central

Cocquempot, Olivier; Brault, Véronique; Babinet, Charles; Herault, Yann

2009-01-01

Polyalanine expansion diseases are proposed to result from unequal crossover of sister chromatids that increases the number of repeats. In this report we suggest an alternative mechanism we put forward while we investigated a new spontaneous mutant that we named “Dyc” for “Digit in Y and Carpe” phenotype. Phenotypic analysis revealed an abnormal limb patterning similar to that of the human inherited congenital disease synpolydactyly (SPD) and to the mouse mutant model Spdh. Both human SPD and mouse Spdh mutations affect the Hoxd13 gene within a 15-residue polyalanine-encoding repeat in the first exon of the gene, leading to a dominant negative HOXD13. Genetic analysis of the Dyc mutant revealed a trinucleotide expansion in the polyalanine-encoding region of the Hoxd13 gene resulting in a 7-alanine expansion. However, unlike the Spdh mutation, this expansion cannot result from a simple duplication of a short segment. Instead, we propose the fork stalling and template switching (FosTeS) described for generation of nonrecurrent genomic rearrangements as a possible mechanism for the Dyc polyalanine extension, as well as for other polyalanine expansions described in the literature and that could not be explained by unequal crossing over. PMID:19546318
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed

Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes show putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
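A tandem array of the GGGNATC unit described above can be located with a regular expression. This is a minimal sketch under stated assumptions, not the authors' pipeline: it scans only the given strand, treats N as any of A, C, G or T, requires exact tandem copies with no spacer, and uses a two-unit minimum because that is the smallest repeat number mentioned in the abstract. The function name is hypothetical.

```python
import re

UNIT_LENGTH = 7
UNIT = "GGG[ACGT]ATC"  # GGGNATC with N expanded to the four bases
# Two or more back-to-back copies of the unit.
TANDEM = re.compile(rf"(?:{UNIT}){{2,}}")


def find_g_rich_repeats(sequence):
    """Return (start, unit_count) for each tandem GGGNATC array on the given strand."""
    sequence = sequence.upper()
    return [(m.start(), len(m.group()) // UNIT_LENGTH) for m in TANDEM.finditer(sequence)]


if __name__ == "__main__":
    demo = "TT" + "GGGAATC" * 4 + "CC" + "GGGTATCGGGCATC" + "A"
    print(find_g_rich_repeats(demo))
```

A four-unit array such as the first hit in the demo string contains four consecutive G-tracts, the arrangement the abstract links to G-quadruplex formation. Scanning the reverse complement as well would be needed to cover both strands.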
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed Central

Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes show putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
Viral morphogenesis is the dominant source of sequence censorship in M13 combinatorial peptide phage display.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodi, D. J.; Soares, A. S.; Makowski, L.

Novel statistical methods have been developed and used to quantitate and annotate the sequence diversity within combinatorial peptide libraries on the basis of small numbers (1-200) of sequences selected at random from commercially available M13 p3-based phage display libraries. These libraries behave statistically as though they correspond to populations containing roughly 4.0±1.6% of the random dodecapeptides and 7.9±2.6% of the random constrained heptapeptides that are theoretically possible within the phage populations. Analysis of amino acid residue occurrence patterns shows no demonstrable influence on sequence censorship by Escherichia coli tRNA isoacceptor profiles or either overall codon or Class II codon usage patterns, suggesting no metabolic constraints on recombinant p3 synthesis. There is an overall depression in the occurrence of cysteine, arginine and glycine residues and an overabundance of proline, threonine and histidine residues. The majority of position-dependent amino acid sequence bias is clustered at three positions within the inserted peptides of the dodecapeptide library, +1, +3 and +12 downstream from the signal peptidase cleavage site. Conformational tendency measures of the peptides indicate a significant preference for inserts favoring a β-turn conformation. The observed protein sequence limitations can primarily be attributed to genetic codon degeneracy and signal peptidase cleavage preferences. These data suggest that for applications in which maximal sequence diversity is essential, such as epitope mapping or novel receptor identification, combinatorial peptide libraries should be constructed using codon-corrected trinucleotide cassettes within vector-host systems designed to minimize morphogenesis-related censorship.
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Known for the beneficial characteristics they endow upon their grass hosts, the identification of these endophyte species has been of great interest agronomically and scientifically. The use of simple sequence repeat loci and the variation in repeat elements has been used to rapidly identify endophyte species and strains, however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

PubMed

Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur

2006-02-01

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
Regions of conservation and divergence in the 3' untranslated sequences of genomic RNA from Ross River virus isolates.

PubMed

Faragher, S G; Dalgarno, L

1986-07-20

The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.
Inversions and inverted transpositions as the basis for an almost universal "format" of genome sequences.

PubMed

Albrecht-Buehler, Guenter

2007-09-01

In genome duplexes that exceed 100 kb the frequency distributions of their trinucleotides (triplet profiles) are the same in both strands. This remarkable symmetry, sometimes called Chargaff's second parity rule, is not the result of base pairing, but can be explained as the result of countless inversions and inverted transpositions that occurred throughout evolution (G. Albrecht-Buehler, 2006, Proc. Natl. Acad. Sci. USA 103, 17828-17833). Furthermore, comparing the triplet profiles of genomes from a large number of different taxa and species revealed that they were not only strand-symmetrical, but even surprisingly similar to one another (majority profile; G. Albrecht-Buehler, 2007, Genomics 89, 596-601). The present article proposes that the same inversion/transposition mechanism(s) that created the strand symmetry may also explain the existence of the majority profile. Thus they may be key factors in the creation of an almost universal "format" in which genome sequences are written. One may speculate that this universality of genome format may facilitate horizontal gene transfer and, thus, accelerate evolution.
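The strand symmetry of triplet profiles described above can be checked on any sequence. The sketch below is an illustration only, not the author's method: the function names and the asymmetry score (a summed absolute difference) are assumptions. It relies on the fact that the frequency of a triplet on the complementary strand, read 5' to 3', equals the frequency of its reverse complement on the given strand.

```python
from collections import Counter
from itertools import product

TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the complementary strand read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def triplet_profile(sequence):
    """Relative frequency of each of the 64 trinucleotides, using overlapping windows."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + 3] for i in range(len(sequence) - 2))
    total = sum(counts[t] for t in TRIPLETS)  # windows with non-ACGT symbols are ignored
    return {t: (counts[t] / total if total else 0.0) for t in TRIPLETS}


def strand_asymmetry(sequence):
    """Sum of |f(t) - f(reverse complement of t)|; 0 means identical profiles on both strands."""
    profile = triplet_profile(sequence)
    return sum(abs(profile[t] - profile[reverse_complement(t)]) for t in TRIPLETS)


if __name__ == "__main__":
    print(strand_asymmetry("ACGT"))  # ACG and CGT are reverse complements of each other
    print(strand_asymmetry("AAAA"))  # AAA only on one strand, TTT only on the other
```

A score near zero for a long genomic sequence would correspond to the strand symmetry the abstract reports for duplexes that exceed 100 kb; short sequences such as the two demo strings only illustrate the extremes.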

First Microsatellite Markers Developed from Cupuassu ESTs: Application in Diversity Analysis and Cross-Species Transferability to Cacao.

PubMed

Ferraz Dos Santos, Lucas; Moreira Fregapani, Roberta; Falcão, Loeni Ludke; Togawa, Roberto Coiti; Costa, Marcos Mota do Carmo; Lopes, Uilson Vanderlei; Peres Gramacho, Karina; Alves, Rafael Moyses; Micheli, Fabienne; Marcellino, Lucilia Helena

2016-01-01

The cupuassu tree (Theobroma grandiflorum) (Willd. ex Spreng.) Schum. is a fruitful species from the Amazon with great economic potential, due to the multiple uses of its fruit's pulp and seeds in the food and cosmetic industries, including the production of cupulate, an alternative to chocolate. In order to support the cupuassu breeding program and to select plants presenting both pulp/seed quality and fungal disease resistance, SSRs from Next Generation Sequencing ESTs were obtained and used in diversity analysis. From 8,330 ESTs, 1,517 contained one or more SSRs (1,899 SSRs identified). The most abundant motifs identified in the EST-SSRs were hepta- and trinucleotides, and they were found with a minimum and maximum of 2 and 19 repeats, respectively. From the 1,517 ESTs containing SSRs, 70 ESTs were selected based on their functional annotation, focusing on pulp and seed quality, as well as resistance to pathogens. The 70 ESTs selected contained 77 SSRs, among which 11 were polymorphic in cupuassu genotypes. These EST-SSRs were able to discriminate the cupuassu genotype in relation to resistance/susceptibility to witches' broom disease, as well as to pulp quality (SST/ATT values). Finally, we showed that these markers were transferable to cacao genotypes, and that genome availability might be used as a predictive tool for polymorphism detection and primer design useful for both Theobroma species. To our knowledge, this is the first report involving EST-SSRs from cupuassu and is also a pioneer in the analysis of marker transferability from cupuassu to cacao. Moreover, these markers might contribute to develop or saturate the cupuassu and cacao genetic maps, respectively.
Bicistronic CACNA1A Gene Expression in Neurons Derived from Spinocerebellar Ataxia Type 6 Patient-Induced Pluripotent Stem Cells

PubMed Central

Bavassano, Carlo; Eigentler, Andreas; Stanika, Ruslan; Obermair, Gerald J.; Boesch, Sylvia; Dechant, Georg

2017-01-01

Spinocerebellar ataxia type 6 (SCA6) is an autosomal-dominant neurodegenerative disorder that is caused by a CAG trinucleotide repeat expansion in the CACNA1A gene. As one of the few bicistronic genes discovered in the human genome, CACNA1A encodes not only the α1A subunit of the P/Q type voltage-gated Ca2+ channel CaV2.1 but also the α1ACT protein, a 75 kDa transcription factor sharing the sequence of the cytoplasmic C-terminal tail of the α1A subunit. Isoforms of both proteins contain the polyglutamine (polyQ) domain that is expanded in SCA6 patients. Although certain SCA6 phenotypes appear to be specific for Purkinje neurons, other pathogenic effects of the SCA6 polyQ mutation can affect a broad spectrum of central nervous system (CNS) neuronal subtypes. We investigated the expression and function of CACNA1A gene products in human neurons derived from induced pluripotent stem cells from two SCA6 patients. Expression levels of CACNA1A encoding α1A subunit were similar between SCA6 and control neurons, and no differences were found in the subcellular distribution of CaV2.1 channel protein. The α1ACT immunoreactivity was detected in the majority of cell nuclei of SCA6 and control neurons. Although no SCA6 genotype-dependent differences in CaV2.1 channel function were observed, they were found in the expression levels of the α1ACT target gene Granulin (GRN) and in glutamate-induced cell vulnerability. PMID:28946818
First Microsatellite Markers Developed from Cupuassu ESTs: Application in Diversity Analysis and Cross-Species Transferability to Cacao

PubMed Central

Ferraz dos Santos, Lucas; Moreira Fregapani, Roberta; Falcão, Loeni Ludke; Togawa, Roberto Coiti; Costa, Marcos Mota do Carmo; Lopes, Uilson Vanderlei; Peres Gramacho, Karina; Alves, Rafael Moyses

2016-01-01

The cupuassu tree (Theobroma grandiflorum) (Willd. ex Spreng.) Schum. is a fruitful species from the Amazon with great economic potential, due to the multiple uses of its fruit's pulp and seeds in the food and cosmetic industries, including the production of cupulate, an alternative to chocolate. In order to support the cupuassu breeding program and to select plants presenting both pulp/seed quality and fungal disease resistance, SSRs from Next Generation Sequencing ESTs were obtained and used in diversity analysis. From 8,330 ESTs, 1,517 contained one or more SSRs (1,899 SSRs identified). The most abundant motifs identified in the EST-SSRs were hepta- and trinucleotides, and they were found with a minimum and maximum of 2 and 19 repeats, respectively. From the 1,517 ESTs containing SSRs, 70 ESTs were selected based on their functional annotation, focusing on pulp and seed quality, as well as resistance to pathogens. The 70 ESTs selected contained 77 SSRs, among which 11 were polymorphic in cupuassu genotypes. These EST-SSRs were able to discriminate the cupuassu genotype in relation to resistance/susceptibility to witches’ broom disease, as well as to pulp quality (SST/ATT values). Finally, we showed that these markers were transferable to cacao genotypes, and that genome availability might be used as a predictive tool for polymorphism detection and primer design useful for both Theobroma species. To our knowledge, this is the first report involving EST-SSRs from cupuassu and is also a pioneer in the analysis of marker transferability from cupuassu to cacao. Moreover, these markers might contribute to develop or saturate the cupuassu and cacao genetic maps, respectively. PMID:26949967
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Microsatellites for Oenothera gayleana and O. hartwegii subsp. filifolia (Onagraceae), and their utility in section Calylophus

PubMed Central

Lewis, Emily M.; Fant, Jeremie B.; Moore, Michael J.; Hastings, Amy P.; Larson, Erica L.; Agrawal, Anurag A.; Skogen, Krissa A.

2016-01-01

Premise of the study: Eleven nuclear and four plastid microsatellite markers were screened for two gypsum endemic species, Oenothera gayleana and O. hartwegii subsp. filifolia, and tested for cross-amplification in the remaining 11 taxa within Oenothera sect. Calylophus (Onagraceae). Methods and Results: Microsatellite markers were tested in two to three populations spanning the ranges of both O. gayleana and O. hartwegii subsp. filifolia. The nuclear microsatellite loci consisted of both di- and trinucleotide repeats with one to 17 alleles per population. Several loci showed significant deviation from Hardy–Weinberg equilibrium, which may be evidence of chromosomal rings. The plastid microsatellite markers identified one to seven haplotypes per population. The transferability of these markers was confirmed in all 11 taxa within Oenothera sect. Calylophus. Conclusions: The microsatellite loci characterized here are the first developed and tested in Oenothera sect. Calylophus. These markers will be used to assess whether pollinator foraging distance influences population genetic parameters in predictable ways. PMID:26949578
Genetic mapping to 10q23.3-q24.2, in a large Italian pedigree, of a new syndrome showing bilateral cataracts, gastroesophageal reflux, and spastic paraparesis with amyotrophy.

PubMed Central

Seri, M; Cusano, R; Forabosco, P; Cinti, R; Caroli, F; Picco, P; Bini, R; Morra, V B; De Michele, G; Lerone, M; Silengo, M; Pela, I; Borrone, C; Romeo, G; Devoto, M

1999-01-01

We have recently observed a large pedigree with a new rare autosomal dominant spastic paraparesis. In three subsequent generations, 13 affected individuals presented with bilateral cataracts, gastroesophageal reflux with persistent vomiting, and spastic paraparesis with amyotrophy. Bilateral cataracts occurred in all affected individuals, with the exception of one patient who presented with a chorioretinal dystrophy, whereas clinical signs of spastic paraparesis showed a variable expressivity. Using a genomewide mapping approach, we mapped the disorder to the long arm of chromosome 10 on band q23.3-q24.2, in a 12-cM chromosomal region where additional neurologic disorders have been localized. The spectrum of phenotypic manifestations in this family is reminiscent of a smaller pedigree, reported recently, confirming the possibility of a new syndrome. Finally, the anticipation of symptoms suggests that an unstable trinucleotide repeat may be responsible for the condition. PMID:9973297
Fragile X syndrome: loss of local mRNA regulation alters synaptic development and function.

PubMed

Bassell, Gary J; Warren, Stephen T

2008-10-23

Fragile X syndrome is the most common inherited form of cognitive deficiency in humans and perhaps the best-understood single cause of autism. A trinucleotide repeat expansion, inactivating the X-linked FMR1 gene, leads to the absence of the fragile X mental retardation protein. FMRP is a selective RNA-binding protein that regulates the local translation of a subset of mRNAs at synapses in response to activation of Gp1 metabotropic glutamate receptors (mGluRs) and possibly other receptors. In the absence of FMRP, excess and dysregulated mRNA translation leads to altered synaptic function and loss of protein synthesis-dependent plasticity. Recent evidence indicates the role of FMRP in regulated mRNA transport in dendrites. New studies also suggest a possible local function of FMRP in axons that may be important for guidance, synaptic development, and formation of neural circuits. The understanding of FMRP function at synapses has led to rational therapeutic approaches.
Fragile X syndrome: mechanistic insights and therapeutic avenues regarding the role of potassium channels.

PubMed

Lee, Hye Young; Jan, Lily Yeh

2012-10-01

Fragile X syndrome (FXS) is a common form of mental disability and one of the known causes of autism. The mutation responsible for FXS is a large expansion of the trinucleotide CGG repeats that leads to DNA methylation of the fragile X mental retardation gene 1 (FMR1) and transcriptional silencing, resulting in the absence of fragile X mental retardation protein (FMRP), an mRNA binding protein. Although it is widely known that FMRP is critical for metabotropic glutamate receptor (mGluR)-dependent long-term depression (LTD), which has provided a general theme for developing pharmacological drugs for FXS, specific downstream targets of FMRP may also be of therapeutic value. Since alterations in potassium channel expression level or activity could underlie neuronal network defects in FXS, here we describe recent findings on how these channels might be altered in mouse models of FXS and the possible therapeutic avenues for treating FXS. Copyright © 2012 Elsevier Ltd. All rights reserved.
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
[Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

PubMed

Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

2015-04-01

This study aimed to explore the features of clustered regularly interspaced short palindromic repeats (CRISPR) structures in Shigella by using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that the CRISPRs existed in the four groups of Shigella, and the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same groups as their corresponding flanking sequences, and could be classified into two different types by their RNA secondary structures, which contain "stem" and "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
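The "mean +/- SD over all pairwise comparisons" statistic quoted above can be illustrated with a minimal Python sketch. It assumes the monomers have already been aligned to equal length, counts every mismatching column (including a gap character) as one difference, and uses the sample standard deviation; these choices and the function names are illustrative assumptions, not the authors' stated procedure.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_difference(a: str, b: str) -> float:
    """Percentage of aligned columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * diffs / len(a)


def pairwise_summary(seqs):
    """Mean and sample SD of percent difference over all sequence pairs.

    Needs at least three sequences so that two or more pairwise
    values exist for the standard deviation.
    """
    values = [percent_difference(a, b) for a, b in combinations(seqs, 2)]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Toy aligned "monomers", purely hypothetical data.
    toy = ["ACGTACGTAC", "ACGTACGTAA", "ACGAACGTAA"]
    m, sd = pairwise_summary(toy)
    print(f"{m:.1f} +/- {sd:.1f} % difference")
```

With real data the inputs would be the aligned ~280 bp monomers; the toy strings here only exercise the arithmetic.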
Increased Steady-State Mutant Huntingtin mRNA in Huntington's Disease Brain.

PubMed

Liu, Wanzhao; Chaurette, Joanna; Pfister, Edith L; Kennington, Lori A; Chase, Kathryn O; Bullock, Jocelyn; Vonsattel, Jean Paul G; Faull, Richard L M; Macdonald, Douglas; DiFiglia, Marian; Zamore, Phillip D; Aronin, Neil

2013-01-01

Huntington's disease is caused by expansion of CAG trinucleotide repeats in the first exon of the huntingtin gene, which is essential for both development and neurogenesis. Huntington's disease is autosomal dominant. The normal allele contains 6 to 35 CAG triplets (average, 18) and the mutant, disease-causing allele contains >36 CAG triplets (average, 42). We examined 279 postmortem brain samples, including 148 HD and 131 non-HD controls. A total of 108 samples from 87 HD patients that are heterozygous at SNP rs362307, with a normal allele (18 to 27 CAG repeats) and a mutant allele (39 to 73 CAG repeats) were used to measure relative abundance of mutant and wild-type huntingtin mRNA. We used allele-specific, quantitative RT-PCR based on SNP heterozygosity to estimate the relative amount of mutant versus normal huntingtin mRNA in postmortem brain samples from patients with Huntington's disease. In the cortex and striatum, the amount of mRNA from the mutant allele exceeds that from the normal allele in 75% of patients. In the cerebellum, no significant difference between the two alleles was evident. Brain tissues from non-HD controls show no significant difference between two alleles of huntingtin mRNAs. Allelic differences were more pronounced at early neuropathological grades (grades 1 and 2) than at late grades (grades 3 and 4). More mutant HTT than normal could arise from increased transcription of mutant HTT allele, or decreased clearance of mutant HTT mRNA, or both. An implication is that equimolar silencing of both alleles would increase the mutant HTT to normal HTT ratio.
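As a rough illustration of how the CAG triplet ranges quoted above could be applied to a sequence, the following Python sketch counts the longest uninterrupted CAG run and compares it with the abstract's ranges (normal 6 to 35, mutant >36). The function names are hypothetical, the sketch ignores interrupted repeats, and it is not a diagnostic procedure or the authors' genotyping method.

```python
import re


def longest_cag_run(seq: str) -> int:
    """Number of triplets in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_allele(n_cag: int) -> str:
    """Label a triplet count using only the ranges stated in the abstract."""
    if 6 <= n_cag <= 35:
        return "normal range"
    if n_cag > 36:
        return "expanded range"
    # Counts below 6, or exactly 36, are not assigned by the abstract.
    return "not covered by the abstract's ranges"


if __name__ == "__main__":
    # Hypothetical toy sequence: 18 CAG triplets, one CAA, two more CAG.
    toy = "TT" + "CAG" * 18 + "CAA" + "CAG" * 2
    n = longest_cag_run(toy)
    print(n, classify_allele(n))
```

Counts that the abstract does not assign to either range are reported as such instead of being guessed.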
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background: Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results: This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion: We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze CRISPR genome-wide in the extremely halophilic archaea whose whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were greatly conserved, and two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
Oncogene GAEC1 regulates CAPN10 expression which predicts survival in esophageal squamous cell carcinoma

PubMed Central

Chan, Dessy; Tsoi, Miriam Yuen-Tung; Liu, Christina Di; Chan, Sau-Hing; Law, Simon Ying-Kit; Chan, Kwok-Wah; Chan, Yuen-Piu; Gopalan, Vinod; Lam, Alfred King-Yin; Tang, Johnny Cheuk-On

2013-01-01

AIM: To identify the downstream regulated genes of GAEC1 oncogene in esophageal squamous cell carcinoma and their clinicopathological significance. METHODS: The anti-proliferative effect of knocking down the expression of GAEC1 oncogene was studied by using the RNA interference (RNAi) approach through transfecting the GAEC1-overexpressed esophageal carcinoma cell line KYSE150 with the pSilencer vector cloned with a GAEC1-targeted sequence, followed by MTS cell proliferation assay and cell cycle analysis using flow cytometry. RNA was then extracted from the parental, pSilencer-GAEC1-targeted sequence transfected and pSilencer negative control vector transfected KYSE150 cells for further analysis of different patterns in gene expression. Genes differentially expressed with suppressed GAEC1 expression were then determined using Human Genome U133 Plus 2.0 cDNA microarray analysis by comparing with the parental cells and normalized with the pSilencer negative control vector transfected cells. The most prominently regulated genes were then studied by immunohistochemical staining using tissue microarrays to determine their clinicopathological correlations in esophageal squamous cell carcinoma by statistical analyses. RESULTS: The RNAi approach of knocking down gene expression showed the effective suppression of GAEC1 expression in esophageal squamous cell carcinoma cell line KYSE150 that resulted in the inhibition of cell proliferation and increase of apoptotic population. cDNA microarray analysis for identifying differentially expressed genes detected the greatest levels of downregulation of calpain 10 (CAPN10) and upregulation of trinucleotide repeat containing 6C (TNRC6C) transcripts when GAEC1 expression was suppressed. 
At the tissue level, the high level expression of calpain 10 protein was significantly associated with longer patient survival (month) of esophageal squamous cell carcinoma compared to the patients with low level of calpain 10 expression (37.73 ± 16.33 vs 12.62 ± 12.44, P = 0.032). No significant correlation was observed between the TNRC6C protein expression level and the clinicopathological features of esophageal squamous cell carcinoma. CONCLUSION: GAEC1 regulates the expression of CAPN10 and TNRC6C downstream. Calpain 10 expression is a potential prognostic marker in patients with esophageal squamous cell carcinoma. PMID:23687414
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
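The position-specific observation above (variants differing from TTAGGG at positions 1 and 3) depends on comparing each hexamer with the canonical repeat position by position. A minimal Python sketch of that comparison follows; it assumes the read is already in frame with the repeat and only tallies single-mismatch hexamers. The function names and the toy read are hypothetical, and this is not the authors' read-extraction pipeline.

```python
from collections import Counter

CANONICAL = "TTAGGG"


def variant_positions(hexamer: str) -> list:
    """1-based positions at which a hexamer differs from TTAGGG."""
    if len(hexamer) != len(CANONICAL):
        raise ValueError("expected a 6-nucleotide repeat unit")
    return [i + 1 for i, (a, b) in enumerate(zip(hexamer.upper(), CANONICAL))
            if a != b]


def tally_single_mismatches(read: str) -> Counter:
    """Count mismatch positions over in-frame hexamers with one mismatch.

    Assumes the read starts at position 1 of a repeat unit; trailing
    bases that do not fill a complete hexamer are ignored.
    """
    counts = Counter()
    for start in range(0, len(read) - 5, 6):
        positions = variant_positions(read[start:start + 6])
        if len(positions) == 1:
            counts[positions[0]] += 1
    return counts


if __name__ == "__main__":
    # Toy read: three canonical repeats, then a position-3 and a
    # position-1 variant.
    toy_read = "TTAGGG" * 3 + "TTGGGG" + "GTAGGG"
    print(tally_single_mismatches(toy_read))
```

A real analysis would also need to handle frame detection, both strands, and sequencing errors, none of which this sketch attempts.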

Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Basal angiosperms, or Magnoliids, are an important clade of commercially important plants, which mainly include spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats or microsatellites are short stretches of DNA with repeat units of 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21%) followed by tetranucleotide repeats (130, 27.13%). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
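A screen of this kind can be sketched with regular expressions. The minimal Python example below searches for perfect mono- to hexanucleotide repeats; the minimum repeat counts (12, 6, 4, 3, 3, 3) are an assumption inferred from the minimum SSR lengths reported above (12 bp for mono- to tetra-, 15 bp for penta-, 18 bp for hexanucleotide repeats), and the function names are hypothetical. It is not the tool used in the study.

```python
import re

# Assumed thresholds: unit length -> minimum number of repeats.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d)
                   for d in range(1, k))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "AA" or "ATAT", already reported under the shorter unit.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequence with one mono-, one di- and one
    # trinucleotide repeat.
    toy = "CCG" + "A" * 12 + "CGC" + "AT" * 6 + "GCC" + "AAG" * 4 + "C"
    for start, motif, n in find_ssrs(toy):
        print(start, motif, n)
```

Dividing the total sequence length in kb by the number of hits gives a density figure of the form quoted in the abstract (kb per SSR).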
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

PubMed

Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

1999-10-01

This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology between one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammalians. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

PubMed

Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

1997-12-01

Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Epigenetics of Huntington's Disease.

PubMed

Bassi, Silvia; Tripathi, Takshashila; Monziani, Alan; Di Leva, Francesca; Biagioli, Marta

2017-01-01

Huntington's disease (HD) is a genetic, fatal autosomal dominant neurodegenerative disorder typically occurring in midlife with symptoms ranging from chorea, to dementia, to personality disturbances (Philos Trans R Soc Lond Ser B Biol Sci 354:957-961, 1999). HD is inherited in a dominant fashion, and the underlying mutation in all cases is a CAG trinucleotide repeat expansion within exon 1 of the HD gene (Cell 72:971-983, 1993). The expanded CAG repeat, translated into a lengthened glutamine tract at the amino terminus of the huntingtin protein, affects its structural properties and functional activities. The effects are pleiotropic, as huntingtin is broadly expressed in different cellular compartments (i.e., cytosol, nucleus, mitochondria) as well as in all cell types of the body at all developmental stages, such that HD pathogenesis likely starts at conception and is a lifelong process (Front Neurosci 9:509, 2015). The rate-limiting mechanism(s) of neurodegeneration in HD still remains elusive: many different processes are commonly disrupted in HD cell lines and animal models, as well as in HD patient cells (Eur J Neurosci 27:2803-2820, 2008); however, epigenetic-chromatin deregulation, as determined by the analysis of DNA methylation, histone modifications, and noncoding RNAs, has now become a prevailing feature. Thus, the overarching goal of this chapter is to discuss the current status of the literature, reviewing how an aberrant epigenetic landscape can contribute to altered gene expression and neuronal dysfunction in HD.
Increased autophagy and apoptosis contribute to muscle atrophy in a myotonic dystrophy type 1 Drosophila model

PubMed Central

Bargiela, Ariadna; Cerro-Herreros, Estefanía; Fernandez-Costa, Juan M.; Vilchez, Juan J.; Llamusi, Beatriz; Artero, Ruben

2015-01-01

Muscle mass wasting is one of the most debilitating symptoms of myotonic dystrophy type 1 (DM1) disease, ultimately leading to immobility, respiratory defects, dysarthria, dysphagia and death in advanced stages of the disease. In order to study the molecular mechanisms leading to the degenerative loss of adult muscle tissue in DM1, we generated an inducible Drosophila model of expanded CTG trinucleotide repeat toxicity that resembles an adult-onset form of the disease. Heat-shock induced expression of 480 CUG repeats in adult flies resulted in a reduction in the area of the indirect flight muscles. In these model flies, reduction of muscle area was concomitant with increased apoptosis and autophagy. Inhibition of apoptosis or autophagy mediated by the overexpression of DIAP1, mTOR (also known as Tor) or muscleblind, or by RNA interference (RNAi)-mediated silencing of autophagy regulatory genes, achieved a rescue of the muscle-loss phenotype. In fact, mTOR overexpression rescued muscle size to a size comparable to that in control flies. These results were validated in skeletal muscle biopsies from DM1 patients in which we found downregulated autophagy and apoptosis repressor genes, and also in DM1 myoblasts where we found increased autophagy. These findings provide new insights into the signaling pathways involved in DM1 disease pathogenesis. PMID:26092529
Increased autophagy and apoptosis contribute to muscle atrophy in a myotonic dystrophy type 1 Drosophila model.

PubMed

Bargiela, Ariadna; Cerro-Herreros, Estefanía; Fernandez-Costa, Juan M; Vilchez, Juan J; Llamusi, Beatriz; Artero, Ruben

2015-07-01

Muscle mass wasting is one of the most debilitating symptoms of myotonic dystrophy type 1 (DM1) disease, ultimately leading to immobility, respiratory defects, dysarthria, dysphagia and death in advanced stages of the disease. In order to study the molecular mechanisms leading to the degenerative loss of adult muscle tissue in DM1, we generated an inducible Drosophila model of expanded CTG trinucleotide repeat toxicity that resembles an adult-onset form of the disease. Heat-shock induced expression of 480 CUG repeats in adult flies resulted in a reduction in the area of the indirect flight muscles. In these model flies, reduction of muscle area was concomitant with increased apoptosis and autophagy. Inhibition of apoptosis or autophagy mediated by the overexpression of DIAP1, mTOR (also known as Tor) or muscleblind, or by RNA interference (RNAi)-mediated silencing of autophagy regulatory genes, achieved a rescue of the muscle-loss phenotype. In fact, mTOR overexpression rescued muscle size to a size comparable to that in control flies. These results were validated in skeletal muscle biopsies from DM1 patients in which we found downregulated autophagy and apoptosis repressor genes, and also in DM1 myoblasts where we found increased autophagy. These findings provide new insights into the signaling pathways involved in DM1 disease pathogenesis. © 2015. Published by The Company of Biologists Ltd.
Neurocognitive endophenotypes in CGG KI and Fmr1 KO mouse models of Fragile X-Associated disorders: an analysis of the state of the field

PubMed Central

Hunsaker, Michael R.

2013-01-01

It has become increasingly important that the field of behavioral genetics identifies not only the gross behavioral phenotypes associated with a given mutation, but also the behavioral endophenotypes that scale with the dosage of the particular mutation being studied. Over the past few years, studies evaluating the effects of the polymorphic CGG trinucleotide repeat on the FMR1 gene underlying Fragile X-Associated Disorders have reported preliminary evidence for a behavioral endophenotype in human Fragile X Premutation carrier populations as well as the CGG knock-in (KI) mouse model. More recently, the behavioral experiments used to test the CGG KI mouse model have been extended to the Fmr1 knock-out (KO) mouse model. When combined, these data provide compelling evidence for a clear neurocognitive endophenotype in the mouse models of Fragile X-Associated Disorders such that behavioral deficits scale predictably with genetic dosage. Similarly, it appears that the CGG KI mouse effectively models the histopathology in Fragile X-Associated Disorders across CGG repeats well into the full mutation range, resulting in a reliable histopathological endophenotype. These endophenotypes may influence future research directions into treatment strategies for not only Fragile X Syndrome, but also the Fragile X Premutation and Fragile X-Associated Tremor/Ataxia Syndrome (FXTAS). PMID:24627796
Properties of an unusual DNA primase from an archaeal plasmid

PubMed Central

Beck, Kirsten; Lipps, Georg

2007-01-01

Primases are specialized DNA-dependent RNA polymerases that synthesize a short oligoribonucleotide complementary to single-stranded template DNA. In the context of cellular DNA replication, primases are indispensable since DNA polymerases are not able to start DNA polymerization de novo. The primase activity of the replication protein from the archaeal plasmid pRN1 synthesizes a rather unusual mixed primer consisting of a single ribonucleotide at the 5′ end followed by seven deoxynucleotides. Ribonucleotides and deoxynucleotides are strictly required at the respective positions within the primer. Furthermore, in contrast to other archaeo-eukaryotic primases, the primase activity is highly sequence-specific and requires the trinucleotide motif GTG in the template. Primer synthesis starts outside of the recognition motif, immediately 5′ to the recognition motif. The fidelity of the primase synthesis is high, as non-complementary bases are not incorporated into the primer. PMID:17709343
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori

PubMed Central

Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.

2005-01-01

We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
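The mining thresholds quoted in this abstract (mononucleotide tracts of at least 15 bases, and at least 5 repeat units for other motif classes) can be illustrated with a short sketch. This is not the authors' pipeline, which the abstract does not describe; the function name `find_microsatellites`, the motif-length cap of 6 and the restriction to perfect (uninterrupted) repeats are assumptions made for illustration.

```python
import re

def find_microsatellites(seq, max_motif=6, min_mono_len=15, min_units=5):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    Hypothetical helper: thresholds follow the abstract (>=15 bases for
    mononucleotide tracts, >=5 units for longer motifs); everything else
    is an illustrative assumption.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        min_copies = min_mono_len if k == 1 else min_units
        # A k-mer followed by at least (min_copies - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT = AT x 2),
            # so each tract is reported once under its shortest motif.
            if any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

example = "GG" + "A" * 15 + "C" + "AT" * 5 + "G" + "ATT" * 5 + "CC"
print(find_microsatellites(example))
```

Real surveys usually also tolerate imperfect repeats and merge equivalent motifs (for example AT and TA, or a motif and its reverse complement); the sketch above does neither.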
Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

NASA Astrophysics Data System (ADS)

Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

2015-12-01

Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms exceed cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major events show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor to pinpoint areas where large megathrust earthquakes may nucleate and consequently to improve the seismic hazard assessment.
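The waveform-similarity criterion in this abstract can be sketched for the correlation coefficient alone. The snippet below computes a zero-lag normalized correlation between two equal-length traces and applies the 0.95 threshold quoted above. The function names are hypothetical, and the spectral coherency test, time alignment and filtering that a real analysis would need are omitted.

```python
import math

def correlation_coefficient(x, y):
    """Zero-lag normalized (Pearson) correlation of two equal-length traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A flat trace has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0

def is_candidate_repeater(x, y, cc_min=0.95):
    """Assumed selection rule: keep the pair if cc meets the threshold."""
    return correlation_coefficient(x, y) >= cc_min

# Synthetic traces: the second is a scaled, offset copy of the first.
trace_a = [math.sin(0.3 * i) for i in range(200)]
trace_b = [2.0 * v + 1.0 for v in trace_a]
print(is_candidate_repeater(trace_a, trace_b))
```

Because the coefficient is normalized, amplitude scaling and a constant offset do not lower it, which is why the synthetic pair passes the threshold.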
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would make it possible to relate the repeats to their function and three-dimensional structure over the course of evolution. It would also shed light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
Base damage, local sequence context and TP53 mutation hotspots: a molecular dynamics study of benzo[a]pyrene induced DNA distortion and mutability

PubMed Central

Menzies, Georgina E.; Reed, Simon H.; Brancale, Andrea; Lewis, Paul D.

2015-01-01

The mutational pattern for the TP53 tumour suppressor gene in lung tumours differs from that of other cancer types by having a higher frequency of G:C>T:A transversions. The aetiology of this differing mutation pattern is still unknown. Benzo[a]pyrene diol epoxide (BPDE) is a potent cigarette smoke carcinogen that forms guanine adducts at TP53 CpG mutation hotspot sites including codons 157, 158, 245, 248 and 273. We performed molecular modelling of BPDE-adducted TP53 duplex sequences to determine the degree of local distortion caused by adducts which could influence the ability of nucleotide excision repair. We show that BPDE adducted codon 157 has greater structural distortion than other TP53 G:C>T:A hotspot sites and that sequence context more distal to adjacent bases must influence local distortion. Using TP53 trinucleotide mutation signatures for lung cancer in smokers and non-smokers we further show that codons 157 and 273 have the highest mutation probability in smokers. Combining this information with adduct structural data we predict that G:C>T:A mutations at codon 157 in lung tumours of smokers are predominantly caused by BPDE. Our results provide insight into how different DNA sequence contexts show variability in DNA distortion at mutagen adduct sites that could compromise DNA repair at well characterized cancer related mutation hotspots. PMID:26400171
3-base periodicity in coding DNA is affected by intercodon dinucleotides

PubMed Central

Sánchez, Joaquín

2011-01-01

All coding DNAs exhibit 3-base periodicity (TBP), which may be defined as the tendency of nucleotides and higher order n-tuples, e.g. trinucleotides (triplets), to be preferentially spaced by 3, 6, 9 etc, bases, and we have proposed an association between TBP and clustering of same-phase triplets. We here investigated if TBP was affected by intercodon dinucleotide tendencies and whether clustering of same-phase triplets was involved. Under constant protein sequence intercodon dinucleotide frequencies depend on the distribution of synonymous codons. So, possible effects were revealed by randomly exchanging synonymous codons without altering protein sequences to subsequently document changes in TBP via frequency distribution of distances (FDD) of DNA triplets. A tripartite positive correlation was found between intercodon dinucleotide frequencies, clustering of same-phase triplets and TBP. So, intercodon C|A (where “|” indicates the boundary between codons) was more frequent in native human DNA than in the codon-shuffled sequences; higher C|A frequency occurred along with more frequent clustering of C|AN triplets (where N jointly represents A, C, G and T) and with intense CAN TBP. The opposite was found for C|G, which was less frequent in native than in shuffled sequences; lower C|G frequency occurred together with reduced clustering of C|GN triplets and with less intense CGN TBP. We hence propose that intercodon dinucleotides affect TBP via same-phase triplet clustering. A possible biological relevance of our findings is briefly discussed. PMID:21814388
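The frequency distribution of distances (FDD) mentioned in this abstract can be illustrated with a minimal sketch. It assumes the FDD counts the distance between every pair of occurrences of a given triplet up to a maximum separation; the abstract does not spell out the exact definition, so the pairing rule, the function name and the `max_dist` cutoff are assumptions. In a sequence with 3-base periodicity, the counts would be expected to peak at distances that are multiples of 3.

```python
from collections import Counter

def triplet_distance_distribution(seq, triplet, max_dist=30):
    """Count distances between all pairs of occurrences of `triplet`.

    Hypothetical helper: overlapping occurrences are allowed, and only
    distances up to `max_dist` are counted.
    """
    seq, triplet = seq.upper(), triplet.upper()
    positions = [i for i in range(len(seq) - 2) if seq[i:i + 3] == triplet]
    counts = Counter()
    for idx, p in enumerate(positions):
        for q in positions[idx + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[d] += 1
    return counts

# Toy sequence in which CAG recurs every 6 bases.
toy = "CAGTTT" * 4
print(sorted(triplet_distance_distribution(toy, "CAG").items()))
```

Comparing such a distribution for a native coding sequence against one computed after shuffling synonymous codons is the kind of contrast the abstract describes, although the shuffling step itself is not shown here.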
Characterization of (CA)n microsatellite repeats from large-insert clones.

PubMed

Litt, M; Browne, D

2001-05-01

The most laborious part of developing (CA)n microsatellite repeats as genetic markers is constructing DNA clones to permit determination of sequences flanking the microsatellites. When cosmids or large-insert phage clones are used as primary sources of (CA)n repeat markers, they have traditionally been subcloned into plasmid vectors such as pUC18 or M13 mp 18/19 cloning vectors to obtain fragments of suitable size for DNA sequencing. This unit presents an alternative approach whereby a set of degenerate sequencing primers that anneal directly to (CA)n microsatellites can be used to determine sequences that are inaccessible with vector-derived primers. Because the primers anneal to the repeat and not to the vector, they can be used with subclones containing inserts of several kilobases and should, in theory, always give sequence in the regions directly flanking the repeat. Degeneracy at the 3′ end of each of these primers prevents elongation of primers that have annealed out-of-register.
Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B.' The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other as the repeat families in both showed high sequence homology between each other. Second, the repeat elements in both the species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea.
Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

USGS Publications Warehouse

Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

2004-01-01

The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.
Interstitial telomeric sequences in vertebrate chromosomes: Origin, function, instability and evolution.

PubMed

Bolzán, Alejandro D

2017-07-01

By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According with their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes. Copyright © 2017 Elsevier B.V. All rights reserved.

Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus).

PubMed

Cech, Jennifer N; Peichel, Catherine L

2015-12-01

Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background: Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results: Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions: The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Repeatless and repeat-based centromeres in potato: implications for centromere evolution.

PubMed

Gong, Zhiyun; Wu, Yufeng; Koblízková, Andrea; Torres, Giovana A; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C Robin; Macas, Jirí; Jiang, Jiming

2012-09-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains.
Repeatless and Repeat-Based Centromeres in Potato: Implications for Centromere Evolution

PubMed Central

Gong, Zhiyun; Wu, Yufeng; Koblížková, Andrea; Torres, Giovana A.; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C. Robin; Macas, Jiří; Jiang, Jiming

2012-01-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains. PMID:22968715
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.; (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer and; (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast AhDR sequences are dispersed over all chromosomes as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, H.U.G.; Gray, J.W.

1995-06-27

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, Heinz-Ulrich G.; Gray, Joe W.

1995-01-01

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
Androgen receptor CAG repeat polymorphism and hypothalamic-pituitary-gonadal function in Filipino young adult males

PubMed Central

Ryan, Calen P.; McDade, Thomas W; Gettler, Lee T.; Eisenberg, Dan T.A.; Rzhetskaya, Margarita; Hayes, M. Geoffrey; Kuzawa, Christopher W.

2016-01-01

Objectives Testosterone (T), the primary androgenic hormone in males, is stimulated through pulsatile secretion of LH and regulated through negative feedback inhibition at the hypothalamus and pituitary. The hypothalamic-pituitary-gonadal (HPG) axis also controls sperm production through the secretion of follicle-stimulating hormone (FSH). Negative feedback in the HPG axis is achieved in part through the binding of T to the androgen receptor (AR), which contains a highly variable trinucleotide repeat polymorphism (AR-CAGn). The number of repeats in the AR-CAGn inversely correlates with transcriptional activity of the AR. Thus, we predicted longer AR-CAGn to be associated with higher T, LH, and FSH levels. Methods We examined the relationship between AR-CAGn and total plasma T, LH, and FSH, as well as 'bioavailable' morning (AM-T) and evening (PM-T) testosterone in 722 young (21.5 ± 0.5 years) Filipino males. Results There was no relationship between AR-CAGn and total T, AM-T, or LH (P > 0.25 for all). We did observe a marginally non-significant (P = 0.066) correlation between AR-CAGn and PM-T in the predicted direction, and a negative correlation between AR-CAGn and FSH (P = 0.005). Conclusions Our results both support and differ from previous findings in this area, and study parameters that differ between our study and others, such as participant age, sample time, and the role of other hormones should be considered when interpreting our findings. While our data point to a modest effect of AR-CAGn on HPG regulation at best, the AR-CAGn may still affect somatic traits by regulating androgenic activity at peripheral tissues. PMID:27417274
De novo identification of highly diverged protein repeats by probabilistic consistency.

PubMed

Biegert, A; Söding, J

2008-03-15

An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

NASA Astrophysics Data System (ADS)

Tepp, G.; Haney, M. M.; Wech, A.

2017-12-01

A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. 
During the Redoubt eruption, earthquake sequences were observed in the months leading up to the eruptive activity beginning in March 2009 as well as immediately preceding 7 of the 19 explosive events. In contrast to Bogoslof, Redoubt has a local monitoring network which allows for better detection and more detailed analysis of the repeating earthquake sequences.
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPR-related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator.
CRISPRdb is accessible at PMID:17521438
Assembly of a phased diploid Candida albicans genome facilitates allele-specific measurements and provides a simple model for repeat and indel structure

PubMed Central

2013-01-01

Background Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. Results We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. Conclusions The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution. PMID:24025428
Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies.

PubMed Central

Wincker, P; Jubier-Maurin, V; Roizès, G

1987-01-01

Some full length members of the mouse long interspersed repeated DNA family L1Md have been shown to be associated at their 5' end with a variable number of tandem repetitions, the A repeats, that have been suggested to be transcription controlling elements. We report that the other type of repeat, named F, found at the 5' end of a few L1 elements is also an integral part of full length L1 copies. Sequencing shows that the F repeats are GC rich, and organized in tandem. The L1 copies associated with either A or F repeats can be correlated with two different subsets of L1 sequences distinguished by a series of variant nucleotides specific to each and by unassociated but frequent restriction sites. These findings suggest that sequence replacement has occurred at least once in 5' of L1Md, and is related to the generation of specific subfamilies. PMID:3684566
Plant chromosomes from end to end: telomeres, heterochromatin and centromeres.

PubMed

Lamb, Jonathan C; Yu, Weichang; Han, Fangpu; Birchler, James A

2007-04-01

Recent evidence indicates that heterochromatin in plants is composed of heterogeneous sequences, which are usually composed of transposable elements or tandem repeat arrays. These arrays are associated with chromatin modifications that produce a closed configuration that limits transcription. Centromere sequences in plants are usually composed of tandem repeat arrays that are homogenized across the genome. Analysis of such arrays in closely related taxa suggests a rapid turnover of the repeat unit that is typical of a particular species. In addition, two lines of evidence for an epigenetic component of centromere specification have been reported, namely an example of a neocentromere formed over sequences without the typical repeat array and examples of centromere inactivation. Although the telomere repeat unit is quite prevalent in the plant kingdom, unusual repeats have been found in some families. Recently, it was demonstrated that the introduction of telomere sequences into plants cells causes truncation of the chromosomes, and that this technique can be used to produce artificial chromosome platforms.
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

PubMed

Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
A TALE-inspired computational screen for proteins that contain approximate tandem repeats

PubMed Central

Krwawicz, Joanna

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
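The core idea in the abstract above, approximate tandem repeats with a unit length of 30 to 43 residues, can be illustrated with a very small sketch. This is not the authors' PERL screen: the function name `best_period`, the scoring by self-identity at a fixed offset, and the toy sequence (an illustrative 34-residue unit resembling a TALE repeat, with two variable positions) are all assumptions made for illustration only.

```python
def best_period(protein, min_len=30, max_len=43):
    """Return (period, identity) for the offset with the highest self-identity.

    Identity at period p is the fraction of positions i for which
    protein[i] == protein[i + p]. Returns (None, 0.0) if the sequence
    is too short to test any period in the requested range.
    """
    best = (None, 0.0)
    for p in range(min_len, max_len + 1):
        pairs = len(protein) - p
        if pairs <= 0:
            break  # longer periods cannot be tested either
        matches = sum(protein[i] == protein[i + p] for i in range(pairs))
        identity = matches / pairs
        if identity > best[1]:
            best = (p, identity)
    return best


# Hypothetical toy input: four copies of a 34-residue unit that differ
# only at two variable positions (indices 11 and 12 of each copy).
unit = "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG"
toy = "".join(unit[:11] + rvd + unit[13:] for rvd in ("NG", "HD", "NI", "NN"))

period, identity = best_period(toy)
print(period, round(identity, 2))
```

A real screen would also have to tolerate insertions and deletions between copies and, as the abstract notes, score candidates on further criteria such as variability in otherwise conserved positions; the offset-identity score here only captures the periodicity.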
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing.

PubMed

Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav

2010-09-16

Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.

Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing

PubMed Central

2010-01-01

Background Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. Results In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Conclusion A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection. PMID:20846365
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect, and they are used in a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of DNA polymerase slippage during DNA replication. This mutation mechanism raises the insertion/deletion (INDEL) mutation frequency to a high ratio, more than in other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the positions of the first and last SSR nucleotides. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
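The first step described in the abstract above, locating each repeated unit together with its first and last positions, can be sketched with a regular expression. This is a minimal illustration under stated assumptions, not the authors' PERL/Tk program: the function name `find_ssrs`, the default thresholds (`max_unit=6`, `min_repeats=3`) and the restriction to perfect repeats over the alphabet A, C, G, T are all choices made here for the example.

```python
import re


def find_ssrs(seq, max_unit=6, min_repeats=3):
    """Return (start, end, unit, copies) for each perfect tandem repeat.

    start/end are 0-based, end-exclusive positions in seq. The unit is
    matched lazily, so the shortest repeating unit is reported
    (e.g. 'AC' rather than 'ACAC').
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits


print(find_ssrs("GGTCAGCAGCAGCAGTA"))  # [(3, 15, 'CAG', 4)]
```

With the unit and boundaries known, an aligner could shift sequences by whole repeat units rather than by single bases, which is the behaviour the abstract attributes to the new algorithm. Imperfect (interrupted) repeats would need a more tolerant detector than this pattern.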
A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

PubMed

Do, Hoang Dang Khoa; Kim, Joo-Hwan

2017-01-01

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of the rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation events and double strand break repair occurred and induced the formation of initial repeat units which are indispensable in the SSM process. SSM has likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales).
Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomics events in cpDNA are potential resources for mining molecular markers in Liliales.
Phosphate steering by Flap Endonuclease 1 promotes 5'-flap specificity and incision to prevent genome instability

DOE PAGES

Tsutakawa, Susan E.; Thompson, Mark J.; Arvai, Andrew S.; ...

2017-06-27

DNA replication and repair enzyme Flap Endonuclease 1 (FEN1) is vital for genome integrity, and FEN1 mutations arise in multiple cancers. FEN1 precisely cleaves single-stranded (ss) 5'-flaps one nucleotide into duplex (ds) DNA. Yet, how FEN1 selects for but does not incise the ss 5'-flap was enigmatic. Here we combine crystallographic, biochemical and genetic analyses to show that two dsDNA binding sites set the 5' polarity and to reveal unexpected control of the DNA phosphodiester backbone by electrostatic interactions. Via 'phosphate steering', basic residues energetically steer an inverted ss 5'-flap through a gateway over FEN1's active site and shift dsDNA for catalysis. Mutations of these residues cause an 18,000-fold reduction in catalytic rate in vitro and large-scale trinucleotide (GAA)n repeat expansions in vivo, implying failed phosphate-steering promotes an unanticipated lagging-strand template-switch mechanism during replication. Thus, phosphate steering is an unappreciated FEN1 function that enforces 5'-flap specificity and catalysis, preventing genomic instability.
Evidence of a polyclonal nature of myositis ossificans.

PubMed

Leithner, Andreas; Weinhaeusel, Andreas; Zeitlhofer, Petra; Koch, Horst; Radl, Roman; Windhager, Reinhard; Beham, Alfred; Haas, Oskar A

2005-04-01

Myositis ossificans is a localized, self-limiting, reparative lesion that is composed of reactive hypercellular fibrous tissue and bone. Although it is clearly a benign lesion, its clinical, radiological, and histological appearance may sometimes mimic a malignant tumor. Whether myositis ossificans represents a monoclonal or polyclonal hyperplastic proliferation is not yet known. To address this question, we therefore extracted DNA from the respective paraffin-embedded tumor tissues of nine women with a median age of 50 years at diagnosis (range: 20-84 years) and studied the X inactivation pattern by means of methylation-sensitive polymerase chain reaction and primers that target the polymorphic CGG trinucleotide repeat of the FMR1 gene. The fact that we did not detect any skewing of the X inactivation pattern in the five successfully analyzed cases corroborates the notion that myositis ossificans results from a polyclonal proliferation and confirms that it is a reactive, reparative process. Analysis of the X inactivation pattern may, thus, supplement the differential diagnostic work-up of cases with an uncertain histology, at least in the informative proportion of female patients.
[Molecular-targeted therapy for neurodegenerative diseases].

PubMed

Sobue, Gen

2009-11-01

Neurodegenerative diseases have been construed as incurable disorders. However, therapeutic development for these diseases is now at a turning point: analyses of cellular and animal models have provided insights into the pathogenesis of neurodegenerative diseases and have indicated rational therapeutic approaches to them. How to realize molecular-targeted therapy for neurodegenerative diseases is therefore becoming one of the most challenging issues in clinical neurology. The first step is a pathophysiological understanding of the disease grounded in basic science. For successful clinical trials, effective trial design, sufficient economic and social support, and education are indispensable. The development of androgen deprivation therapy for spinal and bulbar muscular atrophy (SBMA) is a representative study in this field. SBMA is a hereditary neurodegenerative disease caused by expansion of a trinucleotide CAG repeat in the first exon of the androgen receptor (AR) gene. There is increasing evidence that testosterone, the ligand of AR, plays a pivotal role in the neurodegeneration in SBMA. The striking success of androgen deprivation therapy in SBMA mouse models has been translated into phase 2, and then phase 3, clinical trials.
Phosphate steering by Flap Endonuclease 1 promotes 5'-flap specificity and incision to prevent genome instability

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tsutakawa, Susan E.; Thompson, Mark J.; Arvai, Andrew S.

DNA replication and repair enzyme Flap Endonuclease 1 (FEN1) is vital for genome integrity, and FEN1 mutations arise in multiple cancers. FEN1 precisely cleaves single-stranded (ss) 5'-flaps one nucleotide into duplex (ds) DNA. Yet, how FEN1 selects for but does not incise the ss 5'-flap was enigmatic. Here we combine crystallographic, biochemical and genetic analyses to show that two dsDNA binding sites set the 5' polarity and to reveal unexpected control of the DNA phosphodiester backbone by electrostatic interactions. Via 'phosphate steering', basic residues energetically steer an inverted ss 5'-flap through a gateway over FEN1's active site and shift dsDNA for catalysis. Mutations of these residues cause an 18,000-fold reduction in catalytic rate in vitro and large-scale trinucleotide (GAA)n repeat expansions in vivo, implying failed phosphate-steering promotes an unanticipated lagging-strand template-switch mechanism during replication. Thus, phosphate steering is an unappreciated FEN1 function that enforces 5'-flap specificity and catalysis, preventing genomic instability.
Behavioral and genetic correlates of the neural response to infant crying among human fathers

PubMed Central

Mascaro, Jennifer S.; Hackett, Patrick D.; Gouzoules, Harold; Lori, Adriana

2014-01-01

Although evolution has shaped human infant crying and the corresponding response from caregivers, there is marked variation in paternal involvement and caretaking behavior, highlighting the importance of understanding the neurobiology supporting optimal paternal responses to cries. We explored the neural response to infant cries in fathers of children aged 1–2, and its relationship with hormone levels, variation in the androgen receptor (AR) gene, parental attitudes and parental behavior. Although number of AR CAG trinucleotide repeats was positively correlated with neural activity in brain regions important for empathy (anterior insula and inferior frontal gyrus), restrictive attitudes were inversely correlated with neural activity in these regions and with regions involved with emotion regulation (orbitofrontal cortex). Anterior insula activity had a non-linear relationship with paternal caregiving, such that fathers with intermediate activation were most involved. These results suggest that restrictive attitudes may be associated with decreased empathy and emotion regulation in response to a child in distress, and that moderate anterior insula activity reflects an optimal level of arousal that supports engaged fathering. PMID:24336349
Purification, crystallization and preliminary X-ray diffraction analysis of the human mismatch repair protein MutSβ

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tseng, Quincy; Orans, Jillian; Hast, Michael A.

2012-03-16

MutSβ is a eukaryotic mismatch repair protein that preferentially targets extrahelical unpaired nucleotides and shares partial functional redundancy with MutSα (MSH2-MSH6). Although mismatch recognition by MutSα has been shown to involve a conserved Phe-X-Glu motif, little is known about the lesion-binding mechanism of MutSβ. Combined MSH3/MSH6 deficiency triggers a strong predisposition to cancer in mice and defects in msh2 and msh6 account for roughly half of hereditary nonpolyposis colorectal cancer mutations. These three MutS homologs are also believed to play a role in trinucleotide repeat instability, which is a hallmark of many neurodegenerative disorders. The baculovirus overexpression and purification of recombinant human MutSβ and three truncation mutants are presented here. Binding assays with heteroduplex DNA were carried out for biochemical characterization. Crystallization and preliminary X-ray diffraction analysis of the protein bound to a heteroduplex DNA substrate are also reported.
Molecular and bioinformatic analysis of the FB-NOF transposable element.

PubMed

Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

2006-04-12

The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing, with high precision and reliability, of the 2 kb long FB inverted repeats of a complete FB-NOF element. This achievement was made possible by a new map of the FB repetitive region, which identifies each repeat unambiguously using new features that can serve as landmarks. With this new view of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous work that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences, which showed no sequence specificity but a preference for A/T-rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.
The repetitive landscape of the chicken genome.

PubMed

Wicker, Thomas; Robertson, Jon S; Schulze, Stefan R; Feltus, F Alex; Magrini, Vincent; Morrison, Jason A; Mardis, Elaine R; Wilson, Richard K; Peterson, Daniel G; Paterson, Andrew H; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7 x coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available.
The repetitive landscape of the chicken genome

PubMed Central

Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

PubMed Central

Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

2017-01-01

Abstract During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
The Influence of Primary and Secondary DNA Structure in Deletion and Duplication between Direct Repeats in Escherichia Coli

PubMed Central

Trinh, T. Q.; Sinden, R. R.

1993-01-01

We describe a system to measure the frequency of both deletions and duplications between direct repeats. Short 17- and 18-bp palindromic and nonpalindromic DNA sequences were cloned into the EcoRI site within the chloramphenicol acetyltransferase gene of plasmids pBR325 and pJT7. This creates an insert between direct repeated EcoRI sites and results in a chloramphenicol-sensitive phenotype. Selection for chloramphenicol resistance was utilized to select chloramphenicol resistant revertants that included those with precise deletion of the insert from plasmid pBR325 and duplication of the insert in plasmid pJT7. The frequency of deletion or duplication varied more than 500-fold depending on the sequence of the short sequence inserted into the EcoRI site. For the nonpalindromic inserts, multiple internal direct repeats and the length of the direct repeats appear to influence the frequency of deletion. Certain palindromic DNA sequences with the potential to form DNA hairpin structures that might stabilize the misalignment of direct repeats had a high frequency of deletion. Other DNA sequences with the potential to form structures that might destabilize misalignment of direct repeats had a very low frequency of deletion. Duplication mutations occurred at the highest frequency when the DNA between the direct repeats contained no direct or inverted repeats. The presence of inverted repeats dramatically reduced the frequency of duplications. The results support the slippage-misalignment model, suggesting that misalignment occurring during DNA replication leads to deletion and duplication mutations. The results also support the idea that the formation of DNA secondary structures during DNA replication can facilitate and direct specific mutagenic events. PMID:8325478
Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules

PubMed Central

2014-01-01

Background: DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results: Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to lower memory usage (at least a 120-fold decrease on average) and reduces the computation time by a factor of at least four during the module detection process in large sequences. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability: The Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-09-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.
Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

USDA-ARS's Scientific Manuscript database

Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Microsatellite analysis in the genome of Acanthaceae: An in silico approach.

PubMed

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites present in complete genome sequences may play a direct role in genome organization, recombination, gene regulation, quantitative genetic variation, and the evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites, and its built-in Primer3 module was used for primer design. In total, 110 repeats were identified from 108 sequences of Acanthaceae family plant genomes, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be rich in all the sequences. We also designed SSR-based primers/markers for the 59 sequences of this family that contain microsatellite repeats in their genome. The identified microsatellites and primers might be useful for future breeding and genetic studies of plants that belong to the Acanthaceae family.
Isolation and mapping of telomeric pentanucleotide (TAACC)n repeats of the Pacific whiteleg shrimp, Penaeus vannamei, using fluorescence in situ hybridization.

PubMed

Alcivar-Warren, Acacia; Meehan-Meola, Dawn; Wang, Yongping; Guo, Ximing; Zhou, Linghua; Xiang, Jianhai; Moss, Shaun; Arce, Steve; Warren, William; Xu, Zhenkang; Bell, Kireina

2006-01-01

To develop genetic and physical maps for shrimp, accurate information on the actual number of chromosomes and a large number of genetic markers is needed. Previous reports have shown two different chromosome numbers for the Pacific whiteleg shrimp, Penaeus vannamei, the most important penaeid shrimp species cultured in the Western hemisphere. Preliminary results obtained by direct sequencing of clones from a Sau3A-digested genomic library of P. vannamei ovary identified a large number of (TAACC/GGTTA)-containing SSRs. The objectives of this study were to (1) examine the frequency of (TAACC)n repeats in 662 P. vannamei genomic clones that were directly sequenced, and perform homology searches of these clones, (2) confirm the number of chromosomes in testis of P. vannamei, and (3) localize the TAACC repeats in P. vannamei chromosome spreads using fluorescence in situ hybridization (FISH). Results for objective 1 showed that 395 out of the 662 clones sequenced contained single or multiple SSRs with three or more repeat motifs, 199 of which contained variable tandem repeats of the pentanucleotide (TAACC/GGTTA)n, with 3 to 14 copies per sequence. The frequency of (TAACC)n repeats in P. vannamei is 4.68 kb for SSRs with five or more repeat motifs. Sequence comparisons using the BLASTN nonredundant and expressed sequence tag (EST) databases indicated that most of the TAACC-containing clones were similar to either the core pentanucleotide repeat in PVPENTREP locus (GenBank accession no. X82619) or portions of 28S rRNA. Transposable elements (transposase for Tn1000 and reverse transcriptase family members), hypothetical or unnamed protein products, and genes of known function such as 18S and 28S rRNAs, heat shock protein 70, and thrombospondin were identified in non-TAACC-containing clones. For objective 2, the meiotic chromosome number of P. vannamei was confirmed as N = 44. 
For objective 3, four FISH probes (P1 to P4) containing different numbers of TAACC repeats produced positive signals on telomeres of P. vannamei chromosomes. A few chromosomes had positive signals interstitially. Probe signal strength and chromosome coverage differed in the general order of P1>P2>P3>P4, which correlated with the length of TAACC repeats within the probes: 83, 66, 35, and 30 bp, respectively, suggesting that the TAACC repeats, and not the flanking sequences, produced the TAACC signals at chromosome ends and TAACC is likely the telomere sequence for P. vannamei.
Complete mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae).

PubMed

Kim, Min Jee; Im, Hyun Hwak; Lee, Kwang Youll; Han, Yeon Soo; Kim, Iksoo

2014-06-01

Abstract The complete nucleotide sequence of the mitochondrial genome from the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae), was determined. The 20,319-bp long circular genome is the longest among completely sequenced Coleoptera. As is typical in animals, the P. brevitarsis genome consisted of two ribosomal RNAs, 22 transfer RNAs, 13 protein-coding genes and one A + T-rich region. Although the size of the coding genes was typical, the non-coding A + T-rich region was 5654 bp, which is the longest in insects. The extraordinary length of this region was due to 28 copies of a 117-bp tandem repeat and 7 copies of an 82-bp tandem repeat. These repeat sequences were encompassed by three non-repeat sequences constituting 1804 bp.
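The lengths reported in this abstract can be cross-checked with a short calculation. The sketch below assumes the tandem repeats are read as 28 copies of a 117-bp unit and 7 copies of an 82-bp unit, a reading chosen because it reproduces the stated 5654-bp A + T-rich region. The variable names are illustrative only.

```python
# Assumed reading of the repeat description: (copy number, unit length in bp).
repeat_blocks = [(28, 117), (7, 82)]
non_repeat_bp = 1804  # three non-repeat sequences, as stated in the abstract

# Total length contributed by the tandem repeats.
repeat_bp = sum(copies * unit for copies, unit in repeat_blocks)

# Repeats plus non-repeat sequence should equal the A + T-rich region length.
at_rich_region_bp = repeat_bp + non_repeat_bp
print(repeat_bp, at_rich_region_bp)
```

Under this reading the repeats account for 3850 bp, and adding the 1804 bp of non-repeat sequence gives the 5654 bp quoted for the whole region.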

Chromosomal Targeting by the Type III-A CRISPR-Cas System Can Reshape Genomes in Staphylococcus aureus

PubMed Central

Guan, Jing; Wang, Wanying

2017-01-01

ABSTRACT CRISPR-Cas (clustered regularly interspaced short palindromic repeat [CRISPR]-CRISPR-associated protein [Cas]) systems can provide protection against invading genetic elements by using CRISPR RNAs (crRNAs) as a guide to locate and degrade the target DNA. CRISPR-Cas systems have been classified into two classes and five types according to the content of cas genes. Previous studies have indicated that CRISPR-Cas systems can avoid viral infection and block plasmid transfer. Here we show that chromosomal targeting by the Staphylococcus aureus type III-A CRISPR-Cas system can drive large-scale genome deletion and alteration within integrated staphylococcal cassette chromosome mec (SCCmec). The targeting activity of the CRISPR-Cas system is associated with the complementarity between crRNAs and protospacers, and 10- to 13-nucleotide truncations of spacers partially block CRISPR attack and more than 13-nucleotide truncation can fully abolish targeting, suggesting that a minimal length is required to license cleavage. Avoiding base pairings in the upstream region of protospacers is also necessary for CRISPR targeting. Successive trinucleotide complementarity between the 5′ tag of crRNAs and protospacers can disrupt targeting. Our findings reveal that type III-A CRISPR-Cas systems can modulate bacterial genome stability and may serve as a high-efficiency tool for deleting resistance or virulence genes in bacteria. IMPORTANCE Staphylococcus aureus is a pathogen that can cause a wide range of infections in humans. Studies have suggested that CRISPR-Cas systems can drive the loss of integrated mobile genetic elements (MGEs) by chromosomal targeting. Here we demonstrate that CRISPR-mediated cleavage contributes to the partial deletion of integrated SCCmec in methicillin-resistant S. aureus (MRSA), which provides a strategy for the treatment of MRSA infections. 
The spacer within artificial CRISPR arrays should contain more than 25 nucleotides for immunity, and consecutive trinucleotide pairings between a selected target and the 5′ tag of crRNA can block targeting. These findings add to our understanding of the molecular mechanisms of the type III-A CRISPR-Cas system and provide a novel strategy for the exploitation of engineered CRISPR immunity against integrated MGEs in bacteria for clinical and industrial applications. PMID:29152580
Genome-wide survey and analysis of microsatellites in nematodes, with a focus on the plant-parasitic species Meloidogyne incognita.

PubMed

Castagnone-Sereno, Philippe; Danchin, Etienne G J; Deleury, Emeline; Guillemaud, Thomas; Malausa, Thibaut; Abad, Pierre

2010-10-25

Microsatellites are the most popular source of molecular markers for studying population genetic variation in eukaryotes. However, few data are currently available about their genomic distribution and abundance across the phylum Nematoda. The recent completion of the genomes of several nematode species, including Meloidogyne incognita, a major agricultural pest worldwide, now opens the way for a comparative survey and analysis of microsatellites in these organisms. Using MsatFinder, the total numbers of 1-6 bp perfect microsatellites detected in the complete genomes of five nematode species (Brugia malayi, Caenorhabditis elegans, M. hapla, M. incognita, Pristionchus pacificus) ranged from 2,842 to 61,547, and covered from 0.09 to 1.20% of the nematode genomes. Under our search criteria, the most common repeat motifs for each length class varied according to the different nematode species considered, with no obvious relation to the AT-richness of their genomes. Overall, (AT)n, (AG)n and (CT)n were the three most frequent dinucleotide microsatellite motifs found in the five genomes considered. Except for two motifs in P. pacificus, all the most frequent trinucleotide motifs were AT-rich, with (AAT)n and (ATT)n being the only common to the five nematode species. A particular attention was paid to the microsatellite content of the plant-parasitic species M. incognita. In this species, a repertoire of 4,880 microsatellite loci was identified, from which 2,183 appeared suitable to design markers for population genetic studies. Interestingly, 1,094 microsatellites were identified in 801 predicted protein-coding regions, 99% of them being trinucleotides. When compared against the InterPro domain database, 497 of these CDS were successfully annotated, and further assigned to Gene Ontology terms. Contrasted patterns of microsatellite abundance and diversity were characterized in five nematode genomes, even in the case of two closely related Meloidogyne species. 
2,245 di- to hexanucleotide loci were identified in the genome of M. incognita, providing adequate material for the future development of a wide range of microsatellite markers in this major plant parasite.
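As an illustration of what a search for perfect 1-6 bp microsatellites involves, here is a minimal regex-based sketch. It is not MsatFinder's algorithm, whose search criteria are not given in the abstract. The function name `find_perfect_ssrs` and the `min_copies` threshold are assumptions made for this example.

```python
import re

def find_perfect_ssrs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_copies is a hypothetical threshold; real tools usually set a
    different minimum for each motif length.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by at least min_copies - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # since those are already reported under the shorter motif.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)

# Toy sequence: four CAG units and five AT units between short flanks.
example = "TT" + "CAG" * 4 + "CC" + "AT" * 5 + "CC"
print(find_perfect_ssrs(example))
```

Start positions are 0-based. The scan reports whichever phase of a repeat it meets first from the left, so counts can differ from tools that merge motif phases or strands (for example, treating AT and TA as the same motif).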
Chromosomal Targeting by the Type III-A CRISPR-Cas System Can Reshape Genomes in Staphylococcus aureus.

PubMed

Guan, Jing; Wang, Wanying; Sun, Baolin

2017-01-01

CRISPR-Cas (clustered regularly interspaced short palindromic repeat [CRISPR]-CRISPR-associated protein [Cas]) systems can provide protection against invading genetic elements by using CRISPR RNAs (crRNAs) as a guide to locate and degrade the target DNA. CRISPR-Cas systems have been classified into two classes and five types according to the content of cas genes. Previous studies have indicated that CRISPR-Cas systems can avoid viral infection and block plasmid transfer. Here we show that chromosomal targeting by the Staphylococcus aureus type III-A CRISPR-Cas system can drive large-scale genome deletion and alteration within integrated staphylococcal cassette chromosome mec (SCCmec). The targeting activity of the CRISPR-Cas system is associated with the complementarity between crRNAs and protospacers, and 10- to 13-nucleotide truncations of spacers partially block CRISPR attack and more than 13-nucleotide truncation can fully abolish targeting, suggesting that a minimal length is required to license cleavage. Avoiding base pairings in the upstream region of protospacers is also necessary for CRISPR targeting. Successive trinucleotide complementarity between the 5' tag of crRNAs and protospacers can disrupt targeting. Our findings reveal that type III-A CRISPR-Cas systems can modulate bacterial genome stability and may serve as a high-efficiency tool for deleting resistance or virulence genes in bacteria. IMPORTANCE Staphylococcus aureus is a pathogen that can cause a wide range of infections in humans. Studies have suggested that CRISPR-Cas systems can drive the loss of integrated mobile genetic elements (MGEs) by chromosomal targeting. Here we demonstrate that CRISPR-mediated cleavage contributes to the partial deletion of integrated SCCmec in methicillin-resistant S. aureus (MRSA), which provides a strategy for the treatment of MRSA infections.
The spacer within artificial CRISPR arrays should contain more than 25 nucleotides for immunity, and consecutive trinucleotide pairings between a selected target and the 5' tag of crRNA can block targeting. These findings add to our understanding of the molecular mechanisms of the type III-A CRISPR-Cas system and provide a novel strategy for the exploitation of engineered CRISPR immunity against integrated MGEs in bacteria for clinical and industrial applications.
Randomized controlled trial of ethyl-eicosapentaenoic acid in Huntington disease: the TREND-HD study.

PubMed

2008-12-01

Objective: To determine whether ethyl-eicosapentaenoic acid (ethyl-EPA), an omega-3 fatty acid, improves the motor features of Huntington disease. Design: Six-month multicenter, randomized, double-blind, placebo-controlled trial followed by a 6-month open-label phase without disclosing initial treatment assignments. Setting: Forty-one research sites in the United States and Canada. Patients: Three hundred sixteen adults with Huntington disease, enriched for a population with shorter trinucleotide (cytosine-adenine-guanine) repeat length expansions. Interventions: Random assignment to placebo or ethyl-EPA, 1 g twice a day, followed by open-label treatment with ethyl-EPA. Main outcome measure: Six-month change in the Total Motor Score 4 component of the Unified Huntington's Disease Rating Scale analyzed for all research participants and those with shorter cytosine-adenine-guanine repeat length expansions (<45). Results: At 6 months, the Total Motor Score 4 point change for patients receiving ethyl-EPA did not differ from that for those receiving placebo. No differences were found in measures of function, cognition, or global impression. Before public disclosure of the 6-month placebo-controlled results, 192 individuals completed the open-label phase. The Total Motor Score 4 change did not worsen for those who received active treatment for 12 continuous months compared with those who received active treatment for only 6 months (2.0-point worsening; P=.02). Conclusion: Ethyl-EPA was not beneficial in patients with Huntington disease during 6 months of placebo-controlled evaluation. Clinical Trial Registry: clinicaltrials.gov Identifier: NCT00146211.
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
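As a rough illustration of the three levels named in the abstract (sequence, alignment, nomenclature), the sketch below takes a repeat-region sequence string, collapses runs of identical motifs into a bracketed description, and derives a CE-style length-based allele number. The function names, the fixed 4-bp motif length, and the example sequence are assumptions made for illustration; they are not the ISFG recommendation itself, and real loci follow locus-specific counting rules.

```python
def collapse_repeats(seq: str, motif_len: int = 4) -> str:
    """Collapse consecutive identical motifs into bracketed form, e.g. [TCTA]3.

    Hypothetical helper: reads the sequence in fixed steps of motif_len from
    the first base, so it assumes the string is already aligned to the motif.
    """
    parts = []
    i = 0
    while i + motif_len <= len(seq):
        motif = seq[i:i + motif_len]
        count = 1
        j = i + motif_len
        # Extend the run while the next block equals the current motif.
        while seq[j:j + motif_len] == motif:
            count += 1
            j += motif_len
        parts.append(f"[{motif}]{count}")
        i = j
    if i < len(seq):
        parts.append(seq[i:])  # trailing partial motif is kept verbatim
    return " ".join(parts)


def ce_allele(seq: str, motif_len: int = 4) -> str:
    """Length-based allele name: full repeats, then '.n' for n leftover bases."""
    full, partial = divmod(len(seq), motif_len)
    return str(full) if partial == 0 else f"{full}.{partial}"


if __name__ == "__main__":
    example = "TCTATCTATCTGTCTATCTA"  # made-up repeat region, not a real locus
    print(collapse_repeats(example))  # bracketed sequence description
    print(ce_allele(example))         # length-based allele number
```

Two sequences with different internal structure can share the same length-based allele number, which is why the sequence string itself is the level that has to be stored.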
Structure and stability of the ankyrin domain of the Drosophila Notch receptor.

PubMed

Zweifel, Mark E; Leahy, Daniel J; Hughson, Frederick M; Barrick, Doug

2003-11-01

The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 angstroms resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal affect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
Molecular architecture of classical cytological landmarks: Centromeres and telomeres

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meyne, J.

1994-11-01

Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library is screened. A problem with this approach is that many tandem repeats don't have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to C₀t 50. Theoretically only repetitive DNA sequences should renature under C₀t 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.
Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus

PubMed Central

Wei, Yunzhou; Chesne, Megan T.; Terns, Rebecca M.; Terns, Michael P.

2015-01-01

CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100–500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems. PMID:25589547
Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

PubMed

Waye, J S; Willard, H F

1986-09-01

The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.
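The figures quoted in the abstract imply a few derived sizes that are easy to check. The short calculation below is only a back-of-the-envelope sketch: it treats "2.7 kilobase pairs" as exactly 2,700 bp and assumes all 16 monomers are the same length, neither of which is stated in the abstract.

```python
# Figures taken from the abstract; the exact 2,700 bp value is an assumption.
HOR_LENGTH_BP = 2700        # higher-order repeat unit, "2.7-kilobase-pair"
MONOMERS_PER_HOR = 16       # alphoid monomers per higher-order repeat
COPY_RANGE = (500, 1000)    # copies per chromosome 17

# Average monomer length, assuming equal-sized monomers.
monomer_bp = HOR_LENGTH_BP / MONOMERS_PER_HOR

# Total span of the array at the low and high ends of the copy range.
array_bp = tuple(n * HOR_LENGTH_BP for n in COPY_RANGE)

# Approximate length of the 15-monomer variant (one monomer excluded).
fifteen_mer_bp = HOR_LENGTH_BP - monomer_bp

if __name__ == "__main__":
    print(f"average monomer: {monomer_bp:.2f} bp")
    print(f"array span: {array_bp[0] / 1e6:.2f} to {array_bp[1] / 1e6:.2f} Mb")
    print(f"15-monomer repeat: about {fifteen_mer_bp:.0f} bp")
```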
Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

PubMed

Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

1989-09-01

A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different-sized Dictyostelium DNA restriction fragments, indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes, suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
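The frame dependence described in the abstract can be reproduced with a few lines of code. The sketch below translates a synthetic (AAC)n tract in each of the three forward reading frames; the six-copy example tract and the function name are illustrative assumptions, and the codon table is deliberately restricted to the three codons that occur in such a tract (all taken from the standard genetic code).

```python
# Only the three codons that can occur in a pure (AAC)n tract.
CODON_TO_AA = {
    "AAC": "N",  # asparagine
    "ACA": "T",  # threonine
    "CAA": "Q",  # glutamine
}


def translate_frames(dna: str) -> dict:
    """Translate dna in forward frames 0, 1 and 2; unknown codons become 'X'."""
    frames = {}
    for offset in range(3):
        # Step through complete codons only, starting at the frame offset.
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames[offset] = "".join(CODON_TO_AA.get(c, "X") for c in codons)
    return frames


if __name__ == "__main__":
    tract = "AAC" * 6  # synthetic example, not a sequence from the paper
    for offset, peptide in translate_frames(tract).items():
        print(f"frame {offset}: {peptide}")
```

Each frame yields a homopolymeric run of a single amino acid, which matches the observation that individual clones carry asparagine, threonine, or glutamine clusters depending on which frame is read.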
Design, production and molecular structure of a new family of artificial alpha-helicoidal repeat proteins (αRep) based on thermostable HEAT-like repeats.

PubMed

Urvoas, Agathe; Guellouz, Asma; Valerio-Lepiniec, Marie; Graille, Marc; Durand, Dominique; Desravines, Danielle C; van Tilbeurgh, Herman; Desmadril, Michel; Minard, Philippe

2010-11-26

Repeat proteins have a modular organization and a regular architecture that make them attractive models for design and directed evolution experiments. HEAT repeat proteins, although very common, have not been used as a scaffold for artificial proteins, probably because they are made of long and irregular repeats. Here, we present and validate a consensus sequence for artificial HEAT repeat proteins. The sequence was defined from the structure-based sequence analysis of a thermostable HEAT-like repeat protein. Appropriate sequences were identified for the N- and C-caps. A library of genes coding for artificial proteins based on this sequence design, named αRep, was assembled using new and versatile methodology based on circular amplification. Proteins picked randomly from this library are expressed as soluble proteins. The biophysical properties of proteins with different numbers of repeats and different combinations of side chains in hypervariable positions were characterized. Circular dichroism and differential scanning calorimetry experiments showed that all these proteins are folded cooperatively and are very stable (T(m) >70 °C). Stability of these proteins increases with the number of repeats. Detailed gel filtration and small-angle X-ray scattering studies showed that the purified proteins form either monomers or dimers. The X-ray structure of a stable dimeric variant structure was solved. The protein is folded with a highly regular topology and the repeat structure is organized, as expected, as pairs of alpha helices. In this protein variant, the dimerization interface results directly from the variable surface enriched in aromatic residues located in the randomized positions of the repeats. The dimer was crystallized both in an apo and in a PEG-bound form, revealing a very well defined binding crevice and some structure flexibility at the interface. This fortuitous binding site could later prove to be a useful binding site for other low molecular mass partners. Copyright © 2010 Elsevier Ltd. All rights reserved.
TRedD—A database for tandem repeats over the edit distance

PubMed Central

Sokol, Dina; Atagun, Firat

2010-01-01

A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of ‘evolutive tandem repeats’. In addition, we have developed a tool, called TandemGraph, to graphically depict the repeats occurring in a sequence. This tool can be coupled with any repeat finding software, and it should greatly facilitate analysis of results. Database URL: http://tandem.sci.brooklyn.cuny.edu/ PMID:20624712
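The notion of "approximate copies" measured by edit distance can be made concrete with a small example. The sketch below is not the TRedD algorithm and does not implement its "evolutive tandem repeats" definition; it is a simplified illustration that cuts a sequence into fixed-length blocks (an assumption, since real insertions and deletions change copy length) and checks that each block is within a chosen number of edits of its neighbour. The function names and the `max_edits` parameter are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance by dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def is_approx_tandem(seq: str, period: int, max_edits: int) -> bool:
    """True if consecutive blocks of length `period` differ by <= max_edits."""
    copies = [seq[i:i + period] for i in range(0, len(seq), period)]
    if len(copies) < 2:
        return False
    # Compare each copy with the next one, so variation may drift gradually.
    return all(edit_distance(x, y) <= max_edits
               for x, y in zip(copies, copies[1:]))


if __name__ == "__main__":
    candidate = "ACGTACGAACGT"  # three 4-bp copies, the middle one mutated
    print(is_approx_tandem(candidate, period=4, max_edits=1))
    print(is_approx_tandem(candidate, period=4, max_edits=0))
```

Comparing each copy to its neighbour rather than to a single consensus pattern is one possible design choice; it tolerates gradual change along the array at the cost of allowing the first and last copies to differ substantially.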
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae.

PubMed

Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta

2012-11-07

Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae

PubMed Central

2012-01-01

Background: Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results: ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions: We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found. PMID:23134664
A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
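The relationship between the reported coding-region length and protein length can be checked with simple arithmetic. The helper below is a hypothetical sketch; it assumes the 9003 bp figure includes the terminal stop codon, which the abstract does not state explicitly but which is the reading under which the two numbers agree.

```python
def protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and contributes no residue.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 9003 bp is the alpaca CHD7 coding-region length given in the abstract.
    print(protein_length(9003))
```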
A theory that may explain the Hayflick limit--a means to delete one copy of a repeating sequence during each cell cycle in certain human cells such as fibroblasts.

PubMed

Naveilhan, P; Baudet, C; Jabbour, W; Wion, D

1994-09-01

A model that may explain the limited division potential of certain cells such as human fibroblasts in culture is presented. The central postulate of this theory is that there exists, prior to certain key exons that code for materials needed for cell division, a unique sequence of specific repeating segments of DNA. One copy of such repeating segments is deleted during each cell cycle in cells that are not protected from such deletion through methylation of their cytosine residues. According to this theory, the means through which such repeated sequences are removed, one per cycle, is through the sequential action of enzymes that act much as bacterial restriction enzymes do--namely to produce scissions in both strands of DNA in areas that correspond to the DNA base sequence recognition specificities of such enzymes. After the first scission early in a replicative cycle, that enzyme becomes inhibited, but the cleavage of the first site exposes the closest site in the repetitive element to the action of a second restriction enzyme after which that enzyme also becomes inhibited. Then repair occurs, regenerating the original first site. Through this sequential activation and inhibition of two different restriction enzymes, only one copy of the repeating sequence is deleted during each cell cycle. In effect, the repeating sequence operates as a precise counter of the numbers of cell doubling that have occurred since the cells involved differentiated during development.
Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species.

PubMed

Kuipers, A G J; Kamstra, S A; de Jeu, M J; Visser, R G F

2002-01-01

Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragments with sizes varying from 68-127 bp, and constituted a larger HinfI repeat of approximately 400 bp. Southern hybridization showed a similar molecular organization of the tandem repeats in each of the Brazilian Alstroemeria species tested. None of the repeats hybridized with DNA from Chilean Alstroemeria species, which indicates that they are specific for the Brazilian species. In-situ localization studies revealed the tandem repeats to be localized in clusters on the chromosomes of A. inodora and A. psittacina: distal hybridization sites were found on chromosome arms 2PS, 6PL, 7PS, 7PL and 8PL, interstitial sites on chromosome arms 2PL, 3PL, 4PL and 5PL. The applicability of the tandem repeats for cytogenetic analysis of interspecific hybrids and their role in heterochromatin organization are discussed.
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

USDA-ARS's Scientific Manuscript database

Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Are the TTAGG and TTAGGG telomeric repeats phylogenetically conserved in aculeate Hymenoptera?

NASA Astrophysics Data System (ADS)

Menezes, Rodolpho S. T.; Bardella, Vanessa B.; Cabral-de-Mello, Diogo C.; Lucena, Daercio A. A.; Almeida, Eduardo A. B.

2017-10-01

Although the (TTAGG)n telomeric repeat is thought to be the ancestral DNA motif of insect telomeres, it has been lost repeatedly within some insect orders. Notably, parasitoid hymenopterans and the social wasp Metapolybia decorata (Gribodo) lack the (TTAGG)n sequence, but the motif has been detected in other representatives of Hymenoptera, such as different ant species and the honeybee. These findings raise the question of whether or not the insect telomeric repeat is phylogenetically predominant in Hymenoptera. Thus, we evaluated the occurrence of both the (TTAGG)n sequence and the vertebrate telomere sequence (TTAGGG)n using dot-blotting hybridization in 25 aculeate species of Hymenoptera. Our results revealed the absence of the (TTAGG)n sequence in all tested species, elevating the number of hymenopteran families lacking this telomeric sequence to 13 out of the 15 tested families so far. The (TTAGGG)n sequence was not observed in any tested species. Based on our data and compiled information, we suggest that the (TTAGG)n sequence was putatively lost in the ancestor of Apocrita with at least two subsequent independent regains (in Formicidae and Apidae).

Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

PubMed Central

Macas, Jiří; Neumann, Pavel; Navrátilová, Alice

2007-01-01

Background: Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results: Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion: We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining. PMID:18031571
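The coverage figure in this abstract implies an approximate genome size, which in turn puts the repeat fractions into absolute terms. The calculation below is a sketch that uses only numbers quoted in the abstract; it assumes coverage is defined as sequenced bases divided by haploid genome size, and the variable names are illustrative.

```python
# Values quoted in the abstract.
SEQUENCED_MB = 33.3          # sequence data analysed
COVERAGE_FRACTION = 0.0077   # 0.77% coverage from a single 454 run
OGRE_MIN_FRACTION = 0.20     # Ogre-like elements: "over 20%" of the genome
REPEAT_FRACTIONS = (0.35, 0.48)  # reconstructed repeat families

# Implied haploid genome size, assuming coverage = sequenced / genome size.
genome_mb = SEQUENCED_MB / COVERAGE_FRACTION

# Lower bound on DNA occupied by Ogre-like retrotransposons.
ogre_min_mb = OGRE_MIN_FRACTION * genome_mb

# Range of DNA accounted for by the reconstructed repeat families.
repeat_mb = tuple(f * genome_mb for f in REPEAT_FRACTIONS)

if __name__ == "__main__":
    print(f"implied genome size: about {genome_mb:.0f} Mb")
    print(f"Ogre-like elements: more than {ogre_min_mb:.0f} Mb")
    print(f"major repeats: {repeat_mb[0]:.0f} to {repeat_mb[1]:.0f} Mb")
```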
Molecular basis of length polymorphism in the human zeta-globin gene complex.

PubMed Central

Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

1983-01-01

The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
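The length-based definitions in the abstract can be applied directly to the listed elements as a consistency check. In the sketch below the names and lengths are copied from the abstract; the `classify` function and the treatment of a region of exactly 55 residues as unclassified are assumptions, since the abstract only defines "less than 55" and "greater than 55".

```python
THRESHOLD = 55  # residues; boundary between "repeat" and "domain"

# Element name -> length in amino-acid residues, as listed in the abstract.
ELEMENT_LENGTHS = {
    "PxV": 57, "FxF": 122, "YEFF": 111, "IMxxH": 109, "VxxT": 103,
    "ExW": 84, "NTGFIG": 104, "NxGK": 36, "VYV": 95, "KEWE": 75,
    "AFL": 59, "RIDVK": 53, "AGQF": 41, "GSAL": 42,
}


def classify(length: int) -> str:
    """Label a conserved region by length using the abstract's definitions."""
    if length < THRESHOLD:
        return "repeat"
    if length > THRESHOLD:
        return "domain"
    return "unclassified"  # exactly 55 residues is not covered by the text


if __name__ == "__main__":
    labels = {name: classify(n) for name, n in ELEMENT_LENGTHS.items()}
    repeats = sorted(k for k, v in labels.items() if v == "repeat")
    domains = sorted(k for k, v in labels.items() if v == "domain")
    print(f"{len(repeats)} repeats: {', '.join(repeats)}")
    print(f"{len(domains)} domains: {', '.join(domains)}")
```

Applying the threshold reproduces the counts stated at the start of the abstract: four repeats and ten domains.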
Genetic profiling of Trypanosoma cruzi directly in infected tissues using nested PCR of polymorphic microsatellites.

PubMed

Valadares, Helder Magno Silva; Pimenta, Juliana Ramos; de Freitas, Jorge Marcelo; Duffy, Tomás; Bartholomeu, Daniella C; Oliveira, Riva de Paula; Chiari, Egler; Moreira, Maria da Consolação Vieira; Filho, Geraldo Brasileiro; Schijman, Alejandro Gabriel; Franco, Glória Regina; Machado, Carlos Renato; Pena, Sérgio Danilo Junho; Macedo, Andréa Mara

2008-06-01

The investigation of the importance of the genetics of Trypanosoma cruzi in determining the clinical course of Chagas disease will depend on precise characterisation of the parasites present in the tissue lesions. This can be adequately accomplished by the use of hypervariable nuclear markers such as microsatellites. However the unilocal nature of these loci and the scarcity of parasites in chronic lesions make it necessary to use high sensitivity PCR with nested primers, whose design depends on the availability of long flanking regions, a feature not hitherto available for any known T. cruzi microsatellites. Herein, making use of the extensive T. cruzi genome sequence now available and using the Tandem Repeats Finder software, it was possible to identify and characterise seven new microsatellite loci--six composed of trinucleotide (TcTAC15, TcTAT20, TcAAT8, TcATT14, TcGAG10 and TcCAA10) and one composed of tetranucleotide (TcAAAT6) motifs. All except the TcCAA10 locus were physically mapped onto distinct intergenic regions of chromosome III of the CL Brener clone contigs. The TcCAA10 locus was localised within a hypothetical protein gene in the T. cruzi genome. All microsatellites were polymorphic and useful for T. cruzi genetic variability studies. Using the TcTAC15 locus it was possible to separate the strains belonging to the T. cruzi I lineage (DTU I) from those belonging to T. cruzi II (DTU IIb), T. cruzi III (DTU IIc) and a hybrid group (DTU IId, IIe). The long flanking regions of these novel microsatellites allowed construction of nested primers and the use of full nested PCR protocols. This strategy enabled us to detect and differentiate T. cruzi strains directly in clinical specimens including heart, blood, CSF and skin tissues from patients in the acute and chronic phases of Chagas disease.
Mango (Mangifera indica L.) germplasm diversity based on single nucleotide polymorphisms derived from the transcriptome.

PubMed

Sherman, Amir; Rubinstein, Mor; Eshed, Ravit; Benita, Miri; Ish-Shalom, Mazal; Sharabi-Schwager, Michal; Rozen, Ada; Saada, David; Cohen, Yuval; Ophir, Ron

2015-11-14

Germplasm collections are an important source for plant breeding, especially in fruit trees, which have a long juvenile period. Thus, efforts have been made to study the diversity of fruit tree collections. Even though mango is an economically important crop, most of the studies on diversity in mango collections have been conducted with a small number of genetic markers. We describe a de novo transcriptome assembly from mango cultivar 'Keitt'. Variation discovery using Illumina resequencing of the 'Keitt' and 'Tommy Atkins' cultivars identified 332,016 single-nucleotide polymorphisms (SNPs) and 1903 simple-sequence repeats (SSRs). Most of the SSRs (70.1%) were trinucleotide repeats, predominantly with the (GGA/AAG)n motif, and only 23.5% were dinucleotide SSRs, mostly with the (AT/AT)n motif. Further investigation of the diversity in the Israeli mango collection was performed based on a subset of 293 SNPs. Those markers divided the Israeli mango collection into two major groups: one group included mostly mango accessions from Southeast Asia (Malaysia, Thailand, Indonesia) and India, and the other included mainly Floridian and Israeli mango cultivars. The latter group was more polymorphic (FS=-0.1 on the average) and was more of an admixture than the former group. A slight population differentiation was detected (FST=0.03), suggesting that if the mango accessions of the western world originated from Southeast Asia, as has been previously suggested, the duration of cultivation was not long enough to develop a distinct genetic background. Whole-transcriptome reconstruction was used to significantly broaden the mango's genetic variation resources, i.e., SNPs and SSRs. The set of SNP markers described in this study is novel. A subset of SNPs was sampled to explore the Israeli mango collection and most of them were polymorphic in many mango accessions. Therefore, we believe that these SNPs will be valuable as they recapitulate and strengthen the history of mango diversity.
Complete mitochondrial genome of the larch hawk moth, Sphinx morio (Lepidoptera: Sphingidae).

PubMed

Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo

2013-12-01

The larch hawk moth, Sphinx morio, belongs to the lepidopteran family Sphingidae that has long been studied as a family of model insects in diverse fields. In this study, we describe the complete mitochondrial genome (mitogenome) sequence of the species in terms of general genomic features and characteristic short repetitive sequences found in the A + T-rich region. The 15,299-bp-long genome consisted of a typical set of genes (13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes) and one major non-coding A + T-rich region, with the typical arrangement found in Lepidoptera. The 316-bp-long A + T-rich region located between srRNA and tRNA(Met) harbored the conserved sequence blocks that are typically found in lepidopteran insects. Additionally, the A + T-rich region of S. morio contained three characteristic repeat sequences that are rarely found in Lepidoptera: two identical 12-bp repeats, three identical 5-bp-long tandem repeats, and six nearly identical 5-6 bp long repeat sequences.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other species of prokaryotes remain to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
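The SSR definition in the abstract above (tandem repeats of motifs 1-6 bp long) can be illustrated with a minimal Python sketch. This is not the PSSRdb extraction pipeline: the function name find_ssrs, the min_repeats threshold, and the restriction to perfect repeats are assumptions made only for illustration.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, copies) for perfect tandem repeats.

    Hypothetical helper: motif sizes 1..max_motif follow the SSR
    definition quoted above; min_repeats is an illustrative threshold.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" for k=2),
            # since those runs are already reported at the shorter motif size.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

# Toy sequence: four copies of AC followed by three copies of GAT.
print(find_ssrs("GGACACACACTCGATGATGATCC"))
```

Comparing the copy number reported for the same locus across strains would be one simple way to flag a length-polymorphic SSR, although the database authors' actual criteria are not given in the abstract.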
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed Central

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-01-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. PMID:3016521
Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage.

PubMed

Dvořáková, Zuzana; Vorlíčková, Michaela; Renčiuk, Daniel

2017-11-01

The DNA lesions, resulting from oxidative damage, were shown to destabilize human telomere four-repeat quadruplex and to alter its structure. Long telomere DNA, as a repetitive sequence, offers, however, other mechanisms of dealing with the lesion: extrusion of the damaged repeat into loop or shifting the quadruplex position by one repeat. Using circular dichroism and UV absorption spectroscopy and polyacrylamide electrophoresis, we studied consequences of lesions at different positions of the model five-repeat human telomere DNA sequences on the structure and stability of their quadruplexes in sodium and in potassium. The repeats affected by lesion are preferentially positioned as terminal overhangs of the core quadruplex structurally similar to the four-repeat one. Forced affecting of the inner repeats leads to presence of variety of more parallel folds in potassium. In sodium the designed models form mixture of two dominant antiparallel quadruplexes whose population varies with the position of the affected repeat. The shapes of quadruplex CD spectra, namely the height of dominant peaks, significantly correlate with melting temperatures. Lesion in one guanine tract of a more than four repeats long human telomere DNA sequence may cause re-positioning of its quadruplex arrangement associated with a shift of the structure to less common quadruplex conformations. The type of the quadruplex depends on the loop position and external conditions. The telomere DNA quadruplexes are quite resistant to the effect of point mutations due to the telomere DNA repetitive nature, although their structure and, consequently, function might be altered. Copyright © 2017. Published by Elsevier B.V.
Microsatellite analysis in the genome of Acanthaceae: An in silico approach

PubMed Central

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. Microsatellites present in complete genome sequences may play a direct role in genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. Results: In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future. PMID:25709226
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
Comparative genome-wide polymorphic microsatellite markers in Antarctic penguins through next generation sequencing

PubMed Central

Vianna, Juliana A.; Noll, Daly; Mura-Jornet, Isidora; Valenzuela-Guerra, Paulina; González-Acuña, Daniel; Navarro, Cristell; Loyola, David E.; Dantas, Gisele P. M.

2017-01-01

Microsatellites are valuable molecular markers for evolutionary and ecological studies. Next generation sequencing is responsible for the increasing number of microsatellites for non-model species. The genus Pygoscelis comprises three species: Adélie (P. adeliae), Chinstrap (P. antarcticus) and Gentoo penguin (P. papua), all distributed around Antarctica and the sub-Antarctic. The species have been affected differently by climate change, and the use of microsatellite markers will be crucial to monitor population dynamics. We characterized a large set of genome-wide microsatellites and evaluated polymorphisms in all three species. SOLiD reads were generated from the libraries of each species, identifying a large number of microsatellite loci: 33,677, 35,265 and 42,057 for P. adeliae, P. antarcticus and P. papua, respectively. A large number of dinucleotide (66,139), trinucleotide (29,490) and tetranucleotide (11,849) microsatellites are described. Microsatellite abundance, diversity and orthology were characterized in penguin genomes. We evaluated polymorphisms in 170 tetranucleotide loci, obtaining 34 polymorphic loci in at least one species and 15 polymorphic loci in all three species, which allows comparative studies. Polymorphic markers presented here enable a number of ecological, population, individual identification, parentage and evolutionary studies of Pygoscelis, with potential use in other penguin species. PMID:28898354
DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest.

PubMed

Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang

2018-01-05

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html.
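The two performance measures quoted in the abstract above, the Matthews correlation coefficient (MCC) and accuracy, are both computed from a binary confusion matrix. The following Python sketch uses the standard definitions; the function names and the example counts are hypothetical and are not taken from the DHSpred study.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts (standard definition)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts, chosen only to exercise the functions.
tp, tn, fp, fn = 90, 85, 15, 10
print(round(matthews_corrcoef(tp, tn, fp, fn), 3), accuracy(tp, tn, fp, fn))
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes are unevenly represented.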
DHSpred: support-vector-machine-based human DNase I hypersensitive sites prediction using the optimal features selected by random forest

PubMed Central

Manavalan, Balachandran; Shin, Tae Hwan; Lee, Gwang

2018-01-01

DNase I hypersensitive sites (DHSs) are genomic regions that provide important information regarding the presence of transcriptional regulatory elements and the state of chromatin. Therefore, identifying DHSs in uncharacterized DNA sequences is crucial for understanding their biological functions and mechanisms. Although many experimental methods have been proposed to identify DHSs, they have proven to be expensive for genome-wide application. Therefore, it is necessary to develop computational methods for DHS prediction. In this study, we proposed a support vector machine (SVM)-based method for predicting DHSs, called DHSpred (DNase I Hypersensitive Site predictor in human DNA sequences), which was trained with 174 optimal features. The optimal combination of features was identified from a large set that included nucleotide composition and di- and trinucleotide physicochemical properties, using a random forest algorithm. DHSpred achieved a Matthews correlation coefficient and accuracy of 0.660 and 0.871, respectively, which were 3% higher than those of control SVM predictors trained with non-optimized features, indicating the efficiency of the feature selection method. Furthermore, the performance of DHSpred was superior to that of state-of-the-art predictors. An online prediction server has been developed to assist the scientific community, and is freely available at: http://www.thegleelab.org/DHSpred.html PMID:29416743
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
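The idea of iteratively collapsing repeats of increasing length can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' compression scheme: the function name collapse_tandem_repeats, the max_unit bound, and the use of exact (perfect) repeats are choices made here for brevity.

```python
import re

def collapse_tandem_repeats(seq, max_unit=4):
    """Collapse perfect tandem repeats, shortest unit length first.

    Lossy by design: each run of two or more copies of a unit is
    replaced by a single copy, so the copy number is discarded.
    """
    for k in range(1, max_unit + 1):
        # (.{k})\1+ matches a k-character unit followed by one or more copies.
        seq = re.sub(r"(.{%d})\1+" % k, r"\1", seq)
    return seq

print(collapse_tandem_repeats("AAACGCGCGT"))   # run of A, then run of CG
print(collapse_tandem_repeats("ATGATGATGCC"))  # run of ATG, then run of C
```

Two sequences that differ only in the copy number of a tandem repeat collapse to the same string, so a conventional alignment of the collapsed strings would not penalize the duplication event.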
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.

PubMed

Brzuzan, P

2000-06-01

Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
Improved nucleic acid descriptors for siRNA efficacy prediction.

PubMed

Sciabola, Simone; Cao, Qing; Orozco, Modesto; Faustino, Ignacio; Stanton, Robert V

2013-02-01

Although considerable progress has been made recently in understanding how gene silencing is mediated by the RNAi pathway, the rational design of effective sequences is still a challenging task. In this article, we demonstrate that including three-dimensional descriptors improved the discrimination between active and inactive small interfering RNAs (siRNAs) in a statistical model. Five descriptor types were used: (i) nucleotide position along the siRNA sequence, (ii) nucleotide composition in terms of presence/absence of specific combinations of di- and trinucleotides, (iii) nucleotide interactions by means of a modified auto- and cross-covariance function, (iv) nucleotide thermodynamic stability derived by the nearest neighbor model representation and (v) nucleic acid structure flexibility. The duplex flexibility descriptors are derived from extended molecular dynamics simulations, which are able to describe the sequence-dependent elastic properties of RNA duplexes, even for non-standard oligonucleotides. The matrix of descriptors was analysed using three statistical packages in R (partial least squares, random forest, and support vector machine), and the most predictive model was implemented in a modeling tool we have made publicly available through SourceForge. Our implementation of new RNA descriptors coupled with appropriate statistical algorithms resulted in improved model performance for the selection of siRNA candidates when compared with publicly available siRNA prediction tools and previously published test sets. Additional validation studies based on in-house RNA interference projects confirmed the robustness of the scoring procedure in prospective studies.
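Descriptor type (ii) above, the presence or absence of specific di- and trinucleotides, can be encoded as a binary feature vector. The sketch below is one minimal way to do this in Python; the function name kmer_presence and the dictionary layout are assumptions for illustration and do not reproduce the published tool.

```python
from itertools import product

def kmer_presence(seq, alphabet="ACGU", ks=(2, 3)):
    """Map every possible k-mer (k in ks) to 1 if it occurs in seq, else 0."""
    seq = seq.upper()
    features = {}
    for k in ks:
        # All k-mers observed in the sequence, using a sliding window.
        present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for kmer in map("".join, product(alphabet, repeat=k)):
            features[kmer] = int(kmer in present)
    return features

# Toy RNA fragment; 16 dinucleotide + 64 trinucleotide features in total.
feats = kmer_presence("GGAUC")
print(len(feats), sum(feats.values()))
```

With the RNA alphabet this yields 80 binary features per sequence, which could then be concatenated with the other descriptor types before model fitting.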
Compositional Bias in Naïve and Chemically-modified Phage-Displayed Libraries uncovered by Paired-end Deep Sequencing.

PubMed

He, Bifang; Tjhung, Katrina F; Bennett, Nicholas J; Chou, Ying; Rau, Andrea; Huang, Jian; Derda, Ratmir

2018-01-19

Understanding the composition of a genetically-encoded (GE) library is instrumental to the success of ligand discovery. In this manuscript, we investigate the bias in GE-libraries of linear, macrocyclic and chemically post-translationally modified (cPTM) tetrapeptides displayed on the M13KE platform, which are produced via trinucleotide cassette synthesis (19 codons) and NNK-randomized codon. Differential enrichment of synthetic DNA {S}, ligated vector {L} (extension and ligation of synthetic DNA into the vector), naïve libraries {N} (transformation of the ligated vector into the bacteria followed by expression of the library for 4.5 hours to yield a "naïve" library), and libraries chemically modified by aldehyde ligation and cysteine macrocyclization {M} characterized by paired-end deep sequencing, detected a significant drop in diversity in {L} → {N}, but only a minor compositional difference in {S} → {L} and {N} → {M}. Libraries expressed at the N-terminus of phage protein pIII censored positively charged amino acids Arg and Lys; libraries expressed between pIII domains N1 and N2 overcame Arg/Lys-censorship but introduced new bias towards Gly and Ser. Interrogation of biases arising from cPTM by aldehyde ligation and cysteine macrocyclization unveiled censorship of sequences with Ser/Phe. Analogous analysis can be used to explore library diversity in new display platforms and optimize cPTM of these libraries.
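For readers unfamiliar with NNK randomization mentioned above: N stands for any of the four bases and K for G or T, so an NNK position samples 32 codons. The Python sketch below enumerates them under the standard genetic code and counts how often each amino acid is encoded; the helper names are hypothetical, and the sketch only describes the codon scheme itself, not the library biases measured in the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def nnk_codons():
    """All codons matching NNK (N = A/C/G/T, K = G/T)."""
    return [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]

def nnk_usage():
    """Number of NNK codons encoding each amino acid ('*' = stop)."""
    return Counter(CODON_TABLE[c] for c in nnk_codons())

usage = nnk_usage()
print(len(nnk_codons()), usage["*"], sorted(usage.items()))
```

The counts show that NNK covers all 20 amino acids with a single stop codon (TAG) but encodes them unevenly, for example three codons for Leu, Ser and Arg against one for Met and Trp. This is an inherent source of compositional skew at the DNA level, in contrast to a trinucleotide cassette design with one codon per chosen amino acid.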
Plasmid integration in a wide range of bacteria mediated by the integrase of Lactobacillus delbrueckii bacteriophage mv4.

PubMed Central

Auvray, F; Coddeville, M; Ritzenthaler, P; Dupont, L

1997-01-01

Bacteriophage mv4 is a temperate phage infecting Lactobacillus delbrueckii subsp. bulgaricus. During lysogenization, the phage integrates its genome into the host chromosome at the 3' end of a tRNA(Ser) gene through a site-specific recombination process (L. Dupont et al., J. Bacteriol., 177:586-595, 1995). A nonreplicative vector (pMC1) based on the mv4 integrative elements (attP site and integrase-coding int gene) is able to integrate into the chromosome of a wide range of bacterial hosts, including Lactobacillus plantarum, Lactobacillus casei (two strains), Lactococcus lactis subsp. cremoris, Enterococcus faecalis, and Streptococcus pneumoniae. Integrative recombination of pMC1 into the chromosomes of all of these species is dependent on the int gene product and occurs specifically at the pMC1 attP site. The isolation and sequencing of pMC1 integration sites from these bacteria showed that in lactobacilli, pMC1 integrated into the conserved tRNA(Ser) gene. In the other bacterial species, where this tRNA gene is less conserved or not conserved, secondary integration sites either in potential protein-coding regions or in intergenic DNA were used. A consensus sequence was deduced from the analysis of the different integration sites. The comparison of these sequences demonstrated the flexibility of the integrase for the bacterial integration site and suggested the importance of the trinucleotide CCT at the 5' end of the core in the strand exchange reaction. PMID:9068626
Artificial sRNAs activating the Gac/Rsm signal transduction pathway in Pseudomonas fluorescens.

PubMed

Valverde, Claudio

2009-04-01

In Pseudomonas fluorescens CHA0, the synthesis of antifungal compounds is post-transcriptionally activated by the Gac/Rsm cascade. The two-component system GacS/GacA promotes transcription of three small regulatory RNAs (i.e., sRNAs), RsmX, RsmY, and RsmZ, which remove the regulatory proteins RsmA and RsmE from the ribosome-binding sites of exoproduct-related mRNAs. The GacS/GacA-dependent accumulation of RsmX/Y/Z and formation of RsmX/Y/Z-RsmA/E complexes relieve mRNA translational repression. Other bacteria, such as E. coli and Vibrio spp., utilize similar sRNA-protein based systems to adjust mRNA translation (e.g., the E. coli Csr system for carbon storage, motility and biofilm regulation). The Rsm/Csr sRNAs are remarkably similar in that they contain several stem-loops with an invariant GGA trinucleotide exposed in the hairpin loop that would be the characteristic structural-sequence motifs relevant for sRNA activity and stability. Here it is shown that the dysfunctional Gac/Rsm cascade of P. fluorescens ΔrsmXYZ mutants could be restored by appropriate transcription levels of artificial genes encoding RNAs with unrelated primary sequence but with two or more hairpins displaying the RsmA/E binding motifs. The results support the hypothesis that the molecular mimicry of Rsm/Csr sRNAs is based on proper secondary structures that expose critical binding motifs irrespective of their overall sequence.

Reprint of: Early Behavioural Facilitation by Temporal Expectations in Complex Visual-motor Sequences.

PubMed

Heideman, Simone G; van Ede, Freek; Nobre, Anna C

2018-05-24

In daily life, temporal expectations may derive from incidental learning of recurring patterns of intervals. We investigated the incidental acquisition and utilisation of combined temporal-ordinal (spatial/effector) structure in complex visual-motor sequences using a modified version of a serial reaction time (SRT) task. In this task, not only the series of targets/responses, but also the series of intervals between subsequent targets was repeated across multiple presentations of the same sequence. Each participant completed three sessions. In the first session, only the repeating sequence was presented. During the second and third session, occasional probe blocks were presented, where a new (unlearned) spatial-temporal sequence was introduced. We first confirm that participants not only got faster over time, but that they were slower and less accurate during probe blocks, indicating that they incidentally learned the sequence structure. Having established a robust behavioural benefit induced by the repeating spatial-temporal sequence, we next addressed our central hypothesis that implicit temporal orienting (evoked by the learned temporal structure) would have the largest influence on performance for targets following short (as opposed to longer) intervals between temporally structured sequence elements, paralleling classical observations in tasks using explicit temporal cues. We found that indeed, reaction time differences between new and repeated sequences were largest for the short interval, compared to the medium and long intervals, and that this was the case, even when comparing late blocks (where the repeated sequence had been incidentally learned), to early blocks (where this sequence was still unfamiliar). 
We conclude that incidentally acquired temporal expectations that follow a sequential structure can have a robust facilitatory influence on visually-guided behavioural responses and that, like more explicit forms of temporal orienting, this effect is most pronounced for sequence elements that are expected at short inter-element intervals. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
Heterogeneity of the Epstein-Barr Virus (EBV) Major Internal Repeat Reveals Evolutionary Mechanisms of EBV and a Functional Defect in the Prototype EBV Strain B95-8.

PubMed

Ba Abdullah, Mohammed M; Palermo, Richard D; Palser, Anne L; Grayson, Nicholas E; Kellam, Paul; Correia, Samantha; Szymula, Agnieszka; White, Robert E

2017-12-01

Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified through both coevolution with its host and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging because of the large number and lengths of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat 1 of EBV (IR1; also known as the BamW repeats) for more than 70 strains. The diversity of the latency protein EBV nuclear antigen leader protein (EBNA-LP) resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 open reading frame (ORF) is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp) and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as from spontaneous mutation, with interstrain recombination being more common in tumor-derived viruses. This genetic exchange often incorporates regions of <1 kb, and allelic gene conversion changes the frequency of small regions within the repeat but not close to the flanks. These observations suggest that IR1, and by extension EBV, diversifies through both recombination and breakpoint repair, while concerted evolution of IR1 is driven by gene conversion of small regions. Finally, the prototype EBV strain B95-8 contains four nonconsensus variants within a single IR1 repeat unit, including a stop codon in the EBNA-LP gene. Repairing IR1 improves EBNA-LP levels and the quality of transformation by the B95-8 bacterial artificial chromosome (BAC). IMPORTANCE Epstein-Barr virus (EBV) infects the majority of the world population but causes illness in only a small minority of people. 
Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity to see if different strains have different disease impacts have excluded regions of repeating sequence, as they are more technically challenging. Here we analyze the sequence of the largest repeat in EBV (IR1). We first characterized the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and we suggest that tumor-associated viruses may be more likely to contain DNA mixed from two strains. The patterns of this mixing suggest that sequences can spread between strains (and also within the repeat) by copying sequence from another strain (or repeat unit) to repair DNA damage. Copyright © 2017 Ba abdullah et al.
Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

PubMed

Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

2016-03-01

The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Translocation and gross deletion breakpoints in human inherited disease and cancer II: Potential involvement of repetitive sequence elements in secondary structure formation between DNA ends.

PubMed

Chuzhanova, Nadia; Abeysinghe, Shaun S; Krawczak, Michael; Cooper, David N

2003-09-01

Translocations and gross deletions are responsible for a significant proportion of both cancer and inherited disease. Although such gene rearrangements are nonuniformly distributed in the human genome, the underlying mutational mechanisms remain unclear. We have studied the potential involvement of various types of repetitive sequence elements in the formation of secondary structure intermediates between the single-stranded DNA ends that recombine during rearrangements. Complexity analysis was used to assess the potential of these ends to form secondary structures, the maximum decrease in complexity consequent to a gross rearrangement being used as an indicator of the type of repeat and the specific DNA ends involved. A total of 175 pairs of deletion/translocation breakpoint junction sequences available from the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] were analyzed. Potential secondary structure was noted between the 5' flanking sequence of the first breakpoint and the 3' flanking sequence of the second breakpoint in 49% of rearrangements and between the 5' flanking sequence of the second breakpoint and the 3' flanking sequence of the first breakpoint in 36% of rearrangements. Inverted repeats, inversions of inverted repeats, and symmetric elements were found in association with gross rearrangements at approximately the same frequency. However, inverted repeats and inversions of inverted repeats accounted for the vast majority (83%) of deletions plus small insertions, symmetric elements for one-half of all antigen receptor-mediated translocations, while direct repeats appear only to be involved in mediating simple deletions. These findings extend our understanding of illegitimate recombination by highlighting the importance of secondary structure formation between single-stranded DNA ends at breakpoint junctions. Copyright 2003 Wiley-Liss, Inc.
Perceived empty duration between sounds of different lengths: Possible relation with repetition and rhythmic grouping.

PubMed

Kuroda, Tsuyoshi; Tomimatsu, Erika; Grondin, Simon; Miyazaki, Makoto

2016-11-01

We investigated how perceived duration of empty time intervals would be modulated by the length of sounds marking those intervals. Three sounds were successively presented in Experiment 1. Each sound was short (S) or long (L), and the temporal position of the middle sound's onset was varied. The lengthening of each sound resulted in delayed perception of the onset; thus, the middle sound's onset had to be presented earlier in the SLS than in the LSL sequence so that participants perceived the three sounds as presented at equal interonset intervals. In Experiment 2, a short sound and a long sound were alternated repeatedly, and the relative duration of the SL interval to the LS interval was varied. This repeated sequence was perceived as consisting of equal interonset intervals when the onsets of all sounds were aligned at physically equal intervals. If the same onset delay as in the preceding experiment had occurred, participants should have perceived equality between the interonset intervals in the repeated sequence when the SL interval was physically shortened relative to the LS interval. The effects of sound length seemed to be canceled out when the presentation of intervals was repeated. Finally, the perceived duration of the interonset intervals in the repeated sequence was not influenced by whether the participant's native language was French or Japanese, or by how the repeated sequence was perceptually segmented into rhythmic groups.
Genetic and DNA sequence analysis of the kanamycin resistance transposon Tn903.

PubMed Central

Grindley, N D; Joyce, C M

1980-01-01

The kanamycin resistance transposon Tn903 consists of a unique region of about 1000 base pairs bounded by a pair of 1050-base-pair inverted repeat sequences. Each repeat contains two Pvu II endonuclease cleavage sites separated by 520 base pairs. We have constructed derivatives of Tn903 in which this 520-base-pair fragment is deleted from one or both repeats. Those derivatives that lack both 520-base-pair fragments cannot transpose, whereas those that lack just one remain transposition proficient. One such transposable derivative, Tn903 delta 1, has been selected for further study. We have determined the sequence of the intact inverted repeat. The 18 base pairs at each end are identical and inverted relative to one another, a structure characteristic of insertion sequences. Additional experiments indicate that a single inverted repeat from Tn903 can, in fact, transpose; we propose that this element be called IS903. To correlate the DNA sequence with genetic activities, we have created mutations by inserting a 10-base-pair DNA fragment at several sites within the intact repeat of Tn903 delta 1, and we have examined the effect of such insertions on transposability. The results suggest that IS903 encodes a 307-amino-acid polypeptide (a "transposase") that is absolutely required for transposition of IS903 or Tn903. PMID:6261245
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.

PubMed

Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru

2015-01-01

The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

PubMed Central

Ni, Xiangyang; Westpheling, Janet

1997-01-01

The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

PubMed

Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
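The single linkage clustering step mentioned above can be illustrated with a short sketch. This is not CARP's code: it assumes the pairwise aligner has already produced a list of hit pairs, and it groups intervals into families with a union-find structure, so that any two intervals connected by a chain of hits land in the same family. The interval labels and the function name are hypothetical.

```python
from collections import defaultdict


def single_linkage_clusters(pairs):
    """Group items so that any chain of pairwise hits joins them into one family."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = defaultdict(set)
    for item in list(parent):
        families[find(item)].add(item)
    return sorted(sorted(members) for members in families.values())


# Hypothetical pairwise alignment hits between genomic intervals.
hits = [
    ("chr1:100-400", "chr2:50-350"),
    ("chr2:50-350", "chr3:10-310"),
    ("chr4:5-205", "chr5:9-209"),
]
families = single_linkage_clusters(hits)
for family in families:
    print(family)
```

Single linkage is permissive by design: one bridging alignment is enough to merge two groups, which is why downstream filtering and annotation of the consensus sequences matter.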
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies

PubMed Central

Zeng, Lu; Kortschak, R. Daniel; Raison, Joy M.

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441
Origin of the CMS gene locus in rapeseed cybrid mitochondria: active and inactive recombination produces the complex CMS gene region in the mitochondrial genomes of Brassicaceae.

PubMed

Oshima, Masao; Kikuchi, Rie; Imamura, Jun; Handa, Hirokazu

2010-01-01

CMS (cytoplasmic male sterile) rapeseed is produced by asymmetrical somatic cell fusion between the Brassica napus cv. Westar and the Raphanus sativus Kosena CMS line (Kosena radish). The CMS rapeseed contains a CMS gene, orf125, which is derived from Kosena radish. Our sequence analyses revealed that the orf125 region in CMS rapeseed originated from recombination between the orf125/orfB region and the nad1C/ccmFN1 region by way of a 63 bp repeat. A precise sequence comparison among the related sequences in CMS rapeseed, Kosena radish and normal rapeseed showed that the orf125 region in CMS rapeseed consisted of the Kosena orf125/orfB region and the rapeseed nad1C/ccmFN1 region, even though Kosena radish had both the orf125/orfB region and the nad1C/ccmFN1 region in its mitochondrial genome. We also identified three tandem repeat sequences in the regions surrounding orf125, including a 63 bp repeat, which were involved in several recombination events. Interestingly, differences in the recombination activity for each repeat sequence were observed, even though these sequences were located adjacent to each other in the mitochondrial genome. We report results indicating that recombination events within the mitochondrial genomes are regulated at the level of specific repeat sequences depending on the cellular environment.
Analysis of Two Cosmid Clones from Chromosome 4 of Drosophila melanogaster Reveals Two New Genes Amid an Unusual Arrangement of Repeated Sequences

PubMed Central

Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross

1999-01-01

Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978
Experimental definition of a clustered regularly interspaced short palindromic duplicon in Escherichia coli.

PubMed

Goren, Moran G; Yosef, Ido; Auster, Oren; Qimron, Udi

2012-10-12

We analyzed sequences of newly inserted repeats in an Escherichia coli CRISPR (clustered regularly interspaced short palindromic repeats) array in vivo and showed that a base previously thought to belong to the repeat is actually derived from a protospacer. Based on further experimental results, we propose to use the term "duplicon" for a repeated sequence in a CRISPR array that serves as a template for a new duplicon. Our findings suggest the possibility of redrawing the borders between repeats, spacers, and protospacer adjacent motifs. Copyright © 2012 Elsevier Ltd. All rights reserved.
Phylogeny and strain typing of Escherichia coli, inferred from variation at mononucleotide repeat loci.

PubMed

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M; Kashi, Yechezkel

2004-04-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria.
Phylogeny and Strain Typing of Escherichia coli, Inferred from Variation at Mononucleotide Repeat Loci

PubMed Central

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M.; Kashi, Yechezkel

2004-01-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria. PMID:15066845
Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

PubMed

Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

1992-02-01

The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 x 10^5 copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.
“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

PubMed Central

2014-01-01

Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a Perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
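To make the parsing problem concrete, here is a minimal Python sketch (the tool itself is written in Perl, and this is not its algorithm). It assumes the usual whitespace-separated layout of a RepeatMasker .out file, with 15 columns per hit ending in an ID column, and treats hits that share an ID as fragments of one copy, which is a simplification of the merging rules the abstract describes. The sample records and function names are invented for illustration.

```python
from collections import Counter


def parse_repeatmasker_out(lines):
    """Parse hit lines of a RepeatMasker .out file into dictionaries."""
    hits = []
    for line in lines:
        fields = line.split()
        # Header and blank lines do not start with an integer score.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        hits.append({
            "score": int(fields[0]),
            "perc_div": float(fields[1]),
            "query": fields[4],
            "begin": int(fields[5]),
            "end": int(fields[6]),
            "strand": "-" if fields[8] == "C" else "+",
            "repeat": fields[9],
            "class_family": fields[10],
            "id": fields[14],
        })
    return hits


def bp_per_family(hits):
    """Sum the query bases covered by hits of each class/family."""
    totals = Counter()
    for hit in hits:
        totals[hit["class_family"]] += hit["end"] - hit["begin"] + 1
    return dict(totals)


# Invented records in the .out layout (not real annotations).
SAMPLE = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin end (left)  ID

 1306  15.6  6.2  0.0  chr2L     1     468 (23513244) +  HOBO      DNA/hAT      1 500  (2)     1
  500  10.0  0.0  0.0  chr2L     600   900 (23512812) C  ROO_LTR   LTR/Pao      (0) 428 120    2
  700  12.0  0.0  0.0  chr2L    1500  2500 (23511212) C  ROO_I     LTR/Pao      (0) 7000 6000  2
"""

hits = parse_repeatmasker_out(SAMPLE.splitlines())
coverage = bp_per_family(hits)
copy_count = len({hit["id"] for hit in hits})
print(coverage, copy_count)
```

In this toy input the LTR and internal hits share one ID, so three hits collapse to two putative copies, which is the kind of hit-versus-copy distinction the abstract highlights.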
Effects of "D"-Amphetamine and Ethanol on Variable and Repetitive Key-Peck Sequences in Pigeons

ERIC Educational Resources Information Center

Ward, Ryan D.; Bailey, Ericka M.; Odum, Amy L.

2006-01-01

This experiment assessed the effects of "d"-Amphetamine and ethanol on reinforced variable and repetitive key-peck sequences in pigeons. Pigeons responded on two keys under a multiple schedule of Repeat and Vary components. In the Repeat component, completion of a target sequence of right, right, left, left resulted in food. In the Vary component,…
The mitochondrial genome of the legume Vigna radiata and the analysis of recombination across short mitochondrial repeats.

PubMed

Alverson, Andrew J; Zhuo, Shi; Rice, Danny W; Sloan, Daniel B; Palmer, Jeffrey D

2011-01-20

The mitochondrial genomes of seed plants are exceptionally fluid in size, structure, and sequence content, with the accumulation and activity of repetitive sequences underlying much of this variation. We report the first fully sequenced mitochondrial genome of a legume, Vigna radiata (mung bean), and show that despite its unexceptional size (401,262 nt), the genome is unusually depauperate in repetitive DNA and "promiscuous" sequences from the chloroplast and nuclear genomes. Although Vigna lacks the large, recombinationally active repeats typical of most other seed plants, a PCR survey of its modest repertoire of short (38-297 nt) repeats nevertheless revealed evidence for recombination across all of them. A set of novel control assays showed, however, that these results could instead reflect, in part or entirely, artifacts of PCR-mediated recombination. Consequently, we recommend that other methods, especially high-depth genome sequencing, be used instead of PCR to infer patterns of plant mitochondrial recombination. The average-sized but repeat- and feature-poor mitochondrial genome of Vigna makes it ever more difficult to generalize about the factors shaping the size and sequence content of plant mitochondrial genomes.

Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
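The consensus and divergence figures reported above follow from a simple calculation, sketched here under the assumption that the monomers are already aligned and of equal length. The sequences are short invented examples, not Tenebrio satellite monomers, and the function names are hypothetical.

```python
from collections import Counter


def consensus(seqs):
    """Most common base at each aligned position."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def percent_divergence(seq, ref):
    """Percentage of aligned positions at which seq differs from ref."""
    mismatches = sum(a != b for a, b in zip(seq, ref))
    return 100.0 * mismatches / len(ref)


# Invented, pre-aligned monomers of equal length.
monomers = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGTACGTCC",
]
cons = consensus(monomers)
divergences = [percent_divergence(m, cons) for m in monomers]
mean_divergence = sum(divergences) / len(divergences)
print(cons, mean_divergence)
```

Comparing the divergence between neighboring monomers with that between randomly chosen pairs, using the same `percent_divergence` function, would be one way to approach the adjacency question raised in the abstract.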
Development and characterization of BAC-end sequence derived SSRs, and their incorporation into a new higher density genetic map for cultivated peanut (Arachis hypogaea L.)

PubMed Central

2012-01-01

Background Cultivated peanut (Arachis hypogaea L.) is an important crop worldwide, valued for its edible oil and digestible protein. It has a very narrow genetic base that may well derive from a relatively recent single polyploidization event. Accordingly molecular markers have low levels of polymorphism and the number of polymorphic molecular markers available for cultivated peanut is still limiting. Results Here, we report a large set of BAC-end sequences (BES), use them for developing SSR (BES-SSR) markers, and apply them in genetic linkage mapping. The majority of BESs had no detectable homology to known genes (49.5%) followed by sequences with similarity to known genes (44.3%), and miscellaneous sequences (6.2%) such as transposable element, retroelement, and organelle sequences. A total of 1,424 SSRs were identified from 36,435 BESs. Among these identified SSRs, dinucleotide (47.4%) and trinucleotide (37.1%) SSRs were predominant. The new set of 1,152 SSRs as well as about 4,000 published or unpublished SSRs were screened against two parents of a mapping population, generating 385 polymorphic loci. A genetic linkage map was constructed, consisting of 318 loci onto 21 linkage groups and covering a total of 1,674.4 cM, with an average distance of 5.3 cM between adjacent loci. Two markers related to resistance gene homologs (RGH) were mapped to two different groups, thus anchoring 1 RGH-BAC contig and 1 singleton. Conclusions The SSRs mined from BESs will be of use in further molecular analysis of the peanut genome, providing a novel set of markers, genetically anchoring BAC clones, and incorporating gene sequences into a linkage map. This will aid in the identification of markers linked to genes of interest and map-based cloning. PMID:22260238
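
The SSR mining step described in this abstract can be illustrated with a minimal sketch. It assumes perfect (uninterrupted) repeats and hypothetical minimum copy numbers (six for dinucleotides, five for trinucleotides); the abstract does not state the thresholds or the software the authors used, and `find_ssrs` is an illustrative name.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect di-/trinucleotide repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in min_repeats.items():
        # ([ACGT]{k}) captures one motif; \1{n,} demands at least n further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (not real peanut data): an (AG)8 and a (GAA)6 repeat with short flanks.
toy = "TTGC" + "AG" * 8 + "CCT" + "GAA" * 6 + "TC"
for start, motif, copies in find_ssrs(toy):
    print(start, motif, copies)
```

Dedicated SSR finders also handle imperfect and compound repeats, which this sketch deliberately ignores.
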
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

PubMed

Militello, Kevin T; Lazatin, Justine C

2017-05-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
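
The spacer-extraction step mentioned above (taking the regions between consecutive repeats) might be sketched as follows. This is only an illustration under simplifying assumptions: the repeat is matched exactly, whereas real CRISPR repeats can carry mismatches, and the repeat and read used here are made-up toy strings, not E. coli sequences. The function name `extract_spacers` is hypothetical.

```python
def extract_spacers(read, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    spacers = []
    pos = read.find(repeat)
    while pos != -1:
        nxt = read.find(repeat, pos + len(repeat))
        if nxt == -1:
            break  # no further repeat, so nothing closes the next spacer
        spacers.append(read[pos + len(repeat):nxt])
        pos = nxt
    return spacers


# Toy example with a made-up 8-nt repeat and two made-up spacers.
repeat = "GGTTTATC"
read = "AA" + repeat + "ACGTACGT" + repeat + "TTGGCCAA" + repeat + "CC"
print(extract_spacers(read, repeat))
```

Each extracted spacer could then be submitted to a BLASTN search, as the abstract describes.
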
Sequence investigation of 34 forensic autosomal STRs with massively parallel sequencing.

PubMed

Zhang, Suhua; Niu, Yong; Bian, Yingnan; Dong, Rixia; Liu, Xiling; Bao, Yun; Jin, Chao; Zheng, Hancheng; Li, Chengtao

2018-05-01

STRs vary not only in the length of the repeat units and the number of repeats but also in the region with which they conform to an incremental repeat pattern. Massively parallel sequencing (MPS) offers new possibilities in the analysis of STRs since it can simultaneously sequence multiple targets in a single reaction and capture potential internal sequence variations. Here, we sequenced 34 STRs applied in the forensic community of China with a custom-designed panel. MPS performance was evaluated by sequencing-read analysis, a concordance study and sensitivity testing. High-coverage sequencing data were obtained to determine the constituent ratios and heterozygote balance. No actual inconsistent genotypes were observed between capillary electrophoresis (CE) and MPS, demonstrating the reliability of the panel and the MPS technology. With the sequencing data from the 200 investigated individuals, 346 and 418 alleles were obtained via CE and MPS technologies, respectively, at the 34 STRs, indicating that MPS technology provides higher discrimination than CE detection. The whole study demonstrated that STR genotyping with the custom panel and MPS technology has the potential not only to reveal length and sequence variations but also to satisfy the demands of high throughput and high multiplexing with acceptable sensitivity.
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence

PubMed Central

2017-01-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Repeating aftershocks of the great 2004 Sumatra and 2005 Nias earthquakes

NASA Astrophysics Data System (ADS)

Yu, Wen-che; Song, Teh-Ru Alex; Silver, Paul G.

2013-05-01

We investigate repeating aftershocks associated with the great 2004 Sumatra-Andaman (Mw 9.2) and 2005 Nias-Simeulue (Mw 8.6) earthquakes by cross-correlating waveforms recorded by the regional seismographic station PSI and teleseismic stations. We identify 10 and 18 correlated aftershock sequences associated with the great 2004 Sumatra and 2005 Nias earthquakes, respectively. The majority of the correlated aftershock sequences are located near the down-dip end of a large afterslip patch. We determine the precise relative locations of event pairs among these sequences and estimate the source rupture areas. The correlated event pairs identified are appropriately referred to as repeating aftershocks, in that the source rupture areas are comparable and significantly overlap within a sequence. We use the repeating aftershocks to estimate afterslip based on the slip-seismic moment scaling relationship and to infer the temporal decay rate of the recurrence interval. The estimated afterslip resembles that measured from the near-field geodetic data to the first order. The decay rate of repeating aftershocks as a function of lapse time t follows a power-law decay 1/t^p with the exponent p in the range 0.8-1.1. Both types of observations indicate that repeating aftershocks are governed by post-seismic afterslip.
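
A power-law decay of the form rate ∝ 1/t^p is a straight line of slope −p on log-log axes, so the exponent can be estimated with an ordinary least-squares fit. The sketch below assumes this simple fitting approach and uses synthetic data; the abstract does not say how the authors estimated p, and `fit_power_law_exponent` is an illustrative name.

```python
import math


def fit_power_law_exponent(times, rates):
    """Estimate p in rate ~ 1/t**p from the least-squares slope of log(rate) vs log(t)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope  # the decay exponent is the negated log-log slope


# Synthetic, noise-free example with a known exponent of 0.9.
times = [1.0, 2.0, 4.0, 8.0, 16.0]
rates = [5.0 / t ** 0.9 for t in times]
print(fit_power_law_exponent(times, rates))
```
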
Genome-Wide Stochastic Adaptive DNA Amplification at Direct and Inverted DNA Repeats in the Parasite Leishmania

PubMed Central

Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

2014-01-01

Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment. PMID:24844805
Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population.

PubMed

Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao

2018-05-01

Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We have analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to either the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we have also selected 218 suitable STR loci with four basepair repeat units and 53 loci with five basepair repeat units, for both short-read sequencing and PCR-based technologies, which would be candidates for practical forensic DNA typing in the Japanese population.
Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh

PubMed Central

Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas

2009-01-01

Background and Aims Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes. Large intercalary telomere-like arrays suggest chromosomal rearrangements in D. aphyllum while the number and localization of rRNA genes as well as the species-specific distribution pattern of an abundant microsatellite reflect the genomic diversity of the three Dendrobium species. PMID:19635741
Identification of Plant-derived Alkaloids with Therapeutic Potential for Myotonic Dystrophy Type I

PubMed Central

Herrendorff, Ruben; Faleschini, Maria Teresa; Stiefvater, Adeline; Erne, Beat; Wiktorowicz, Tatiana; Kern, Frances; Hamburger, Matthias; Potterat, Olivier; Kinter, Jochen; Sinnreich, Michael

2016-01-01

Myotonic dystrophy type I (DM1) is a disabling neuromuscular disease with no causal treatment available. This disease is caused by expanded CTG trinucleotide repeats in the 3′ UTR of the dystrophia myotonica protein kinase gene. On the RNA level, expanded (CUG)n repeats form hairpin structures that sequester splicing factors such as muscleblind-like 1 (MBNL1). Lack of available MBNL1 leads to misregulated alternative splicing of many target pre-mRNAs, leading to the multisystemic symptoms in DM1. Many studies aiming to identify small molecules that target the (CUG)n-MBNL1 complex focused on synthetic molecules. In an effort to identify new small molecules that liberate sequestered MBNL1 from (CUG)n RNA, we focused specifically on small molecules of natural origin. Natural products remain an important source for drugs and play a significant role in providing novel leads and pharmacophores for medicinal chemistry. In a new DM1 mechanism-based biochemical assay, we screened a collection of isolated natural compounds and a library of over 2100 extracts from plants and fungal strains. HPLC-based activity profiling in combination with spectroscopic methods were used to identify the active principles in the extracts. The bioactivity of the identified compounds was investigated in a human cell model and in a mouse model of DM1. We identified several alkaloids, including the β-carboline harmine and the isoquinoline berberine, that ameliorated certain aspects of the DM1 pathology in these models. Alkaloids as a compound class may have potential for drug discovery in other RNA-mediated diseases. PMID:27298317
Genome-Wide Computational Analysis of Musa Microsatellites: Classification, Cross-Taxon Transferability, Functional Annotation, Association with Transposons & miRNAs, and Genetic Marker Potential

PubMed Central

Biswas, Manosh Kumar; Liu, Yuxuan; Li, Chunyu; Sheng, Ou; Mayer, Christoph; Yi, Ganjun

2015-01-01

The development of organized, informative, robust, user-friendly, and freely accessible molecular markers is imperative to the Musa marker assisted breeding program. Although several hundred SSR markers have already been developed, the number of informative, robust, and freely accessible Musa markers remains inadequate for some breeding applications. In view of this issue, we surveyed SSRs in four different data sets, developed large-scale non-redundant highly informative therapeutic SSR markers, and classified them according to their attributes, as well as analyzed their cross-taxon transferability and utility for the genetic study of Musa and its relatives. A high SSR frequency (177 per Mbp) was found in the Musa genome. AT-rich dinucleotide repeats are predominant, and trinucleotide repeats are the most abundant in transcribed regions. A significant number of Musa SSRs are associated with pre-miRNAs, and 83% of these SSRs are promising candidates for the development of therapeutic SSR markers. Overall, 74% of the SSR markers were polymorphic, and 94% were transferable to at least one Musa spp. Two hundred forty-three markers generated a total of 1047 alleles, with 2-8 alleles each and an average of 4.38 alleles per locus. The PIC values ranged from 0.31 to 0.89 and averaged 0.71. We report the largest set of non-redundant, polymorphic, new SSR markers to be developed in Musa. These additional markers could be a valuable resource for marker-assisted breeding, genetic diversity and genomic studies of Musa and related species. PMID:26121637
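
The abstract reports PIC (polymorphism information content) values without stating the formula. A minimal sketch, assuming the commonly used definition PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j² (Botstein et al., 1980), where p_i is the frequency of allele i at a locus, is given below. The function name `pic` and the allele counts are illustrative, not data from the study.

```python
def pic(allele_counts):
    """Polymorphism information content for one locus from allele counts.

    Assumes the Botstein et al. (1980) definition:
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    total = sum(allele_counts)
    p = [count / total for count in allele_counts]
    homozygosity = sum(x * x for x in p)
    cross_term = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross_term


# Illustrative counts: a locus with four equally frequent alleles.
print(pic([10, 10, 10, 10]))
```

Under this definition a locus with four equally frequent alleles scores about 0.70, close to the reported average of 0.71 for markers averaging 4.38 alleles per locus.
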
Establishment and Maintenance of Primary Fibroblast Repositories for Rare Diseases-Friedreich's Ataxia Example.

PubMed

Li, Yanjie; Polak, Urszula; Clark, Amanda D; Bhalla, Angela D; Chen, Yu-Yun; Li, Jixue; Farmer, Jennifer; Seyer, Lauren; Lynch, David; Butler, Jill S; Napierala, Marek

2016-08-01

Friedreich's ataxia (FRDA) represents a rare neurodegenerative disease caused by expansion of GAA trinucleotide repeats in the first intron of the FXN gene. The number of GAA repeats in FRDA patients varies from approximately 60 to <1000 and is tightly correlated with age of onset and severity of the disease symptoms. The heterogeneity of Friedreich's ataxia stresses the need for a large cohort of patient samples to conduct studies addressing the mechanism of disease pathogenesis or evaluate novel therapeutic candidates. Herein, we report the establishment and characterization of an FRDA fibroblast repository, which currently includes 50 primary cell lines derived from FRDA patients and seven lines from mutation carriers. These cells are also a source for generating induced pluripotent stem cell (iPSC) lines by reprogramming, as well as disease-relevant neuronal, cardiac, and pancreatic cells that can then be differentiated from the iPSCs. All FRDA and carrier lines are derived using a standard operating procedure and characterized to confirm mutation status, as well as expression of FXN mRNA and protein. Consideration and significance of creating disease-focused cell line and tissue repositories, especially in the context of rare and heterogeneous disorders, are presented. Although the economic aspect of creating and maintaining such repositories is important, the benefits of easy access to a collection of well-characterized cell lines for the purpose of drug discovery or disease mechanism studies overshadow the associated costs. Importantly, all FRDA fibroblast cell lines collected in our repository are available to the scientific community.
Establishment and Maintenance of Primary Fibroblast Repositories for Rare Diseases—Friedreich's Ataxia Example

PubMed Central

Li, Yanjie; Polak, Urszula; Clark, Amanda D.; Bhalla, Angela D.; Chen, Yu-Yun; Li, Jixue; Farmer, Jennifer; Seyer, Lauren; Lynch, David

2016-01-01

Friedreich's ataxia (FRDA) represents a rare neurodegenerative disease caused by expansion of GAA trinucleotide repeats in the first intron of the FXN gene. The number of GAA repeats in FRDA patients varies from approximately 60 to <1000 and is tightly correlated with age of onset and severity of the disease symptoms. The heterogeneity of Friedreich's ataxia stresses the need for a large cohort of patient samples to conduct studies addressing the mechanism of disease pathogenesis or evaluate novel therapeutic candidates. Herein, we report the establishment and characterization of an FRDA fibroblast repository, which currently includes 50 primary cell lines derived from FRDA patients and seven lines from mutation carriers. These cells are also a source for generating induced pluripotent stem cell (iPSC) lines by reprogramming, as well as disease-relevant neuronal, cardiac, and pancreatic cells that can then be differentiated from the iPSCs. All FRDA and carrier lines are derived using a standard operating procedure and characterized to confirm mutation status, as well as expression of FXN mRNA and protein. Consideration and significance of creating disease-focused cell line and tissue repositories, especially in the context of rare and heterogeneous disorders, are presented. Although the economic aspect of creating and maintaining such repositories is important, the benefits of easy access to a collection of well-characterized cell lines for the purpose of drug discovery or disease mechanism studies overshadow the associated costs. Importantly, all FRDA fibroblast cell lines collected in our repository are available to the scientific community. PMID:27002638
Huntington disease in the South African population occurs on diverse and ethnically distinct genetic haplotypes

PubMed Central

Baine, Fiona K; Kay, Chris; Ketelaar, Maria E; Collins, Jennifer A; Semaka, Alicia; Doty, Crystal N; Krause, Amanda; Jacquie Greenberg, L; Hayden, Michael R

2013-01-01

Huntington disease (HD) is a neurodegenerative disorder resulting from the expansion of a CAG trinucleotide repeat in the huntingtin (HTT) gene. Worldwide prevalence varies geographically with the highest figures reported in populations of European ancestry. HD in South Africa has been reported in Caucasian, black and mixed subpopulations, with similar estimated prevalence in the Caucasian and mixed groups and a lower estimate in the black subpopulation. Recent studies have associated specific HTT haplotypes with HD in distinct populations. Expanded HD alleles in Europe occur predominantly on haplogroup A (specifically high-risk variants A1/A2), whereas in East Asian populations, HD alleles are associated with haplogroup C. Whether specific HTT haplotypes associate with HD in black Africans and how these compare with haplotypes found in European and East Asian populations remains unknown. The current study genotyped the HTT region in unaffected individuals and HD patients from each of the South African subpopulations, and haplotypes were constructed. CAG repeat sizes were determined and phased to haplotype. Results indicate that HD alleles from Caucasian and mixed patients are predominantly associated with haplogroup A, signifying a similar European origin for HD. However, in black patients, HD occurs predominantly on haplogroup B, suggesting several distinct origins of the mutation in South Africa. The absence of high-risk variants (A1/A2) in the black subpopulation may also explain the reported low prevalence of HD. Identification of haplotypes associated with HD-expanded alleles is particularly relevant to the development of population-specific therapeutic targets for selective suppression of the expanded HTT transcript. PMID:23463025
Fuchs' Endothelial Corneal Dystrophy in Patients With Myotonic Dystrophy, Type 1

PubMed Central

Winkler, Nelson S.; Milone, Margherita; Martinez-Thompson, Jennifer M.; Raja, Harish; Aleff, Ross A.; Patel, Sanjay V.; Fautsch, Michael P.; Wieben, Eric D.

2018-01-01

Purpose RNA toxicity from CTG trinucleotide repeat (TNR) expansion within noncoding DNA of the transcription factor 4 (TCF4) and DM1 protein kinase (DMPK) genes has been described in Fuchs' endothelial corneal dystrophy (FECD) and myotonic dystrophy, type 1 (DM1), respectively. We prospectively evaluated DM1 patients and their families for phenotypic FECD and report the analysis of CTG expansion in the TCF4 gene and DMPK expression in corneal endothelium. Methods FECD grade was evaluated by slit lamp biomicroscopy in 26 participants from 14 families with DM1. CTG TNR length in TCF4 and DMPK was determined by a combination of Gene Scan and Southern blotting of peripheral blood leukocyte DNA. Results FECD grade was 2 or higher in 5 (36%) of 14 probands, significantly greater than the general population (5%) (P < 0.001). FECD segregated with DM1; six of eight members of the largest family had both FECD and DM1, while the other two family members had neither disease. All DNA samples from 24 subjects, including four FECD-affected probands, were bi-allelic for nonexpanded TNR length in TCF4 (<40 repeats). Considering a 75% prevalence of TCF4 TNR expansion in FECD, the probability of four FECD probands lacking TNR expansion was 0.4%. Neither severity of DM1 nor DMPK TNR length predicted the presence of FECD in DM1 patients. Conclusions FECD was common in DM1 families, and the diseases cosegregated. TCF4 TNR expansion was lacking in DM1 families. These findings support a hypothesis that DMPK TNR expansion contributes to clinical FECD.
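
The 0.4% figure quoted in the Results can be reproduced with a short calculation, assuming each of the four FECD probands independently has a 75% chance of carrying a TCF4 expansion (the independence assumption is ours; the abstract does not spell out the calculation). The function name is illustrative.

```python
def prob_all_lacking(prevalence, n_probands):
    """Probability that none of n independent probands carries the expansion."""
    return (1 - prevalence) ** n_probands


p = prob_all_lacking(0.75, 4)   # 0.25 ** 4
print(f"{p:.6f} = {100 * p:.1f}%")
```
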
Abnormal trajectories in cerebellum and brainstem volumes in carriers of the fragile X premutation.

PubMed

Wang, Jun Yi; Hessl, David; Hagerman, Randi J; Simon, Tony J; Tassone, Flora; Ferrer, Emilio; Rivera, Susan M

2017-07-01

Fragile X-associated tremor/ataxia syndrome (FXTAS) is a late-onset neurodegenerative disorder typically affecting male premutation carriers with 55-200 CGG trinucleotide repeat expansions in the FMR1 gene after age 50. The aim of this study was to examine whether cerebellar and brainstem changes emerge during development or aging in late life. We retrospectively analyzed magnetic resonance imaging scans from 322 males (age 8-81 years). Volume changes in the cerebellum and brainstem were contrasted with those in the ventricles and whole brain. Compared to the controls, premutation carriers without FXTAS showed significantly accelerated volume decrease in the cerebellum and whole brain, flatter inverted U-shaped trajectory of the brainstem, and larger ventricles. Compared to both older controls and premutation carriers without FXTAS, carriers with FXTAS exhibited significant volume decrease in the cerebellum and whole brain and accelerated volume decrease in the brainstem. We therefore conclude that cerebellar and brainstem volumes were likely affected during both development and progression of neurodegeneration in premutation carriers, suggesting that interventions may need to start early in adulthood to be most effective. Copyright © 2017 Elsevier Inc. All rights reserved.
Fear-Specific Amygdala Function in Children and Adolescents on the Fragile X Spectrum: A Dosage Response of the FMR1 Gene

PubMed Central

Kim, So-Yeon; Burris, Jessica; Bassal, Frederick; Koldewyn, Kami; Chattarji, Sumantra; Tassone, Flora; Hessl, David; Rivera, Susan M.

2014-01-01

Mutations of the fragile X mental retardation 1 (FMR1) gene are the genetic cause of fragile X syndrome (FXS). The presence of significant socioemotional problems has been well documented in FXS although the brain basis of those deficits remains unspecified. Here, we investigated amygdala dysfunction and its relation to socioemotional deficits and FMR1 gene expression in children and adolescents on the FX spectrum (i.e., individuals whose trinucleotide CGG repeat expansion from 55 to over 200 places them somewhere within the fragile X diagnostic range from premutation to full mutation). Participants performed an fMRI task in which they viewed fearful, happy, and scrambled faces. Neuroimaging results demonstrated that FX participants revealed significantly attenuated amygdala activation in Fearful > Scrambled and Fearful > Happy contrasts compared with their neurotypical counterparts, while showing no differences in amygdala volume. Furthermore, we found significant relationships between FMR1 gene expression, anxiety/social dysfunction scores, and reduced amygdala activation in the FX group. In conclusion, we report novel evidence regarding a dosage response of the FMR1 gene on fear-specific functions of the amygdala, which is associated with socioemotional deficits in FXS. PMID:23146966
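The repeat-length ranges mentioned in the two FMR1 abstracts above (55 to 200 CGG repeats for premutation, over 200 for full mutation) can be written as a small lookup. This is only an illustrative sketch: the function name and the label for counts below 55 are hypothetical, and finer categories used in clinical practice are not modeled.

```python
def classify_fmr1_cgg(repeat_count):
    """Map a CGG repeat count to the ranges named in the abstracts above.

    Boundaries (55-200 premutation, >200 full mutation) come from the text;
    the label for shorter alleles is a placeholder of ours.
    """
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count > 200:
        return "full mutation"
    if repeat_count >= 55:
        return "premutation"
    return "below premutation range"


for n in (30, 55, 200, 201):
    print(n, classify_fmr1_cgg(n))
```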

Huntington's Disease in a Patient Misdiagnosed as Conversion Disorder.

PubMed

Nogueira, João Machado; Franco, Ana Margarida; Mendes, Susana; Valadas, Anabela; Semedo, Cristina; Jesus, Gustavo

2018-01-01

Huntington's disease (HD) is an inherited, progressive, and neurodegenerative neuropsychiatric disorder caused by the expansion of a cytosine-adenine-guanine (CAG) trinucleotide repeat in the Interesting Transcript 15 (IT15) gene on chromosome 4. This pathology typically presents in individuals aged between 30 and 50 years and the age of onset is inversely correlated with the length of the CAG repeat expansion. It is characterized by chorea, cognitive deficits, and psychiatric symptoms. Usually the psychiatric disorders precede motor and cognitive impairment, Major Depressive Disorder and anxiety disorders being the most common presentations. We present a clinical case of a 65-year-old woman admitted to our Psychiatric Acute Unit. During the 6 years preceding the admission, the patient had clinical assessments made several times by different specialties that focused only on isolated symptoms, disregarding the syndrome as a whole. In the course of her last admission, the patient was referred to our Neuropsychiatric Team, which made the provisional diagnosis of late-onset Huntington's disease, later confirmed by genetic testing. This clinical vignette highlights the importance of a multidisciplinary approach to atypical clinical presentations and raises awareness of the relevance of carefully investigating motor symptoms in psychiatric patients.
Quantification Assays for Total and Polyglutamine-Expanded Huntingtin Proteins

PubMed Central

Boogaard, Ivette; Smith, Melanie; Pulli, Kristiina; Szynol, Agnieszka; Albertus, Faywell; Lamers, Marieke B. A. C.; Dijkstra, Sipke; Kordt, Daniel; Reindl, Wolfgang; Herrmann, Frank; McAllister, George; Fischer, David F.; Munoz-Sanjuan, Ignacio

2014-01-01

The expansion of a CAG trinucleotide repeat in the huntingtin gene, which produces huntingtin protein with an expanded polyglutamine tract, is the cause of Huntington's disease (HD). Recent studies have reported that RNAi suppression of polyglutamine-expanded huntingtin (mutant HTT) in HD animal models can ameliorate disease phenotypes. A key requirement for such preclinical studies, as well as eventual clinical trials, aimed to reduce mutant HTT exposure is a robust method to measure HTT protein levels in select tissues. We have developed several sensitive and selective assays that measure either total human HTT or polyglutamine-expanded human HTT proteins on the electrochemiluminescence Meso Scale Discovery detection platform with an increased dynamic range over other methods. In addition, we have developed an assay to detect endogenous mouse and rat HTT proteins in pre-clinical models of HD to monitor effects on the wild type protein of both allele selective and non-selective interventions. We demonstrate the application of these assays to measure HTT protein in several HD in vitro cellular and in vivo animal model systems as well as in HD patient biosamples. Furthermore, we used purified recombinant HTT proteins as standards to quantitate the absolute amount of HTT protein in such biosamples. PMID:24816435
Total body irradiation in a patient with fragile X syndrome for acute lymphoblastic leukemia in preparation for stem cell transplantation: A case report and literature review.

PubMed

Collins, D T; Mannina, E M; Mendonca, M

2015-10-01

Fragile X syndrome (FXS) is a congenital disorder caused by expansion of CGG trinucleotide repeat at the 5' end of the fragile X mental retardation gene 1 (FMR1) on the X chromosome that leads to chromosomal instability and diminished serum levels of fragile X mental retardation protein (FMRP). Afflicted individuals often have elongated features, marfanoid habitus, macroorchidism and intellectual impairment. Evolving literature suggests the condition may actually protect from malignancy while chromosomal instability would presumably elevate the risk. Increased sensitivity to ionizing radiation should also be predicted by unstable sites within the DNA. Interestingly, in this report, we detail a patient with FXS diagnosed with acute lymphoblastic leukemia treated with induction followed by subsequent cycles of hyper-CVAD (cyclophosphamide, vincristine, doxorubicin, dexamethasone) with a complete response who then was recommended to undergo peripheral stem cell transplantation. The patient underwent total body irradiation (TBI) as a component of his conditioning regimen and despite the concern of his clinicians, developed minimal acute toxicity and successful engraftment. The pertinent literature regarding irradiation of patients with FXS is also reviewed. © 2015 Wiley Periodicals, Inc.
Behavioral and genetic correlates of the neural response to infant crying among human fathers.

PubMed

Mascaro, Jennifer S; Hackett, Patrick D; Gouzoules, Harold; Lori, Adriana; Rilling, James K

2014-11-01

Although evolution has shaped human infant crying and the corresponding response from caregivers, there is marked variation in paternal involvement and caretaking behavior, highlighting the importance of understanding the neurobiology supporting optimal paternal responses to cries. We explored the neural response to infant cries in fathers of children aged 1-2, and its relationship with hormone levels, variation in the androgen receptor (AR) gene, parental attitudes and parental behavior. Although number of AR CAG trinucleotide repeats was positively correlated with neural activity in brain regions important for empathy (anterior insula and inferior frontal gyrus), restrictive attitudes were inversely correlated with neural activity in these regions and with regions involved with emotion regulation (orbitofrontal cortex). Anterior insula activity had a non-linear relationship with paternal caregiving, such that fathers with intermediate activation were most involved. These results suggest that restrictive attitudes may be associated with decreased empathy and emotion regulation in response to a child in distress, and that moderate anterior insula activity reflects an optimal level of arousal that supports engaged fathering. © The Author (2013). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Event-related potential alterations in fragile X syndrome

PubMed Central

Knoth, Inga S.; Lippé, Sarah

2012-01-01

Fragile X Syndrome (FXS) is the most common form of X-linked intellectual disability (ID), associated with a wide range of cognitive and behavioral impairments. FXS is caused by a trinucleotide repeat expansion in the FMR1 gene located on the X-chromosome. The expansion is expected to prevent the expression of the fragile X mental retardation protein (FMRP), which results in altered structural and functional development of the synapse, including a loss of synaptic plasticity. This review aims to unveil the contribution of electrophysiological signal studies to the understanding of the information processing impairments in FXS patients. We discuss relevant event-related potential (ERP) studies conducted with full mutation FXS patients and clinical populations sharing symptoms with FXS in a developmental perspective. Specific deviances found in FXS ERP profiles are described. Alterations are reported in N1, P2, Mismatch Negativity (MMN), N2, and P3 components in FXS compared to healthy controls. In particular, deviances in N1 and P2 amplitude seem to be specific to FXS. The presented results suggest a cascade of impaired information processes that are in line with symptoms and anatomical findings in FXS. PMID:23015788
Chapter 24: the coming of molecular biology and its impact on clinical neurology.

PubMed

Smith, Christopher U M

2010-01-01

Although the chemical study of the nervous system dates back well into the 19th century, molecular biology and especially molecular neurobiology only began to be established in the second half of the 20th century. This chapter reviews their impact on clinical neuroscience during the 50 years since Watson and Crick published their seminal paper. After a short review of the part played by F.O. Schmitt in establishing molecular neuroscience the chapter outlines work that led to a detailed understanding of the biochemical structure and function of nerve cell membranes and their embedded channel proteins, receptors, and other molecules. The chapter then turns to the numerous pathologies that result from disorders of these elements: the various channel and gap-junction pathologies. The chapter continues with a discussion of some of the diseases caused by defective DNA, especially the trinucleotide repeat expansion diseases (TREDs) and ends with a short account of the development of molecular approaches to prion diseases, myasthenia gravis, and the neurodegenerative diseases of old age. Francis Bacon said long ago that "knowledge is power." The hope is that increasing molecular knowledge will help cure some of the human suffering seen in the neurological ward and clinic.
Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae.

PubMed

Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R

2006-12-01

Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.
Correlation between fibroin amino acid sequence and physical silk properties.

PubMed

Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

2003-09-12

The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.
Evidence for Long-Timescale Patterns of Synaptic Inputs in CA1 of Awake Behaving Mice.

PubMed

Kolb, Ilya; Talei Franzesi, Giovanni; Wang, Michael; Kodandaramaiah, Suhasa B; Forest, Craig R; Boyden, Edward S; Singer, Annabelle C

2018-02-14

Repeated sequences of neural activity are a pervasive feature of neural networks in vivo and in vitro. In the hippocampus, sequential firing of many neurons over periods of 100-300 ms reoccurs during behavior and during periods of quiescence. However, it is not known whether the hippocampus produces longer sequences of activity or whether such sequences are restricted to specific network states. Furthermore, whether long repeated patterns of activity are transmitted to single cells downstream is unclear. To answer these questions, we recorded intracellularly from hippocampal CA1 of awake, behaving male mice to examine both subthreshold activity and spiking output in single neurons. In eight of nine recordings, we discovered long (900 ms) reoccurring subthreshold fluctuations or "repeats." Repeats generally were high-amplitude, nonoscillatory events reoccurring with 10 ms precision. Using statistical controls, we determined that repeats occurred more often than would be expected from unstructured network activity (e.g., by chance). Most spikes occurred during a repeat, and when a repeat contained a spike, the spike reoccurred with precision on the order of ≤20 ms, showing that long repeated patterns of subthreshold activity are strongly connected to spike output. Unexpectedly, we found that repeats occurred independently of classic hippocampal network states like theta oscillations or sharp-wave ripples. Together, these results reveal surprisingly long patterns of repeated activity in the hippocampal network that occur nonstochastically, are transmitted to single downstream neurons, and strongly shape their output. This suggests that the timescale of information transmission in the hippocampal network is much longer than previously thought. SIGNIFICANCE STATEMENT: We found long (≥900 ms), repeated, subthreshold patterns of activity in CA1 of awake, behaving mice. These repeated patterns ("repeats") occurred more often than expected by chance and with 10 ms precision. Most spikes occurred within repeats and reoccurred with a precision on the order of 20 ms. Surprisingly, there was no correlation between repeat occurrence and classical network states such as theta oscillations and sharp-wave ripples. These results provide strong evidence that long patterns of activity are repeated and transmitted to downstream neurons, suggesting that the hippocampus can generate longer sequences of repeated activity than previously thought. Copyright © 2018 the authors.
CRF: detection of CRISPR arrays using random forest.

PubMed

Wang, Kai; Liang, Chun

2017-01-01

CRISPRs (clustered regularly interspaced short palindromic repeats) are particular repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we developed a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, which enhances detection accuracy. In particular, triplet elements that combine both sequence content and structure information were extracted from CRISPR repeats for classifier training. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for robust data visualization that is not available among other CRISPR detection tools. After detection, the query sequence, CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be visualized for examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
Expanded complexity of unstable repeat diseases

PubMed Central

Polak, Urszula; McIvor, Elizabeth; Dent, Sharon Y.R.; Wells, Robert D.; Napierala, Marek

2015-01-01

Unstable Repeat Diseases (URDs) share a common mutational phenomenon of changes in the copy number of short, tandemly repeated DNA sequences. More than 20 human neurological diseases are caused by instability, predominantly expansion, of microsatellite sequences. Changes in the repeat size initiate a cascade of pathological processes, frequently characteristic of a unique disease or a small subgroup of the URDs. Understanding of both the mechanism of repeat instability and molecular consequences of the repeat expansions is critical to developing successful therapies for these diseases. Recent technological breakthroughs in whole genome, transcriptome and proteome analyses will almost certainly lead to new discoveries regarding the mechanisms of repeat instability, the pathogenesis of URDs, and will facilitate development of novel therapeutic approaches. The aim of this review is to give a general overview of unstable repeats diseases, highlight the complexities of these diseases, and feature the emerging discoveries in the field. PMID:23233240
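The "copy number of short, tandemly repeated DNA sequences" described above can be measured directly on a sequence string. The following is a minimal sketch using a regular expression; the function name, the `min_copies` parameter and the toy sequence are illustrative assumptions and do not come from the review.

```python
import re


def tandem_repeat_runs(sequence, unit, min_copies=2):
    """Return (start, copies) for each uninterrupted run of `unit`.

    Only perfect, uninterrupted repeats are detected; interrupted or
    imperfect tracts would need a more tolerant method.
    """
    unit = unit.upper()
    # e.g. unit="CAG", min_copies=2 gives the pattern (?:CAG){2,}
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [
        (m.start(), len(m.group()) // len(unit))
        for m in pattern.finditer(sequence.upper())
    ]


toy = "TTCAGCAGCAGCAGTTCAGCAGA"  # made-up sequence for illustration
print(tandem_repeat_runs(toy, "CAG"))  # prints [(2, 4), (16, 2)]
```

Comparing the copy number of the same tract between two samples is then a matter of comparing the second element of each tuple.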
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Characterization of species-specific repeated DNA sequences from B. nigra.

PubMed

Gupta, V; Lakshmisita, G; Shaila, M S; Jagannathan, V; Lakshmikumaran, M S

1992-07-01

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.
[Convergent origin of repeats in genes coding for globular proteins. An analysis of the factors determining the presence of inverted and symmetrical repeats].

PubMed

Solov'ev, V V; Kel', A E; Kolchanov, N A

1989-01-01

The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. The analysis of symmetrical repeats revealed an interesting property of the genetic code: pairs of symmetrical codons correspond to pairs of amino acids with mostly similar physicochemical parameters. This property may explain why symmetrical repeats and palindromes are present only in genes coding for beta-structural proteins and polypeptides, in which amino acids with similar physicochemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences has been used to analyse inverted repeats. The modelling demonstrated that a constraint on the sequences (uneven codon usage frequencies) is by itself enough for nonrandom inverted repeats to arise in genes.
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

PubMed Central

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-01-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163
ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

PubMed Central

Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

2004-01-01

We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
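The general idea of a hash-based repeat search can be sketched as follows. This is not the ACMES algorithm, whose hash function is not described in the abstract; it is a generic k-mer index built on Python's dictionary, with hypothetical function names and a made-up example sequence.

```python
from collections import defaultdict


def index_kmers(sequence, k):
    """Map every length-k substring to the list of positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        positions[sequence[i:i + k]].append(i)
    return positions


def repeated_kmers(sequence, k):
    """Keep only the k-mers that occur at more than one position."""
    return {
        kmer: pos
        for kmer, pos in index_kmers(sequence, k).items()
        if len(pos) > 1
    }


print(repeated_kmers("ACGTACGTTT", 4))  # prints {'ACGT': [0, 4]}
```

Because each k-mer lookup is a hash-table access, a single pass over the sequence suffices to collect all repeated substrings of a given length; applying the same index to several genomes and intersecting the keys would give a rough cross-species comparison.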
Survey of microsatellite DNA in pine

Treesearch

Craig S. Echt; P. May-Marquardt

1997-01-01

A large insert genomic library from eastern white pine (Pinus strobus) was probed for the microsatellite motifs (AC)n and (AG)n, all 10 trinucleotide motifs, and 22 of the 33 possible tetranucleotide motifs. For comparison with a species from a different subgenus, a loblolly pine (Pinus taeda) genomic...
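The counts of "10 trinucleotide motifs" and "33 possible tetranucleotide motifs" follow from treating motifs as equivalent when they differ only by a cyclic shift or by reverse complementation, and from excluding motifs that are repeats of a shorter unit. The sketch below enumerates these classes; the function names are ours and the approach is a standard enumeration, not a method taken from the paper.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    return motif.translate(COMPLEMENT)[::-1]


def is_primitive(motif):
    """False if the motif is a repetition of a shorter unit (e.g. ACAC, AAA)."""
    k = len(motif)
    return not any(
        k % p == 0 and motif == motif[:p] * (k // p) for p in range(1, k)
    )


def canonical(motif):
    """Smallest string among all rotations of the motif and its reverse complement."""
    both = (motif, reverse_complement(motif))
    return min(s[i:] + s[:i] for s in both for i in range(len(s)))


def motif_classes(k):
    """Sorted canonical representatives of all distinct repeat motifs of length k."""
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({canonical(m) for m in kmers if is_primitive(m)})


print(len(motif_classes(3)), len(motif_classes(4)))  # prints "10 33"
```

The same function gives four classes for dinucleotides (AC, AG, AT, CG), which include the (AC)n and (AG)n motifs probed in the study.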
Genetic characterization of UCS region of Pneumocystis jirovecii and construction of allelic profiles of Indian isolates based on sequence typing at three regions.

PubMed

Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Kumar, Lalit; Luthra, Kalpana; Agarwal, Sanjay Kumar; Sreenivas, Vishnubhatla

2013-01-01

Pneumocystis jirovecii is an opportunistic pathogen that causes severe pneumonia in immunocompromised patients. To study the genetic diversity of P. jirovecii in India the upstream conserved sequence (UCS) region of Pneumocystis genome was amplified, sequenced and genotyped from a set of respiratory specimens obtained from 50 patients with a positive result for nested mitochondrial large subunit ribosomal RNA (mtLSU rRNA) PCR during the years 2005-2008. Of these 50 cases, 45 showed a positive PCR for UCS region. Variations in the tandem repeats in UCS region were characterized by sequencing all the positive cases. Of the 45 cases, one case showed five repeats, 11 cases showed four repeats, 29 cases showed three repeats and four cases showed two repeats. By running amplified DNA from all these cases on a high-resolution gel, mixed infection was observed in 12 cases (26.7%, 12/45). Forty three of 45 cases included in this study had previously been typed at mtLSU rRNA and internal transcribed spacer (ITS) region by our group. In the present study, the genotypes at those two regions were combined with UCS repeat patterns to construct allelic profiles of 43 cases. A total of 36 allelic profiles were observed in 43 isolates indicating high genetic variability. A statistically significant association was observed between mtLSU rRNA genotype 1, ITS type Ea and UCS repeat pattern 4. Copyright © 2012 Elsevier B.V. All rights reserved.
Evolution and selection of Rhg1, a copy-number variant nematode-resistance locus

PubMed Central

Lee, Tong Geon; Kumar, Indrajit; Diers, Brian W; Hudson, Matthew E

2015-01-01

The soybean cyst nematode (SCN) resistance locus Rhg1 is a tandem repeat of a 31.2 kb unit of the soybean genome. Each 31.2-kb unit contains four genes. One allele of Rhg1, Rhg1-b, is responsible for protecting most US soybean production from SCN. Whole-genome sequencing was performed, and PCR assays were developed to investigate allelic variation in sequence and copy number of the Rhg1 locus across a population of soybean germplasm accessions. Four distinct sequences of the 31.2-kb repeat unit were identified, and some Rhg1 alleles carry up to three different types of repeat unit. The total number of copies of the repeat varies from 1 to 10 per haploid genome. Both copy number and sequence of the repeat correlate with the resistance phenotype, and the Rhg1 locus shows strong signatures of selection. Significant linkage disequilibrium in the genome outside the boundaries of the repeat allowed the Rhg1 genotype to be inferred using high-density single nucleotide polymorphism genotyping of 15 996 accessions. Over 860 germplasm accessions were found likely to possess Rhg1 alleles. The regions surrounding the repeat show indications of non-neutral evolution and high genetic variability in populations from different geographic locations, but without evidence of fixation of the resistant genotype. A compelling explanation of these results is that balancing selection is in operation at Rhg1. PMID:25735447
