tandemly repeated sequences: Topics by Science.gov

Sample records for tandemly repeated sequences

TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

PubMed

Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur

2006-02-01

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
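As an illustration of the annotation output mentioned above, here is a minimal Python sketch that writes one tandem repeat as a GFF3 feature line. It is not TRAP's code (TRAP is a Perl program, and its internals are not described here). The `TandemRepeat` record, its field names, the `source` value and the attribute tags are all hypothetical; only the nine-column, tab-separated GFF3 layout is standard.

```python
from dataclasses import dataclass


@dataclass
class TandemRepeat:
    """Hypothetical record for one repeat; field names are illustrative only."""
    seqid: str       # name of the sequence the repeat lies on
    start: int       # 1-based, inclusive start coordinate
    end: int         # 1-based, inclusive end coordinate
    period: int      # length of the repeat unit in bp
    copies: float    # number of copies of the unit
    consensus: str   # consensus sequence of the unit


def to_gff3(rep: TandemRepeat, source: str = "example") -> str:
    """Format a repeat as one tab-separated GFF3 line with nine columns."""
    attributes = (
        f"period={rep.period};copies={rep.copies:g};consensus={rep.consensus}"
    )
    columns = [
        rep.seqid,
        source,
        "tandem_repeat",   # feature type
        str(rep.start),
        str(rep.end),
        ".",               # score: not provided
        ".",               # strand: not stranded
        ".",               # phase: only meaningful for CDS features
        attributes,
    ]
    return "\t".join(columns)


if __name__ == "__main__":
    print(to_gff3(TandemRepeat("chr1", 101, 142, 7, 6.0, "TTTAGGG")))
```

A real pipeline would first parse the repeat finder's output into such records; that parsing step is omitted because the file layout is not given in this record.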
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background: Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results: This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion: We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124
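The abstract above compares the discriminatory power of two typing methods but does not say how it was measured. One commonly used measure is Simpson's index of diversity in the form proposed by Hunter and Gaston, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of isolates and n_j the number of isolates assigned to type j. The Python sketch below assumes that measure; the function name and the toy labels are illustrative, not data from the study.

```python
from collections import Counter
from typing import Hashable, Sequence


def discriminatory_index(type_labels: Sequence[Hashable]) -> float:
    """Simpson's index of diversity for a list of per-isolate type labels.

    Returns the probability that two isolates drawn at random (without
    replacement) are assigned to different types.
    """
    n_isolates = len(type_labels)
    if n_isolates < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


if __name__ == "__main__":
    # Toy labels only: two types with two isolates each.
    print(discriminatory_index(["A", "A", "B", "B"]))
```

Computing the index once per typing method on the same isolates gives two numbers that can be compared directly.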
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

PubMed

Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

1997-12-01

Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.

PubMed

Brzuzan, P

2000-06-01

Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology with one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (respectively 64% and 60%). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammals. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
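Counting tandem copies of a short element such as TTTAGGG can be sketched in a few lines of Python. The function below finds the longest run of exact, back-to-back copies. Because the abstract describes the elements as highly mutated, an exact-match count like this would undercount real arrays, so it only illustrates the idea; the function name and the toy sequence are hypothetical.

```python
import re


def longest_tandem_run(seq: str, unit: str = "TTTAGGG") -> int:
    """Return the largest number of exact, contiguous copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    best = 0
    for match in pattern.finditer(seq.upper()):
        # Each match is a whole run of copies, so its length is a multiple of the unit length.
        best = max(best, len(match.group(0)) // len(unit))
    return best


if __name__ == "__main__":
    toy = "ACGA" + "TTTAGGG" * 3 + "CC" + "TTTAGGG" + "AC"
    print(longest_tandem_run(toy))
```

Tolerating mutated copies would require an approximate matcher, for example one based on edit distance, rather than a plain regular expression.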
Short Tandem Repeat DNA Internet Database

National Institute of Standards and Technology Data Gateway

SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access). The Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, where Huntington disease and other diseases have been mapped, we have isolated a cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
[Molecular cloning and characterization of a novel Clonorchis sinensis antigenic protein containing tandem repeat sequences].

PubMed

Liu, Qian; Xu, Xue-Nian; Zhou, Yan; Cheng, Na; Dong, Yu-Ting; Zheng, Hua-Jun; Zhu, Yong-Qiang; Zhu, Yong-Qiang

2013-08-01

The aim was to find and clone new antigen genes from the lambda-ZAP cDNA expression library of adult Clonorchis sinensis and to determine the immunological characteristics of the recombinant proteins. The cDNA expression library of adult C. sinensis was screened with pooled sera of clonorchiasis patients. The sequences of the positive phage clones were compared with the sequences in the EST database, and the full-length sequence of the gene (Cs22 gene) was obtained by RT-PCR. cDNA fragments containing two and three copies of the tandem repeat sequence were generated by jumping PCR. The sequence encoding the mature peptide or the tandem repeat sequence was cloned into the prokaryotic expression vector pET28a (+) and then transformed into E. coli Rosetta DE3 cells for expression. The recombinant proteins (rCs22-2r, rCs22-3r, rCs22M-2r, and rCs22M-3r) were purified by His-bind-resin (Ni-NTA) affinity chromatography. The immunogenicity of rCs22-2r and rCs22-3r was identified by ELISA. To evaluate the immunological diagnostic value of rCs22-2r and rCs22-3r, serum samples from 35 clonorchiasis patients, 31 healthy individuals, 15 schistosomiasis patients, 15 paragonimiasis westermani patients and 13 cysticercosis patients were examined by ELISA. To locate antigenic determinants, the pooled sera of clonorchiasis patients and healthy persons were analyzed for specific antibodies by ELISA with recombinant proteins rCs22M-2r and rCs22M-3r containing the tandem repeat sequences. The full-length sequence of the Cs22 antigen gene of C. sinensis was obtained. It contained 13 tandem repeats of the sequence EQQDGDEEGMGGDGGRGKEKGKVEGEDGAGEQKEQA. Bioinformatics analysis indicated that the protein (Cs22) belongs to the GPI-anchored protein family. The recombinant proteins rCs22-2r and rCs22-3r showed a certain level of immunogenicity. The positive rate by ELISA coated with the purified PrCs22-2r and PrCs22-3r was 45.7% (16/35) for sera of clonorchiasis patients in both cases, and 3.2% (1/31) for those of healthy
TRedD—A database for tandem repeats over the edit distance

PubMed Central

Sokol, Dina; Atagun, Firat

2010-01-01

A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of ‘evolutive tandem repeats’. In addition, we have developed a tool, called TandemGraph, to graphically depict the repeats occurring in a sequence. This tool can be coupled with any repeat finding software, and it should greatly facilitate analysis of results. Database URL: http://tandem.sci.brooklyn.cuny.edu/ PMID:20624712
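To make "approximate copies" under the edit distance concrete, here is a small Python sketch. It computes the Levenshtein distance between two strings and then checks whether each copy in a pre-split list lies within a chosen distance of the next copy. This is a simplified illustration under stated assumptions and not TRedD's algorithm: the abstract does not spell out its definition, and real software must also discover the copy boundaries, which are supplied by hand here. The function names and the threshold are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                        # delete char_a
                    current[j - 1] + 1,                     # insert char_b
                    previous[j - 1] + (char_a != char_b),   # match or substitute
                )
            )
        previous = current
    return previous[-1]


def is_approximate_tandem_repeat(copies: Sequence[str], max_dist: int) -> bool:
    """True if there are at least two copies and each is within max_dist of the next."""
    if len(copies) < 2:
        return False
    return all(
        edit_distance(left, right) <= max_dist
        for left, right in zip(copies, copies[1:])
    )


if __name__ == "__main__":
    print(is_approximate_tandem_repeat(["ACGT", "ACGA", "AGA"], max_dist=1))
```

Comparing neighbouring copies, rather than every copy against a single consensus, lets the pattern drift gradually along the array; whether that matches a given tool's definition has to be checked against its documentation.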
Accurate typing of short tandem repeats from genome-wide sequencing data and its applications.

PubMed

Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D

2015-05-01

Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

PubMed

Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

2013-01-30

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

PubMed Central

2013-01-01

Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
APE1 incision activity at abasic sites in tandem repeat sequences.

PubMed

Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M

2014-05-29

Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.
[Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].

PubMed

Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou

2002-01-01

To select polymorphic short tandem repeat markers within X-linked nuclear protein (XNP) gene, genomic clones which contain XNP gene were recognized by homologous analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from non-exonic sequences by using BCM search Launcher. Polymorphisms of the short tandem repeats in Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified from XNP gene, two of which were polymorphic. Four and 11 alleles were observed in Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 localized within 3' end and intron 10, respectively. Two polymorphic short tandem repeats have been identified within XNP gene and will be useful for linkage analysis and gene diagnosis of XNP gene.
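For readers unfamiliar with heterozygosity figures like those above, the Python sketch below computes the expected heterozygosity of a locus from allele counts, H = 1 - Σ p_i², where p_i is the frequency of allele i. The abstract does not say whether its values are observed or expected heterozygosities and gives no allele frequencies, so the counts in the example are invented for illustration only.

```python
from typing import Sequence


def expected_heterozygosity(allele_counts: Sequence[int]) -> float:
    """Expected heterozygosity, 1 - sum(p_i^2), from per-allele counts at one locus."""
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("allele counts must sum to a positive number")
    return 1.0 - sum((count / total) ** 2 for count in allele_counts)


if __name__ == "__main__":
    # Invented counts for a four-allele locus; not data from the study.
    print(round(expected_heterozygosity([70, 15, 10, 5]), 3))
```

A locus with many alleles at similar frequencies approaches the upper bound 1 - 1/k for k alleles, which is why an 11-allele marker can be more informative than a 4-allele one.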
Sequence repeats and protein structure

NASA Astrophysics Data System (ADS)

Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos

2012-11-01

Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
Fingerprinting of Cyanobacteria Based on PCR with Primers Derived from Short and Long Tandemly Repeated Repetitive Sequences

PubMed Central

Rasmussen, Ulla; Svenning, Mette M.

1998-01-01

The presence of repeated DNA (short tandemly repeated repetitive [STRR] and long tandemly repeated repetitive [LTRR]) sequences in the genome of cyanobacteria was used to generate a fingerprint method for symbiotic and free-living isolates. Primers corresponding to the STRR and LTRR sequences were used in the PCR, resulting in a method which generates specific fingerprints for individual isolates. The method was useful both with purified DNA and with intact cyanobacterial filaments or cells as templates for the PCR. Twenty-three Nostoc isolates from a total of 35 were symbiotic isolates from the angiosperm Gunnera species, including isolates from the same Gunnera species as well as from different species. The results show a genetic similarity among isolates from different Gunnera species as well as a genetic heterogeneity among isolates from the same Gunnera species. Isolates which have been postulated to be closely related or identical revealed similar results by the PCR method, indicating that the technique is useful for clustering of even closely related strains. The method was applied to nonheterocystous cyanobacteria from which a fingerprint pattern was obtained. PMID:16349487
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats. The numbers of
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

PubMed

Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

1999-10-01

This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
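Divergence percentages such as those quoted above are typically derived from pairwise comparisons of aligned monomers. The Python sketch below computes a simple uncorrected percent divergence between two pre-aligned sequences of equal length, skipping any column that contains a gap. The abstract does not state how the authors computed their values, so this is an assumed, generic calculation; the function name and toy sequences are illustrative.

```python
def percent_divergence(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of differing positions between two aligned sequences.

    Columns where either sequence has a gap character are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    differences = sum(x != y for x, y in compared)
    return 100.0 * differences / len(compared)


if __name__ == "__main__":
    print(percent_divergence("ACGTACGTAC", "ACGTACGTAA"))
```

For a 145-bp monomer, a divergence of 6% to 11% corresponds to roughly 9 to 16 differing positions per pair.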
Rational design of alpha-helical tandem repeat proteins with closed architectures

PubMed Central

Doyle, Lindsey; Hallinan, Jazmine; Bolduc, Jill; Parmeggiani, Fabio; Baker, David; Stoddard, Barry L.; Bradley, Philip

2015-01-01

Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials [1,2]. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks [3,4]. The overall architecture of tandem repeat protein structures – which is dictated by the internal geometry and local packing of the repeat building blocks – is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners [5–9], to closed and compact conformations with internal cavities suitable for small molecule binding and catalysis [10]. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. We have applied these methods to design a series of closed alpha-solenoid [11] repeat structures (alpha-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the N- and C-termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering [12–20], our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed alpha-solenoid repeats with a left-handed helical architecture that – to our knowledge – is not yet present in the protein structure database [21]. PMID:26675735

Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population.

PubMed

Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao

2018-05-01

Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We have analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to either the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we have also selected 218 suitable STR loci with four basepair repeat units and 53 loci with five basepair repeat units for both short-read sequencing and PCR-based technologies, which would be candidates for practical forensic DNA typing in the Japanese population.
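The idea of an STR locus with a two to six basepair repeat unit can be illustrated with a naive scan in Python. The sketch below uses a regular expression with a back-reference to find exact, perfect repeats and skips homopolymer runs. It is not the tool used in the study, which is not named in this abstract, and exact matching on a single sequence says nothing about polymorphism across individuals. The function name and the `min_copies` default are assumptions.

```python
import re
from typing import List, Tuple


def find_strs(seq: str, min_copies: int = 3) -> List[Tuple[int, str, int]]:
    """Return (0-based start, repeat unit, copy number) for perfect 2-6 bp repeats."""
    if min_copies < 2:
        raise ValueError("min_copies must be at least 2")
    # Group 1 is the candidate unit, tried shortest first; \1 demands further exact copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # a homopolymer run such as AAAAAA is not a 2-6 bp STR
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    toy = "TT" + "GATA" * 5 + "TT" + "CA" * 4 + "GG"
    print(find_strs(toy))
```

Genotyping from short reads additionally has to handle sequencing errors, interrupted repeats and reads that do not span the whole locus, which is where dedicated tools come in.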
A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

PubMed

Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

PubMed

Richard, François D; Kajava, Andrey V

2014-06-01

The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
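The abstract above describes a fast pre-filter based on short strings recurring in adjacent motifs. The following Python sketch shows one simple way such a filter could work; it is not TRDistiller itself, and the function names, the k-mer length, the gap limit and the threshold are all assumptions made for illustration.

```python
def recurrence_score(protein, k=3, max_gap=50):
    """Fraction of k-mers that already occurred within max_gap residues upstream."""
    n = len(protein) - k + 1
    if n <= 0:
        return 0.0
    last_seen = {}   # k-mer -> most recent start position
    recurring = 0
    for i in range(n):
        kmer = protein[i:i + k]
        j = last_seen.get(kmer)
        if j is not None and i - j <= max_gap:
            recurring += 1
        last_seen[kmer] = i
    return recurring / n

def passes_filter(protein, threshold=0.2, k=3, max_gap=50):
    """Keep a sequence for slower, more sensitive TR detection (hypothetical cutoff)."""
    return recurrence_score(protein, k=k, max_gap=max_gap) >= threshold

repeat_rich = "GAPGSPGLT" * 6
print(round(recurrence_score(repeat_rich), 3), passes_filter(repeat_rich))
```

A single linear pass like this is cheap, which is the point of a pre-filter: it only needs to discard sequences that clearly lack repeats before an expensive detector runs.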
A TALE-inspired computational screen for proteins that contain approximate tandem repeats

PubMed Central

Krwawicz, Joanna

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
Tandem Repeat Proteins Inspired By Squid Ring Teeth

NASA Astrophysics Data System (ADS)

Pena-Francesch, Abdon

Proteins are large biomolecules consisting of long chains of amino acids that hierarchically assemble into complex structures, and provide a variety of building blocks for biological materials. The repetition of structural building blocks is a natural evolutionary strategy for increasing the complexity and stability of protein structures. However, the relationship between amino acid sequence, structure, and material properties of protein systems remains unclear due to the lack of control over the protein sequence and the intricacies of the assembly process. In order to investigate the repetition of protein building blocks, a recently discovered protein from squids is examined as an ideal protein system. Squid ring teeth are predatory appendages located inside the suction cups that provide a strong grasp of prey, and are solely composed of a group of proteins with tandem repetition of building blocks. The objective of this thesis is to understand, for the first time, the relationship between sequence, structure and properties in repetitive protein materials inspired by squid ring teeth. Specifically, this work focuses on squid-inspired structural proteins with tandem repeat units in their sequence (i.e., repetition of alternating building blocks) that are physically cross-linked via beta-sheet structures. The research work presented here tests the hypothesis that, in these systems, increasing the number of building blocks in the polypeptide chain decreases the protein network defects and improves the material properties. Hence, the sequence, nanostructure, and properties (thermal, mechanical, and conducting) of tandem repeat squid-inspired protein materials are examined. Spectroscopic structural analysis, advanced materials characterization, and entropic elasticity theory are combined to elucidate the structure and material properties of these repetitive proteins. This approach is applied not only to native squid proteins but also to squid-inspired synthetic polypeptides
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.

PubMed

Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru

2015-01-01

The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and are used in a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of slippage of the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio - more than other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the positions of the last SSR nucleotides. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

USDA-ARS's Scientific Manuscript database

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres consist of megabase-scale arrays of tandem repeats. The true prevalence of centromere tandem repeats, and whether they exhibit conserved seque...
Stabilization of perfect and imperfect tandem repeats by single-strand DNA exonucleases

PubMed Central

Feschenko, Vladimir V.; Rajman, Luis A.; Lovett, Susan T.

2003-01-01

Rearrangements between tandemly repeated DNA sequences are a common source of genetic instability. Such rearrangements underlie several human genetic diseases. In many organisms, the mismatch-repair (MMR) system functions to stabilize repeats when the repeat unit is short or when sequence imperfections are present between the repeats. We show here that the action of single-stranded DNA (ssDNA) exonucleases plays an additional, important role in stabilizing tandem repeats, independent of their role in MMR. For perfect repeats of ≈100 bp in Escherichia coli that are not susceptible to MMR, exonuclease (Exo)-I, ExoX, and RecJ exonuclease redundantly inhibit deletion. Our data suggest that >90% of potential deletion events are avoided by the combined action of these three exonucleases. Imperfect tandem repeats, less prone to rearrangements, are stabilized by both the MMR-pathway and ssDNA-specific exonucleases. For 100-bp repeats containing four mispairs, ExoI alone aborts most deletion events, even in the presence of a functional MMR system. By genetic analysis, we show that the inhibitory effect of ssDNA exonucleases on deletion formation is independent of the MutS and UvrD proteins. Exonuclease degradation of DNA displaced during the deletion process may abort slipped misalignment. Exonuclease action is therefore a significant force in genetic stabilization of many forms of repetitive DNA. PMID:12538867
Variable Number Of Tandem Repeats (VNTR) and its application in bacterial epidemiology.

PubMed

Ramazanzadeh, Rashid; McNerney, Ruth

2007-08-15

Molecular epidemiology is the use of molecular techniques to study bacterial distribution in human populations. Recently, molecular epidemiologists have benefited from several techniques, such as the Variable Number Tandem Repeat (VNTR) typing method, for typing bacterial strains. VNTR typing is a tool for genotyping and provides data in a simple, numeric format based on the number of repetitive sequences. VNTRs were first identified in M. tuberculosis as Mycobacterial Interspersed Repeat Units (MIRUs). More generally, VNTRs have now been reported in Bacillus anthracis, Legionella pneumophila, Pseudomonas aeruginosa, Salmonella enterica and Escherichia coli O157.
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

PubMed Central

Glunčić, Matko; Paar, Vladimir

2013-01-01

The main feature of global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of complete set of a K-string ensemble which enables a new method of direct mapping of symbolic DNA sequence into frequency domain, with straightforward identification of repeats as peaks in GRM diagram. In this way, we obtain very fast, efficient and highly automatized repeat finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of Period 3 pattern in exons, implementation of ‘magnifying glass’ effect, identification of complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). PMID:22977183
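The idea of reading repeat periods off as peaks can be illustrated with a much simpler relative of the method described above. The Python sketch below is not the GRM algorithm; it only histograms the distances between successive occurrences of each k-mer, so that a tandem array shows up as a peak at its repeat length. The function name and the choice of `k` are assumptions.

```python
from collections import Counter

def kmer_distance_spectrum(seq, k=8):
    """Count distances between consecutive occurrences of identical k-mers."""
    last_pos = {}          # k-mer -> start of its most recent occurrence
    spectrum = Counter()   # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_pos:
            spectrum[i - last_pos[kmer]] += 1
        last_pos[kmer] = i
    return spectrum

# Toy tandem array: five copies of a 10-bp unit.
spectrum = kmer_distance_spectrum("ACGTTGCAAG" * 5, k=4)
print(spectrum.most_common(1))
```

On real chromosomes the histogram would contain many peaks, and substitutions or indels would blur them, which is where a more robust method such as the one in the abstract becomes necessary.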
Molecular tandem repeat strategy for elucidating mechanical properties of high-strength proteins

PubMed Central

Jung, Huihun; Pena-Francesch, Abdon; Saadat, Alham; Sebastian, Aswathy; Kim, Dong Hwan; Hamilton, Reginald F.; Albert, Istvan; Allen, Benjamin D.; Demirel, Melik C.

2016-01-01

Many globular and structural proteins have repetitions in their sequences or structures. However, a clear relationship between these repeats and their contribution to the mechanical properties remains elusive. We propose a new approach for the design and production of synthetic polypeptides that comprise one or more tandem copies of a single unit with distinct amorphous and ordered regions. Our designed sequences are based on a structural protein produced in squid suction cups that has a segmented copolymer structure with amorphous and crystalline domains. We produced segmented polypeptides with varying repeat number, while keeping the lengths and compositions of the amorphous and crystalline regions fixed. We showed that mechanical properties of these synthetic proteins could be tuned by modulating their molecular weights. Specifically, the toughness and extensibility of synthetic polypeptides increase as a function of the number of tandem repeats. This result suggests that the repetitions in native squid proteins could have a genetic advantage for increased toughness and flexibility. PMID:27222581
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae.

PubMed

Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta

2012-11-07

Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences clade according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae

PubMed Central

2012-01-01

Background Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences clade according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the
RepeatsDB-lite: a web server for unit annotation of tandem repeat proteins.

PubMed

Hirsh, Layla; Paladin, Lisanna; Piovesan, Damiano; Tosatto, Silvio C E

2018-05-09

RepeatsDB-lite (http://protein.bio.unipd.it/repeatsdb-lite) is a web server for the prediction of repetitive structural elements and units in tandem repeat (TR) proteins. TRs are a widespread but poorly annotated class of non-globular proteins carrying heterogeneous functions. RepeatsDB-lite extends the prediction to all TR types and strongly improves the performance both in terms of computational time and accuracy over previous methods, with precision above 95% for solenoid structures. The algorithm exploits an improved TR unit library derived from the RepeatsDB database to perform an iterative structural search and assignment. The web interface provides tools for analyzing the evolutionary relationships between units and manually refine the prediction by changing unit positions and protein classification. An all-against-all structure-based sequence similarity matrix is calculated and visualized in real-time for every user edit. Reviewed predictions can be submitted to RepeatsDB for review and inclusion.
Tandem Repeats in Proteins: Prediction Algorithms and Biological Role.

PubMed

Pellegrini, Marco

2015-01-01

Tandem repetitions in protein sequence and structure is a fascinating subject of research which has been a focus of study since the late 1990s. In this survey, we give an overview on the multi-faceted aspects of research on protein tandem repeats (PTR for short), including prediction algorithms, databases, early classification efforts, mechanisms of PTR formation and evolution, and synthetic PTR design. We also touch on the rather open issue of the relationship between PTR and flexibility (or disorder) in proteins. Detection of PTR either from protein sequence or structure data is challenging due to inherent high (biological) signal-to-noise ratio that is a key feature of this problem. As early in silico analytic tools have been key enablers for starting this field of study, we expect that current and future algorithmic and statistical breakthroughs will have a high impact on the investigations of the biological role of PTR.
Variable-Number Tandem Repeats That Are Useful in Genotyping Isolates of Salmonella enterica subsp. enterica Serovars Typhimurium and Newport

PubMed Central

Witonski, D.; Stefanova, R.; Ranganathan, A.; Schutze, G. E.; Eisenach, K. D.; Cave, M. D.

2006-01-01

The genome of Salmonella enterica subsp. enterica serovar Typhimurium strain LT2 was analyzed for direct repeats, and 54 sequences containing variable-number tandem repeat loci were identified. Ten primer pairs that anneal upstream and downstream of each selected locus were designed and used to amplify PCR targets in isolates of S. enterica serovars Typhimurium and Newport. Four of the 10 loci did not show polymorphism in the length of products. Six loci were selected for analysis. Isolates of S. enterica serovars Typhimurium and Newport that were related to specific outbreaks and showed identical pulsed-field gel electrophoresis patterns were indistinguishable by the length of the six variable-number tandem repeats. Isolates that differed in their pulsed-field gel electrophoresis patterns showed polymorphism in variable-number tandem repeat profiles. Length of the products was confirmed by DNA sequence analysis. Only 2 of the 10 loci contained exact integers of the direct repeat. Eight loci contained partial copies. The partial copies were maintained at the ends of the variable-number tandem repeat loci in all isolates. In spite of having partial copies that were maintained in all isolates, the number of direct repeats at a locus was polymorphic. Six variable-number tandem repeat loci were useful in distinguishing isolates of S. enterica serovars Typhimurium and Newport that had different pulsed-field gel electrophoresis patterns and in identifying outbreak-associated cases that shared a common pulsed-field gel pattern. PMID:16943354
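Since the abstract above types isolates by PCR product length and notes that most loci carry partial copies, a small worked calculation may help. The Python sketch below splits the repeat region of an amplicon into whole copies plus a leftover partial copy; the function name and all numbers are hypothetical and are not taken from the study.

```python
def vntr_copies(amplicon_len, flank_len, unit_len):
    """Return (full_copies, partial_bp) for a VNTR amplicon.

    amplicon_len: PCR product length in bp
    flank_len:    combined length of both flanks, including primers, in bp
    unit_len:     length of one repeat unit in bp
    """
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    repeat_region = amplicon_len - flank_len
    if repeat_region < 0:
        raise ValueError("amplicon is shorter than its flanks")
    # Whole copies plus the leftover bases of a partial copy, if any.
    return divmod(repeat_region, unit_len)

# Hypothetical locus: 350-bp product, 110 bp of flanks, 33-bp unit.
print(vntr_copies(350, 110, 33))
```

A nonzero remainder corresponds to the partial copies described in the abstract; if it is constant across isolates, differences in product length still reflect whole-unit changes.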
Genome-wide analysis of tandem repeats in plants and green algae

Treesearch

Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

2014-01-01

Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...
Medium-sized tandem repeats represent an abundant component of the Drosophila virilis genome.

PubMed

Abdurashitov, Murat A; Gonchar, Danila A; Chernukhin, Valery A; Tomilov, Victor N; Tomilova, Julia E; Schostak, Natalia G; Zatsepina, Olga G; Zelentsova, Elena S; Evgen'ev, Michael B; Degtyarev, Sergey K H

2013-11-09

Previously, we developed a simple method for carrying out a restriction enzyme analysis of eukaryotic DNA in silico, based on the known DNA sequences of the genomes. This method allows the user to calculate the lengths of all DNA fragments that are formed after a whole genome is digested at the theoretical recognition sites of a given restriction enzyme. A comparison of the observed peaks in distribution diagrams with the results from DNA cleavage using several restriction enzymes performed in vitro has shown good correspondence between the theoretical and experimental data in several cases. Here, we applied this approach to the annotated genome of Drosophila virilis, which is extremely rich in various repeats, and explored the combined approach to perform the restriction analysis of D. virilis DNA. This approach revealed three abundant medium-sized tandem repeats within the D. virilis genome. While the 225 bp repeats were revealed previously in intergenic non-transcribed spacers between ribosomal genes of D. virilis, two other families comprising 154 bp and 172 bp repeats had not been described. A Tandem Repeats Finder search demonstrated that the 154 bp and 172 bp units are organized in multiple clusters in the genome of D. virilis. Characteristically, only the 154 bp repeats, derived from a Helitron transposon, are transcribed. Using in silico digestion in combination with conventional restriction analysis and sequencing of repeated DNA fragments enabled us to isolate and characterize three highly abundant families of medium-sized repeats present in the D. virilis genome. These repeats comprise a significant portion of the genome and may have important roles in genome function and structural integrity. Therefore, we demonstrated an approach that makes it possible to investigate in detail the gross arrangement and expression of medium-sized repeats based on sequencing data, even in the case of incompletely assembled and/or annotated genomes.
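The in silico digestion described above can be sketched in a few lines of Python. This is an illustrative toy, not the authors' software: it assumes a linear sequence, a single palindromic recognition site (so only one strand needs scanning), and a cut position given as an offset into the site. The function name and the example sequence are hypothetical.

```python
import re
from collections import Counter

def digest_lengths(genome, site, cut_offset):
    """Fragment lengths after cutting a linear sequence at every site occurrence."""
    # The lookahead finds overlapping occurrences; each cut falls cut_offset bases into the site.
    cuts = [m.start() + cut_offset
            for m in re.finditer("(?=%s)" % re.escape(site), genome)]
    bounds = [0] + cuts + [len(genome)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Toy tandem array: four copies of a 16-bp unit, each with one EcoRI site (G^AATTC).
array = "GAATTCACGTACGTAC" * 4
lengths = digest_lengths(array, "GAATTC", 1)
print(lengths, Counter(lengths).most_common(1))
```

A tandem repeat that carries one site per unit yields many fragments of the unit length, which is why abundant medium-sized repeats appear as peaks in the fragment-length distribution.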
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 +/- 5% (mean +/- SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
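The "mean +/- SD over all pairwise comparisons" statistic quoted above can be reproduced in outline as follows. This Python sketch assumes the monomers are already aligned to equal length and counts only mismatched columns; the use of the sample standard deviation and the toy sequences are assumptions, not details from the study.

```python
from itertools import combinations
from statistics import mean, stdev

def percent_difference(a, b):
    """Percent of aligned positions at which two equal-length monomers differ."""
    if len(a) != len(b):
        raise ValueError("monomers must be aligned to the same length")
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)

def pairwise_divergence(monomers):
    """Mean and sample SD of percent difference over all monomer pairs."""
    diffs = [percent_difference(a, b) for a, b in combinations(monomers, 2)]
    return mean(diffs), stdev(diffs)

# Three hypothetical 10-bp monomers (real Hsr-omega monomers are ~280 bp).
toy = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAC"]
print(pairwise_divergence(toy))
```

Gapped alignments would need a rule for scoring indel columns, which this sketch deliberately leaves out.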

STRBase: a short tandem repeat DNA database for the human identity testing community

PubMed Central

Ruitberg, Christian M.; Reeder, Dennis J.; Butler, John M.

2001-01-01

The National Institute of Standards and Technology (NIST) has compiled and maintained a Short Tandem Repeat DNA Internet Database (http://www.cstl.nist.gov/biotech/strbase/) since 1997 commonly referred to as STRBase. This database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. STRBase consolidates and organizes the abundant literature on this subject to facilitate on-going efforts in DNA typing. Observed alleles and annotated sequence for each STR locus are described along with a review of STR analysis technologies. Additionally, commercially available STR multiplex kits are described, published polymerase chain reaction (PCR) primer sequences are reported, and validation studies conducted by a number of forensic laboratories are listed. To supplement the technical information, addresses for scientists and hyperlinks to organizations working in this area are available, along with the comprehensive reference list of over 1300 publications on STRs used for DNA typing purposes. PMID:11125125
A naturally occurring, noncanonical GTP aptamer made of simple tandem repeats

PubMed Central

Curtis, Edward A; Liu, David R

2014-01-01

Recently, we used in vitro selection to identify a new class of naturally occurring GTP aptamer called the G motif. Here we report the discovery and characterization of a second class of naturally occurring GTP aptamer, the “CA motif.” The primary sequence of this aptamer is unusual in that it consists entirely of tandem repeats of CA-rich motifs as short as three nucleotides. Several active variants of the CA motif aptamer lack the ability to form consecutive Watson-Crick base pairs in any register, while others consist of repeats containing only cytidine and adenosine residues, indicating that noncanonical interactions play important roles in its structure. The circular dichroism spectrum of the CA motif aptamer is distinct from that of A-form RNA and other major classes of nucleic acid structures. Bioinformatic searches indicate that the CA motif is absent from most archaeal and bacterial genomes, but occurs in at least 70 percent of approximately 400 eukaryotic genomes examined. These searches also uncovered several phylogenetically conserved examples of the CA motif in rodent (mouse and rat) genomes. Together, these results reveal the existence of a second class of naturally occurring GTP aptamer whose sequence requirements, like that of the G motif, are not consistent with those of a canonical secondary structure. They also indicate a new and unexpected potential biochemical activity of certain naturally occurring tandem repeats. PMID:24824832
Chicken microsatellite markers isolated from libraries enriched for simple tandem repeats.

PubMed

Gibbs, M; Dawson, D A; McCamley, C; Wardle, A F; Armour, J A; Burke, T

1997-12-01

The total number of microsatellite loci is considered to be at least 10-fold lower in avian species than in mammalian species. Therefore, efficient large-scale cloning of chicken microsatellites, as required for the construction of a high-resolution linkage map, is facilitated by the construction of libraries using an enrichment strategy. In this study, a plasmid library enriched for tandem repeats was constructed from chicken genomic DNA by hybridization selection. Using this technique the proportion of recombinant clones that cross-hybridized to probes containing simple tandem repeats was raised to 16%, compared with < 0.1% in a non-enriched library. Primers were designed from 121 different sequences. Polymerase chain reaction (PCR) analysis of two chicken reference pedigrees enabled 72 loci to be localized within the collaborative chicken genetic map, and at least 30 of the remaining loci have been shown to be informative in these or other crosses.
Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies.

PubMed Central

Wincker, P; Jubier-Maurin, V; Roizès, G

1987-01-01

Some full length members of the mouse long interspersed repeated DNA family L1Md have been shown to be associated at their 5' end with a variable number of tandem repetitions, the A repeats, that have been suggested to be transcription controlling elements. We report that the other type of repeat, named F, found at the 5' end of a few L1 elements is also an integral part of full length L1 copies. Sequencing shows that the F repeats are GC rich, and organized in tandem. The L1 copies associated with either A or F repeats can be correlated with two different subsets of L1 sequences distinguished by a series of variant nucleotides specific to each and by unassociated but frequent restriction sites. These findings suggest that sequence replacement has occurred at least once in 5' of L1Md, and is related to the generation of specific subfamilies. PMID:3684566
An examination of the origin and evolution of additional tandem repeats in the mitochondrial DNA control region of Japanese sika deer (Cervus nippon).

PubMed

Ba, Hengxing; Wu, Lang; Liu, Zongyue; Li, Chunyi

2016-01-01

Tandem repeat units are only detected in the left domain of the mitochondrial DNA control region in sika deer. Previous studies showed that Japanese sika deer have more tandem repeat units than their cousins from the Asian continent and Taiwan, which often have only three repeat units. To determine the origin and evolution of these additional repeat units in Japanese sika deer, we obtained the sequence of repeat units from an expanded dataset of the control region from all sika deer lineages. The functional constraint is inferred to act on the first repeat unit because this repeat has the least sequence divergence in comparison to the other units. Based on slipped-strand mispairing mechanisms, the illegitimate elongation model could account for the addition or deletion of these additional repeat units in the Japanese sika deer population. We also report that these additional repeat units could be occurring in the internal positions of tandem repeat regions, possibly via coupling with a homogenization mechanism within and among these lineages. Moreover, the increased number of repeat units in the Japanese sika deer population could reflect a balance between mutation and selection, as well as genetic drift.
A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

PubMed

Do, Hoang Dang Khoa; Kim, Joo-Hwan

2017-01-01

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation event and double strand break repair occurred and induced the formation of initial repeat units which are indispensable in the SSM process. SSM may have likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). Collectively, these findings add new evidence of dynamic
Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jackson, P.J.; Walthers, E.A.; Richmond, K.L.

1997-04-01

PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
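The link between the 12-bp repeat, the QYQQ tract, and the reported allele frequencies can be checked with a short sketch. The Python below is illustrative only: the function names and the reduced codon table are assumptions made for this example, the reading frame is assumed to start at the first base of the repeat as written, and none of it is taken from the original study.

```python
# Illustrative sketch; helper names are hypothetical, not from the study.
REPEAT = "CAATATCAACAA"  # 12-bp VNTR unit quoted in the abstract

# Only the two codons that occur in the repeat (standard genetic code).
CODONS = {"CAA": "Q", "TAT": "Y"}


def translate(dna):
    """Translate an in-frame DNA string using the reduced codon table."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def allele_counts(frequencies_percent, total):
    """Convert percent frequencies to the nearest whole number of isolates."""
    return [round(f * total / 100) for f in frequencies_percent]


# One repeat unit, then the shortest allele (two copies).
print(translate(REPEAT))
print(translate(REPEAT * 2))

# Reported frequencies (shorter to longer alleles) among 198 isolates.
counts = allele_counts([6.1, 17.7, 59.6, 5.6, 11.1], 198)
print(counts, sum(counts))
```

Because each repeat unit is a whole number of codons, adding or removing copies changes only the number of QYQQ units and leaves the downstream reading frame intact, which matches the abstract's statement.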
Functional centromeres in Astragalus sinicus include a compact centromere-specific histone H3 and a 20-bp tandem repeat.

PubMed

Tek, Ahmet L; Kashihara, Kazunari; Murata, Minoru; Nagaki, Kiyotaka

2011-11-01

The centromere plays an essential role for proper chromosome segregation during cell division and usually harbors long arrays of tandemly repeated satellite DNA sequences. Although this function is conserved among eukaryotes, the sequences of centromeric DNA repeats are variable. Most of our understanding of functional centromeres, which are defined by localization of a centromere-specific histone H3 (CENH3) protein, comes from model organisms. The components of the functional centromere in legumes are poorly known. The genus Astragalus is a member of the legumes and bears the largest numbers of species among angiosperms. Therefore, we studied the components of centromeres in Astragalus sinicus. We identified the CenH3 homolog of A. sinicus, AsCenH3, which is the most compact in size among higher eukaryotes. A CENH3-based assay revealed the functional centromeric DNA sequences from A. sinicus, called CentAs. The CentAs repeat is localized in A. sinicus centromeres, and comprises an AT-rich tandem repeat with a monomer size of 20 nucleotides.
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by
Rare Sequence Variation in the Genome Flanking a Short Tandem Repeat Locus Can Lead to a Question of “Nonmaternity”

PubMed Central

Deucher, Anne; Chiang, Tsoyu; Schrijver, Iris

2010-01-01

Typing of STR (short tandem repeat) alleles is used in a variety of applications in clinical molecular pathology, including evaluations for maternal cell contamination. Using a commercially available STR typing assay for maternal cell contamination performed in conjunction with prenatal diagnostic testing, we were posed with apparent nonmaternity when the two fetal samples did not demonstrate the expected maternal allele at one locus. By designing primers external to the region amplified by the primers from the commercial assay and by performing direct sequencing of the resulting amplicon, we were able to determine that a guanine to adenine sequence variation led to primer mismatch and allele dropout. This explained the apparent null allele shared between the maternal and fetal samples. Therefore, although rare, allele dropout must be considered whenever unexplained homozygosity at an STR locus is observed. PMID:20203001
Tandem-repeat protein domains across the tree of life.

PubMed

Jernigan, Kristin K; Bordenstein, Seth R

2015-01-01

Tandem-repeat protein domains, composed of repeated units of conserved stretches of 20-40 amino acids, are required for a wide array of biological functions. Despite their diverse and fundamental functions, there has been no comprehensive assessment of their taxonomic distribution, incidence, and associations with organismal lifestyle and phylogeny. In this study, we assess for the first time the abundance of armadillo (ARM) and tetratricopeptide (TPR) repeat domains across all three domains in the tree of life and compare the results to our previous analysis on ankyrin (ANK) repeat domains in this journal. All eukaryotes and a majority of the bacterial and archaeal genomes analyzed have a minimum of one TPR and ARM repeat. In eukaryotes, the fraction of ARM-containing proteins is approximately double that of TPR and ANK-containing proteins, whereas bacteria and archaea are enriched in TPR-containing proteins relative to ARM- and ANK-containing proteins. We show in bacteria that phylogenetic history, rather than lifestyle or pathogenicity, is a predictor of TPR repeat domain abundance, while neither phylogenetic history nor lifestyle predicts ARM repeat domain abundance. Surprisingly, pathogenic bacteria were not enriched in TPR-containing proteins, which have been associated within virulence factors in certain species. Taken together, this comparative analysis provides a newly appreciated view of the prevalence and diversity of multiple types of tandem-repeat protein domains across the tree of life. A central finding of this analysis is that tandem repeat domain-containing proteins are prevalent not just in eukaryotes, but also in bacterial and archaeal species.
Tandem-repeat protein domains across the tree of life

PubMed Central

Jernigan, Kristin K.

2015-01-01

Tandem-repeat protein domains, composed of repeated units of conserved stretches of 20–40 amino acids, are required for a wide array of biological functions. Despite their diverse and fundamental functions, there has been no comprehensive assessment of their taxonomic distribution, incidence, and associations with organismal lifestyle and phylogeny. In this study, we assess for the first time the abundance of armadillo (ARM) and tetratricopeptide (TPR) repeat domains across all three domains in the tree of life and compare the results to our previous analysis on ankyrin (ANK) repeat domains in this journal. All eukaryotes and a majority of the bacterial and archaeal genomes analyzed have a minimum of one TPR and ARM repeat. In eukaryotes, the fraction of ARM-containing proteins is approximately double that of TPR and ANK-containing proteins, whereas bacteria and archaea are enriched in TPR-containing proteins relative to ARM- and ANK-containing proteins. We show in bacteria that phylogenetic history, rather than lifestyle or pathogenicity, is a predictor of TPR repeat domain abundance, while neither phylogenetic history nor lifestyle predicts ARM repeat domain abundance. Surprisingly, pathogenic bacteria were not enriched in TPR-containing proteins, which have been associated within virulence factors in certain species. Taken together, this comparative analysis provides a newly appreciated view of the prevalence and diversity of multiple types of tandem-repeat protein domains across the tree of life. A central finding of this analysis is that tandem repeat domain-containing proteins are prevalent not just in eukaryotes, but also in bacterial and archaeal species. PMID:25653910
Characterization of a tandemly repeated DNA sequence family originally derived by retroposition of tRNA(Glu) in the newt.

PubMed

Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N

1991-11-20

A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynops pyrrhogaster resulted in a discrete sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the PolIII transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.
5meCpG epigenetic marks neighboring a primate-conserved core promoter short tandem repeat indicate X-chromosome inactivation.

PubMed

Machado, Filipe Brum; Machado, Fabricio Brum; Faria, Milena Amendro; Lovatel, Viviane Lamim; Alves da Silva, Antonio Francisco; Radic, Claudia Pamela; De Brasi, Carlos Daniel; Rios, Álvaro Fabricio Lopes; de Sousa Lopes, Susana Marina Chuva; da Silveira, Leonardo Serafim; Ruiz-Miranda, Carlos Ramon; Ramos, Ester Silveira; Medina-Acosta, Enrique

2014-01-01

X-chromosome inactivation (XCI) is the epigenetic transcriptional silencing of an X-chromosome during the early stages of embryonic development in female eutherian mammals. XCI assures monoallelic expression in each cell and compensation for dosage-sensitive X-linked genes between females (XX) and males (XY). DNA methylation at the carbon-5 position of the cytosine pyrimidine ring in the context of a CpG dinucleotide sequence (5meCpG) in promoter regions is a key epigenetic marker for transcriptional gene silencing. Using computational analysis, we revealed an extragenic tandem GAAA repeat 230-bp from the landmark CpG island of the human X-linked retinitis pigmentosa 2 RP2 promoter whose 5meCpG status correlates with XCI. We used this RP2 onshore tandem GAAA repeat to develop an allele-specific 5meCpG-based PCR assay that is highly concordant with the human androgen receptor (AR) exonic tandem CAG repeat-based standard HUMARA assay in discriminating active (Xa) from inactive (Xi) X-chromosomes. The RP2 onshore tandem GAAA repeat contains neutral features that are lacking in the AR disease-linked tandem CAG repeat, is highly polymorphic (heterozygosity rates approximately 0.8) and shows minimal variation in the Xa/Xi ratio. The combined informativeness of RP2/AR is approximately 0.97, and this assay excels at determining the 5meCpG status of alleles at the Xp (RP2) and Xq (AR) chromosome arms in a single reaction. These findings are relevant and directly translatable to nonhuman primate models of XCI in which the AR CAG-repeat is monomorphic. We conducted the RP2 onshore tandem GAAA repeat assay in the naturally occurring chimeric New World monkey marmoset (Callitrichidae) and found it to be informative. The RP2 onshore tandem GAAA repeat will facilitate studies on the variable phenotypic expression of dominant and recessive X-linked diseases, epigenetic changes in twins, the physiology of aging hematopoiesis, the pathogenesis of age-related hematopoietic
Identification of Variable-Number Tandem-Repeat (VNTR) Sequences in Acinetobacter baumannii and Interlaboratory Validation of an Optimized Multiple-Locus VNTR Analysis Typing Scheme

PubMed Central

Pourcel, Christine; Minandri, Fabrizia; Hauck, Yolande; D'Arezzo, Silvia; Imperi, Francesco; Vergnaud, Gilles; Visca, Paolo

2011-01-01

Acinetobacter baumannii is an important opportunistic pathogen responsible for nosocomial outbreaks, mostly occurring in intensive care units. Due to the multiplicity of infection sources, reliable molecular fingerprinting techniques are needed to establish epidemiological correlations among A. baumannii isolates. Multiple-locus variable-number tandem-repeat analysis (MLVA) has proven to be a fast, reliable, and cost-effective typing method for several bacterial species. In this study, an MLVA assay compatible with simple PCR- and agarose gel-based electrophoresis steps as well as with high-throughput automated methods was developed for A. baumannii typing. Preliminarily, 10 potential polymorphic variable-number tandem repeats (VNTRs) were identified upon bioinformatic screening of six annotated genome sequences of A. baumannii. A collection of 7 reference strains plus 18 well-characterized isolates, including unique types and representatives of the three international A. baumannii lineages, was then evaluated in a two-center study aimed at validating the MLVA assay and comparing it with other genotyping assays, namely, macrorestriction analysis with pulsed-field gel electrophoresis (PFGE) and PCR-based sequence group (SG) profiling. The results showed that MLVA can discriminate between isolates with identical PFGE types and SG profiles. A panel of eight VNTR markers was selected, all showing the ability to be amplified and good amounts of polymorphism in the majority of strains. Independently generated MLVA profiles, composed of an ordered string of allele numbers corresponding to the number of repeats at each VNTR locus, were concordant between centers. Typeability, reproducibility, stability, discriminatory power, and epidemiological concordance were excellent. A database containing information and MLVA profiles for several A. baumannii strains is available from http://mlva.u-psud.fr/. PMID:21147956
TRStalker: an efficient heuristic for finding fuzzy tandem repeats.

PubMed

Pellegrini, Marco; Renda, M Elena; Vecchio, Alessio

2010-06-15

Genomes in higher eukaryotic organisms contain a substantial amount of repeated sequences. Tandem Repeats (TRs) constitute a large class of repetitive sequences that originate via phenomena such as replication slippage and are characterized by close spatial contiguity. They play an important role in several molecular regulatory mechanisms, and also in several diseases (e.g. in the group of trinucleotide repeat disorders). While for TRs with a low or medium level of divergence the current methods are rather effective, the problem of detecting TRs with higher divergence (fuzzy TRs) is still open. The detection of fuzzy TRs is propaedeutic to enriching our view of their role in regulatory mechanisms and diseases. Fuzzy TRs are also important as tools to shed light on the evolutionary history of the genome, where higher divergence correlates with more remote duplication events. We have developed an algorithm (christened TRStalker) with the aim of efficiently detecting TRs that are hard to detect because of their inherent fuzziness, due to high levels of base substitutions, insertions and deletions. To attain this goal, we developed heuristics to solve a Steiner version of the problem for which the fuzziness is measured with respect to a motif string not necessarily present in the input string. This problem is akin to the 'generalized median string' that is known to be an NP-hard problem. Experiments with both synthetic and biological sequences demonstrate that our method performs better than the current state of the art for fuzzy TRs and that the fuzzy TRs of the type we detect are indeed present in important biological sequences. TRStalker will be integrated in the web-based TRs Discovery Service (TReaDS) at bioalgo.iit.cnr.it. Supplementary data are available at Bioinformatics online.
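The idea of measuring fuzziness against a motif that need not occur in the input can be illustrated with a small sketch. The Python below is not the TRStalker heuristic; it only scores already-segmented repeat units against a candidate motif using plain Levenshtein edit distance (substitutions, insertions and deletions). The function names, the toy units and the normalisation by motif length are all assumptions made for this illustration.

```python
# Illustrative sketch only; this is NOT the TRStalker algorithm.

def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


def mean_divergence(units, motif):
    """Average edit distance per motif position over all repeat units."""
    total = sum(edit_distance(u, motif) for u in units)
    return total / (len(units) * len(motif))


# Hypothetical fuzzy repeat: the motif ACGT never appears exactly,
# yet every unit is one edit away from it.
units = ["ACTT", "AGT", "ACGGT"]
print(mean_divergence(units, "ACGT"))
```

In this toy case no unit equals the motif, which is the situation the abstract calls a Steiner version of the problem: the best reference string has to be searched for rather than read off the sequence.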
Multiple-Locus Variable-Number Tandem-Repeat Analysis in Genotyping Yersinia enterocolitica Strains from Human and Porcine Origins

PubMed Central

Laukkanen-Ninios, R.; Ortiz Martínez, P.; Siitonen, A.; Fredriksson-Ahomaa, M.; Korkeala, H.

2013-01-01

Sporadic and epidemiologically linked Yersinia enterocolitica strains (n = 379) isolated from fecal samples from human patients, tonsil or fecal samples from pigs collected at slaughterhouses, and pork samples collected at meat stores were genotyped using multiple-locus variable-number tandem-repeat analysis (MLVA) with six loci, i.e., V2A, V4, V5, V6, V7, and V9. In total, 312 different MLVA types were found. Similar types were detected (i) in fecal samples collected from human patients over 2 to 3 consecutive years, (ii) in samples from humans and pigs, and (iii) in samples from pigs that originated from the same farms. Among porcine strains, we found farm-specific MLVA profiles. Variations in the numbers of tandem repeats from one to four for variable-number tandem-repeat (VNTR) loci V2A, V5, V6, and V7 were observed within a farm. MLVA was applicable for serotypes O:3, O:5,27, and O:9 and appeared to be a highly discriminating tool for distinguishing sporadic and outbreak-related strains. With long-term use, interpretation of the results became more challenging due to variations in more-discriminating loci, as was observed for strains originating from pig farms. Additionally, we encountered unexpectedly short V2A VNTR fragments and sequenced them. According to the sequencing results, updated guidelines for interpreting V2A VNTR results were prepared. PMID:23637293
Versatile communication strategies among tandem WW domain repeats

PubMed Central

Dodson, Emma Joy; Fishbain-Yoskovitz, Vered; Rotem-Bamberger, Shahar

2015-01-01

Interactions mediated by short linear motifs in proteins play major roles in regulation of cellular homeostasis since their transient nature allows for easy modulation. We are still far from a full understanding and appreciation of the complex regulation patterns that can be, and are, achieved by this type of interaction. The fact that many linear-motif-binding domains occur in tandem repeats in proteins indicates that their mutual communication is used extensively to obtain complex integration of information toward regulatory decisions. This review is an attempt to overview, and classify, different ways by which two and more tandem repeats cooperate in binding to their targets, in the well-characterized family of WW domains and their corresponding polyproline ligands. PMID:25710931
De novo generation of plant centromeres at tandem repeats.

PubMed

Teo, Chee How; Lermontova, Inna; Houben, Andreas; Mette, Michael Florian; Schubert, Ingo

2013-06-01

Artificial minichromosomes are highly desirable tools for basic research, breeding, and biotechnology purposes. We present an option to generate plant artificial minichromosomes via de novo engineering of plant centromeres in Arabidopsis thaliana by targeting kinetochore proteins to tandem repeat arrays at non-centromeric positions. We employed the bacterial lactose repressor/lactose operator system to guide derivatives of the centromeric histone H3 variant cenH3 to LacO operator sequences. Tethering of cenH3 to non-centromeric loci led to de novo assembly of kinetochore proteins and to dicentric carrier chromosomes which potentially form anaphase bridges. This approach will be further developed and may contribute to generating minichromosomes from preselected genomic regions, potentially even in a diploid background.

Microevolution of Pandemic Vibrio parahaemolyticus Assessed by the Number of Repeat Units in Short Sequence Tandem Repeat Regions

PubMed Central

García, Katherine; Gavilán, Ronnie G.; Höfle, Manfred G.; Martínez-Urtaza, Jaime; Espejo, Romilio T.

2012-01-01

The emergence of the pandemic strain Vibrio parahaemolyticus O3:K6 in 1996 caused a large increase of diarrhea outbreaks related to seafood consumption in Southeast Asia, and later worldwide. Isolates of this strain constitute a clonal complex, and their effectual differentiation is possible by comparison of their variable number tandem repeats (VNTRs). The differentiation of the isolates by the differences in VNTRs will allow inferring the population dynamics and microevolution of this strain but this requires knowing the rate and mechanism of VNTRs' variation. Our study of mutants obtained after serial cultivation of clones showed that mutation rates of the six VNTRs examined are on the order of 10^-4 mutant per generation and that difference increases by stepwise addition of single mutations. The single stepwise mutation (SSM) was deduced because mutants with 1, 2, 3, or more repeat unit deletions or insertions follow a geometric distribution. Plausible phylogenetic trees are obtained when, according to SSM, the genetic distance between clusters with different number of repeats is assessed by the absolute differences in repeats. Using this approach, mutants originated from different isolates of pandemic V. parahaemolyticus after serial cultivation are clustered with their parental isolates. Additionally, isolates of pandemic V. parahaemolyticus from Southeast Asia, Tokyo, and northern and southern Chile are clustered according to their geographical origin. The deepest split in these four populations is observed between the Tokyo and southern Chile populations. We conclude that proper phylogenetic relations and successful tracing of pandemic V. parahaemolyticus require measuring the differences between isolates by the absolute number of repeats in the VNTRs considered. PMID:22292049
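The distance measure described above, the absolute difference in repeat numbers summed over loci, can be sketched as follows. This is a minimal illustration under stated assumptions: the six-locus profiles are invented for the example, the function names are hypothetical, and the categorical count is included only as a contrast, not as a method from the study.

```python
# Illustrative sketch; profiles and function names are hypothetical.

def ssm_distance(profile_a, profile_b):
    """Sum of absolute differences in repeat number across VNTR loci."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def categorical_distance(profile_a, profile_b):
    """Number of loci that differ, ignoring the size of the difference."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Repeat counts at six VNTR loci (made-up values).
parent = (10, 7, 12, 5, 8, 9)
mutant = (10, 8, 12, 5, 8, 9)    # one single-step change at locus 2
distant = (13, 7, 9, 5, 8, 9)    # three steps at each of two loci

print(ssm_distance(parent, mutant), categorical_distance(parent, mutant))
print(ssm_distance(parent, distant), categorical_distance(parent, distant))
```

Under a stepwise model the second comparison is six mutational steps from the parent, whereas a categorical comparison would register only two differing loci and so understate the divergence.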
Novel variable number of tandem repeats of gibbon MAOA gene and its evolutionary significance.

PubMed

Choi, Yuri; Jung, Yi-Deun; Ayarpadikannan, Selvam; Koga, Akihiko; Imai, Hiroo; Hirai, Hirohisa; Roos, Christian; Kim, Heui-Soo

2014-08-01

Variable number tandem repeats (VNTRs) are scattered throughout the primate genome, and genetic variation in these VNTRs has accumulated during primate radiation. Here, we analyzed VNTRs upstream of the monoamine oxidase A (MAOA) gene in 11 different gibbon species. An abundance of truncated VNTR sequences and copy number differences were observed compared to those of human VNTR sequences. To better understand the biological role of these VNTRs, a luciferase activity assay was conducted and results indicated that selected VNTR sequences of the MAOA gene from human and three different gibbon species (Hylobates klossii, Hylobates lar, and Nomascus concolor) showed silencing ability. Together, these data could be useful for understanding the evolutionary history and functional significance of MAOA VNTR sequences in gibbon species.
Interpreting short tandem repeat variations in humans using mutational constraint

PubMed Central

Gymrek, Melissa; Willems, Thomas; Reich, David; Erlich, Yaniv

2017-01-01

Identifying regions of the genome that are depleted of mutations can reveal potentially deleterious variants. Short tandem repeats (STRs), also known as microsatellites, are among the largest contributors of de novo mutations in humans. However, per-locus studies of STR mutations have been limited to highly ascertained panels of several dozen loci. Here, we harnessed bioinformatics tools and a novel analytical framework to estimate mutation parameters for each STR in the human genome by correlating STR genotypes with local sequence heterozygosity. We applied our method to obtain robust estimates of the impact of local sequence features on mutation parameters and used this to create a framework for measuring constraint at STRs by comparing observed vs. expected mutation rates. Constraint scores identified known pathogenic variants with early onset effects. Our metric will provide a valuable tool for prioritizing pathogenic STRs in medical genetics studies. PMID:28892063
The central domain of bovine submaxillary mucin consists of over 50 tandem repeats of 329 amino acids. Chromosomal localization of the BSM1 gene and relations to ovine and porcine counterparts.

PubMed

Jiang, W; Gupta, D; Gallagher, D; Davis, S; Bhavanandan, V P

2000-04-01

We previously elucidated five distinct protein domains (I-V) for bovine submaxillary mucin, which is encoded by two genes, BSM1 and BSM2. Using Southern blot analysis, genomic cloning and sequencing of the BSM1 gene, we now show that the central domain (V) consists of approximately 55 tandem repeats of 329 amino acids and that domains III-V are encoded by a 58.4-kb exon, the largest exon known for all genes to date. The BSM1 gene was mapped by fluorescence in situ hybridization to the proximal half of chromosome 5 at bands q2.2-q2.3. The amino-acid sequences of six tandem repeats (two full and four partial) were found to have only 92-94% identity. We propose that the variability in the amino-acid sequences of the mucin tandem repeat is important for generating the combinatorial library of saccharides that are necessary for the protective function of mucins. The deduced peptide sequences of the central domain match those determined from the purified bovine submaxillary mucin and also show 68-94% identity to published peptide sequences of ovine submaxillary mucin. This indicates that the core protein of ovine submaxillary mucin is closely related to that of bovine submaxillary mucin and contains similar tandem repeats in the central domain. In contrast, the central domain of porcine submaxillary mucin is reported to consist of 81-amino-acid tandem repeats. However, both bovine submaxillary mucin and porcine submaxillary mucin contain similar N-terminal and C-terminal domains and the corresponding genes are in the conserved linkage regions of the respective genomes.
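As a rough consistency check on the figures quoted in this abstract, the following Python sketch estimates how much of the 58.4-kb exon the repeat array alone would occupy. It assumes exactly 55 repeats (the abstract says "approximately 55") and three nucleotides per amino acid; the variable names are illustrative.

```python
REPEAT_COUNT = 55          # "approximately 55" in the abstract; exact value assumed
REPEAT_LENGTH_AA = 329     # amino acids per repeat
EXON_LENGTH_BP = 58_400    # 58.4-kb exon encoding domains III-V

repeat_region_aa = REPEAT_COUNT * REPEAT_LENGTH_AA   # amino acids in the array
repeat_region_bp = repeat_region_aa * 3              # three nucleotides per codon
fraction = repeat_region_bp / EXON_LENGTH_BP

print(repeat_region_aa)    # 18095
print(repeat_region_bp)    # 54285
print(f"{fraction:.0%}")   # 93%
```

On these assumptions the tandem repeats account for most of the exon, leaving roughly 4 kb for domains III and IV.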
The RNase P RNA from cyanobacteria: short tandemly repeated repetitive (STRR) sequences are present within the RNase P RNA gene in heterocyst-forming cyanobacteria.

PubMed Central

Vioque, A

1997-01-01

The RNase P RNA gene (rnpB) from 10 cyanobacteria has been characterized. These new RNAs, together with the previously available ones, provide a comprehensive data set of RNase P RNA from diverse cyanobacterial lineages. All heterocystous cyanobacteria, but none of the non-heterocystous strains analyzed, contain short tandemly repeated repetitive (STRR) sequences that increase the length of helix P12. Site-directed mutagenesis experiments indicate that the STRR sequences are not required for catalytic activity in vitro. STRR sequences seem to have recently and independently invaded the RNase P RNA genes in heterocyst-forming cyanobacteria because closely related strains contain unrelated STRR sequences. Most cyanobacteria RNase P RNAs lack the sequence GGU in the loop connecting helices P15 and P16 that has been established to interact with the 3'-end CCA in precursor tRNA substrates in other bacteria. This character is shared with plastid RNase P RNA. Helix P6 is longer than usual in most cyanobacteria as well as in plastid RNase P RNA. PMID:9254706
Production of monoclonal antibody, PR81, recognizing the tandem repeat region of MUC1 mucin.

PubMed

Paknejad, M; Rasaee, M J; Tehrani, F Karami; Kashanian, S; Mohagheghi, M A; Omidfar, K; Bazl, M Rajabi

2003-06-01

A monoclonal antibody (MAb) was generated by immunizing BALB/c mice with homogenized breast cancerous tissues. This antibody (PR81) was found to be of IgG(1) class and subclass, containing kappa light chain. PR81 reacted with either the membrane extracts of several breast cancerous tissues or the cell surface of some MUC1 positive cell lines (MCF-7, BT-20 and T-47D) tested by enzyme immunoassay and for MCF-7 by immunofluorescence method. PR81 also reacted with two synthetic 27 and 16-amino acid peptides, TSA-P1-24 and A-P1-15, respectively, which included the core tandem repeat sequence of MUC1. However, this antibody did not react with a synthetic 14 amino acid peptide that has no similarity with tandem repeat found in MUC1. The generated antibody had good and similar affinities (2.19 x 10(8) M(-1)) toward TSA-P1-24 and A-P1-15, which are mainly shared in the hydrophilic sequence of PDTRPAP. Through Western blot analysis of homogenized breast tissues, PR81 recognized only a major band of 250 kDa. This band is stronger in malignant tissue than benign and normal tissues.
Ten tandem repeats of β-hCG 109-118 enhance immunogenicity and anti-tumor effects of β-hCG C-terminal peptide carried by mycobacterial heat-shock protein HSP65

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang Yankai; Yan Rong; He Yi

2006-07-14

The β-subunit of human chorionic gonadotropin (β-hCG) is secreted by many kinds of tumors and it has been used as an ideal target antigen to develop vaccines against tumors. In view of the low immunogenicity of this self-peptide, we designed a method based on isocaudamer technique to repeat tandemly the 10-residue sequence X of β-hCG (109-118), then 10 tandemly repeated copies of the 10-residue sequence combined with β-hCG C-terminal 37 peptides were fused to mycobacterial heat-shock protein 65 to construct a fusion protein HSP65-X10-βhCGCTP37 as an immunogen. In this study, we examined the effect of the tandem repeats of this 10-residue sequence in eliciting an immune response by comparing the immunogenicity and anti-tumor effects of the two immunogens, HSP65-X10-βhCGCTP37 and HSP65-βhCGCTP37 (without the 10 tandem repeats). Immunization of mice with the fusion protein HSP65-X10-βhCGCTP37 elicited much higher levels of specific anti-β-hCG antibodies and more effectively inhibited the growth of Lewis lung carcinoma (LLC) in vivo than with HSP65-βhCGCTP37, which suggests that HSP65-X10-βhCGCTP37 may be an effective protein vaccine for the treatment of β-hCG-dependent tumors and that multiple tandem repeats of a certain epitope are an efficient method to overcome the low immunogenicity of self-peptide antigens.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
Development of Multiple-Locus Variable-Number Tandem-Repeat Analysis for Molecular Subtyping of Campylobacter jejuni by Using Capillary Electrophoresis

PubMed Central

Techaruvichit, Punnida; Vesaratchavest, Mongkol; Keeratipibul, Suwimon; Kuda, Takashi; Kimura, Bon

2015-01-01

Campylobacter jejuni is a common cause of the frequently reported food-borne diseases in developed and developing nations. This study describes the development of multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA) using capillary electrophoresis as a novel typing method for microbial source tracking and epidemiological investigation of C. jejuni. Among 36 tandem repeat loci detected by the Tandem Repeat Finder program, 7 VNTR loci were selected and used for characterizing 60 isolates recovered from chicken meat samples from retail shops, samples from chicken meat processing factory, and stool samples. The discrimination ability of MLVA was compared with that of multilocus sequence typing (MLST). MLVA (diversity index of 0.97 with 31 MLVA types) provided slightly higher discrimination than MLST (diversity index of 0.95 with 25 MLST types). The overall concordance between MLVA and MLST was estimated at 63% by adjusted Rand coefficient. MLVA predicted MLST type better than MLST predicted MLVA type, as reflected by Wallace coefficient (Wallace coefficient for MLVA to MLST versus MLST to MLVA, 86% versus 51%). MLVA is a useful tool and can be used for effective monitoring of C. jejuni and investigation of epidemics caused by C. jejuni. PMID:26025899
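The comparison statistics named in this abstract can be sketched in a few lines of Python. The sketch assumes the "diversity index" is Simpson's index in the Hunter-Gaston form commonly used for typing methods, which the abstract does not state explicitly; the function names and the six-isolate type assignments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def simpson_diversity(types):
    """Probability that two isolates drawn without replacement differ in type."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two isolates")
    same_type_pairs = sum(c * (c - 1) for c in Counter(types).values())
    return 1 - same_type_pairs / (n * (n - 1))


def wallace(types_a, types_b):
    """Of isolate pairs sharing a type under method A, the fraction that
    also share a type under method B."""
    if len(types_a) != len(types_b):
        raise ValueError("both methods must type the same isolates")
    same_a = same_both = 0
    for i, j in combinations(range(len(types_a)), 2):
        if types_a[i] == types_a[j]:
            same_a += 1
            if types_b[i] == types_b[j]:
                same_both += 1
    if same_a == 0:
        raise ValueError("method A groups no isolates together")
    return same_both / same_a


# Hypothetical typing results for six isolates (not data from the study).
mlva = ["A", "A", "B", "C", "D", "D"]
mlst = ["1", "1", "1", "2", "3", "3"]

print(round(simpson_diversity(mlva), 4))  # 0.8667
print(round(simpson_diversity(mlst), 4))  # 0.7333
print(wallace(mlva, mlst))                # 1.0
print(wallace(mlst, mlva))                # 0.5
```

The asymmetry of the Wallace coefficient in the toy data mirrors the direction reported in the abstract: the finer-grained method predicts the coarser one better than the reverse.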
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
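The component lengths listed in this abstract can be summed to check that they account for the stated 7 kb segment. The Python sketch below takes the copy numbers at face value (the 102 ntp set is given as "approximately 36 copies") and leaves out the two bounding tRNA genes, whose lengths are not given; the variable names are illustrative.

```python
# (copies, unit length in nucleotide pairs) for each repeat set.
repeat_sets = {
    "102 ntp": (36, 102),   # "approximately 36 copies"; exact value assumed
    "63 ntp": (11, 63),
    "8 ntp": (5, 8),
}
unique_segments = [1528, 1068]   # the two unique segments, in ntp

repeat_total = sum(copies * length for copies, length in repeat_sets.values())
segment_total = repeat_total + sum(unique_segments)

print(repeat_total)    # 4405
print(segment_total)   # 7001
```

The total of about 7,000 ntp agrees with the 7 kb figure, with the repeat sets making up a little under two thirds of the segment.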
Characterization of toxin-producing cyanobacteria by using an oligonucleotide probe containing a tandemly repeated heptamer.

PubMed Central

Rouhiainen, L; Sivonen, K; Buikema, W J; Haselkorn, R

1995-01-01

Cyanobacteria produce toxins that kill animals. The two main classes of cyanobacterial toxins are cyclic peptides that cause liver damage and alkaloids that block nerve transmission. Many toxin-producing strains from Finnish lakes were brought into axenic culture, and their toxins were characterized. Restriction fragment length polymorphism analysis, probing with a short tandemly repeated DNA sequence found at many locations in the chromosome of Anabaena sp. strain PCC 7120, distinguishes hepatotoxic Anabaena isolates from neurotoxin-producing strains and from Nostoc spp. PMID:7592362
Multilocus Variable-Number Tandem Repeat Typing of Mycobacterium ulcerans

PubMed Central

Ablordey, Anthony; Swings, Jean; Hubans, Christine; Chemlal, Karim; Locht, Camille; Portaels, Françoise; Supply, Philip

2005-01-01

The apparent genetic homogeneity of Mycobacterium ulcerans contributes to the poorly understood epidemiology of M. ulcerans infection. Here, we report the identification of variable number tandem repeat (VNTR) sequences as novel polymorphic elements in the genome of this species. A total of 19 potential VNTR loci identified in the closely related M. marinum genome sequence were screened in a collection of 23 M. ulcerans isolates, one Mycobacterium species referred to here as an intermediate species, and five M. marinum strains. Nine of the 19 loci were polymorphic in the three species (including the intermediate species) and revealed eight M. ulcerans and five M. marinum genotypes. The results from the VNTR analysis corroborated the genetic relationships of M. ulcerans isolates from various geographical origins, as defined by independent molecular markers. Although these results further highlight the extremely high clonal homogeneity within certain geographic regions, we report for the first time the discrimination of the two South American strains from Surinam and French Guyana. These findings support the potential of a VNTR-based genotyping method for strain discrimination within M. ulcerans and M. marinum. PMID:15814964
MSDB: A Comprehensive Database of Simple Sequence Repeats

PubMed Central

Avvaru, Akshay Kumar; Saxena, Saketh; Mishra, Rakesh Kumar

2017-01-01

Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1–6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. PMID:28854643
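To make the definition of an SSR concrete (a tandem repeat of a 1-6 nt motif), here is a minimal regex-based scan in Python. It is only an illustration of the definition, not the method used to build MSDB; the function name, the minimum copy threshold and the test sequences are assumptions.

```python
import re


def find_ssrs(sequence, min_copies=4, max_motif=6):
    """Return (start, motif, copies) for perfect tandem repeats of short motifs.

    The lazy quantifier makes the regex try the shortest motif first, so
    ATATATAT is reported as four copies of AT rather than two of ATAT.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_ssrs("GGCATATATATCCGT"))  # [(3, 'AT', 4)]
print(find_ssrs("TTAAAAAAGC"))       # [(2, 'A', 6)]
```

A scan like this finds only perfect repeats and reports non-overlapping hits from left to right; imperfect repeats would need a tolerance for mismatches.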
Whole genome evaluation of tandem repeat polymorphisms between two pathogenically similar strains of Xylella fastidiosa isolated from almond and grape in California

USDA-ARS's Scientific Manuscript database

Whole genome tandem repeat polymorphisms were evaluated between two closely related Xylella fastidiosa strains, M23 and Temecula1, both of which cause almond leaf scorch disease (ALSD) and grape Pierce’s disease (PD) in California. Strain M23 was isolated from almond and the genome was sequenced in this stu...
Tandem Repeated Irritation Test (TRIT) Studies and Clinical Relevance: Post 2006.

PubMed

Reddy, Rasika; Maibach, Howard

2018-06-11

Single or multiple applications of irritants can lead to occupational contact dermatitis, and most commonly irritant contact dermatitis (ICD). Tandem irritation, the sequential application of two irritants to a target skin area, has been studied using the Tandem Repeated Irritation Test (TRIT) to provide a more accurate representation of skin irritation. Here we present an update to Kartono's review on tandem irritation studies since 2006 [1]. We surveyed the literature available on PubMed, Embase, Google Scholar, and the UCSF Dermatology library databases since 2006. The studies included discuss the tandem effects of common chemical irritants, organic solvents, occlusion as well as clinical relevance - and enlarge our ability to discern whether multiple chemical exposures are more or less likely to enhance irritation.
Tandem repeat regions within the Burkholderia pseudomallei genome and their application for high resolution genotyping.

PubMed

U'Ren, Jana M; Schupp, James M; Pearson, Talima; Hornstra, Heidie; Friedman, Christine L Clark; Smith, Kimothy L; Daugherty, Rebecca R Leadem; Rhoton, Shane D; Leadem, Ben; Georgia, Shalamar; Cardon, Michelle; Huynh, Lynn Y; DeShazer, David; Harvey, Steven P; Robison, Richard; Gal, Daniel; Mayo, Mark J; Wagner, David; Currie, Bart J; Keim, Paul

2007-03-30

The facultative, intracellular bacterium Burkholderia pseudomallei is the causative agent of melioidosis, a serious infectious disease of humans and animals. We identified and categorized tandem repeat arrays and their distribution throughout the genome of B. pseudomallei strain K96243 in order to develop a genetic typing method for B. pseudomallei. We then screened 104 of the potentially polymorphic loci across a diverse panel of 31 isolates including B. pseudomallei, B. mallei and B. thailandensis in order to identify loci with varying degrees of polymorphism. A subset of these tandem repeat arrays were subsequently developed into a multiple-locus VNTR analysis to examine 66 B. pseudomallei and 21 B. mallei isolates from around the world, as well as 95 lineages from a serial transfer experiment encompassing ~18,000 generations. B. pseudomallei contains a preponderance of tandem repeat loci throughout its genome, many of which are duplicated elsewhere in the genome. The majority of these loci are composed of repeat motif lengths of 6 to 9 bp with 4 to 10 repeat units and are predominately located in intergenic regions of the genome. Across geographically diverse B. pseudomallei and B. mallei isolates, the 32 VNTR loci displayed between 7 and 28 alleles, with Nei's diversity values ranging from 0.47 to 0.94. Mutation rates for these loci are comparable (>10(-5) per locus per generation) to that of the most diverse tandemly repeated regions found in other less diverse bacteria. The frequency, location and duplicate nature of tandemly repeated regions within the B. pseudomallei genome indicate that these tandem repeat regions may play a role in generating and maintaining adaptive genomic variation. Multiple-locus VNTR analysis revealed extensive diversity within the global isolate set containing B. pseudomallei and B. mallei, and it detected genotypic differences within clonal lineages of both species that were identical using previous typing methods. Given the health
Detecting long tandem duplications in genomic sequences.

PubMed

Audemard, Eric; Schiex, Thomas; Faraut, Thomas

2012-05-08

Detecting duplication segments within completely sequenced genomes provides valuable information to address genome evolution and in particular the important question of the emergence of novel functions. The usual approach to gene duplication detection, based on all-pairs protein gene comparisons, provides only a restricted view of duplication. In this paper, we introduce ReD Tandem, software that uses a flow based chaining algorithm targeted at detecting tandem duplication arrays of moderate to longer length regions, with possibly locally weak similarities, directly at the DNA level. On the A. thaliana genome, using a reference set of tandem duplicated genes built using TAIR, we show that ReD Tandem is able to predict a large fraction of recently duplicated genes (dS < 1) and that it is also able to predict tandem duplications involving non coding elements such as pseudo-genes or RNA genes. ReD Tandem can identify large tandem duplications without any annotation, leading to agnostic identification of tandem duplications. This approach nicely complements the usual protein gene based approach, which ignores duplications involving non coding regions. It is however inherently restricted to relatively recent duplications. By recovering otherwise ignored events, ReD Tandem gives a more comprehensive view of existing evolutionary processes and may also help improve existing annotations.
Tandem repeats of the 5' non-transcribed spacer of Tetrahymena rDNA function as high copy number autonomous replicons in the macronucleus but do not prevent rRNA gene dosage regulation.

PubMed Central

Pan, W J; Blackburn, E H

1995-01-01

The rRNA genes in the somatic macronucleus of Tetrahymena thermophila are normally on 21 kb linear palindromic molecules (rDNA). We examined the effect on rRNA gene dosage of transforming T. thermophila macronuclei with plasmid constructs containing a pair of tandemly repeated rDNA replication origin regions unlinked to the rRNA gene. A significant proportion of the plasmid sequences were maintained as high copy circular molecules, eventually consisting solely of tandem arrays of origin regions. As reported previously for cells transformed by a construct in which the same tandem rDNA origins were linked to the rRNA gene [Yu, G.-L. and Blackburn, E. H. (1990) Mol. Cell. Biol., 10, 2070-2080], origin sequences recombined to form linear molecules bearing several tandem repeats of the origin region, as well as rRNA genes. The total number of rDNA origin sequences eventually exceeded rRNA gene copies by approximately 20- to 40-fold and the number of circular replicons carrying only rDNA origin sequences exceeded rRNA gene copies by 2- to 3-fold. However, the rRNA gene dosage was unchanged. Hence, simply monitoring the total number of rDNA origin regions is not sufficient to regulate rRNA gene copy number. PMID:7784211
Analysis of an "off-ladder" allele at the Penta D short tandem repeat locus.

PubMed

Yang, Y L; Wang, J G; Wang, D X; Zhang, W Y; Liu, X J; Cao, J; Yang, S L

2015-11-25

Kinship testing of a father and his son from Guangxi, China, the location of the Zhuang minority people, was performed using the PowerPlex® 18D System with a short tandem repeat typing kit. The results indicated that both the father and his son had an off-ladder allele at the Penta D locus, with a genetic size larger than that of the maximal standard allelic ladder. To further identify this locus, monogenic amplification, gene cloning, and genetic sequencing were performed. Sequencing analysis demonstrated that the fragment size of the Penta D-OL locus was 469 bp and the core sequence was [AAAGA]21, also called Penta D-21. The rare Penta D-21 allele was found to be distributed among the Zhuang population from the Guangxi Zhuang Autonomous Region of China; therefore, this study improved the range of DNA data available for this locus and enhanced our ability for individual identification of gene loci.
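The allele designation in this abstract (Penta D-21, core sequence [AAAGA]21) reflects the number of consecutive copies of the pentanucleotide motif. A minimal Python sketch of that count follows; the function name and the short flanking sequences are hypothetical placeholders and are not the real Penta D flanks or the 469 bp fragment.

```python
import re


def longest_tandem_run(sequence, motif):
    """Largest number of back-to-back copies of motif found in sequence."""
    runs = re.findall(r"(?:%s)+" % re.escape(motif), sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical flanking sequences around 21 copies of the core motif.
allele = "GATTC" + "AAAGA" * 21 + "CCTGA"

print(longest_tandem_run(allele, "AAAGA"))  # 21
print(longest_tandem_run("GATTACA", "AAAGA"))  # 0
```

In practice allele calls come from fragment sizing against an allelic ladder, with sequencing used, as here, to confirm the repeat structure of an off-ladder allele.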
MSDB: A Comprehensive Database of Simple Sequence Repeats.

PubMed

Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar

2017-06-01

Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

The proliferation marker pKi-67 becomes masked to MIB-1 staining after expression of its tandem repeats.

PubMed

Schmidt, Mirko H H; Broll, Rainer; Bruch, Hans-Peter; Duchrow, Michael

2002-11-01

The Ki-67 antigen, pKi-67, is one of the most commonly used markers of proliferating cells. The protein can only be detected in dividing cells (G(1)-, S-, G(2)-, and M-phase) but not in quiescent cells (G(0)). The standard antibody to detect pKi-67 is MIB-1, which detects the so-called 'Ki-67 motif' FKELF in 9 of the protein's 16 tandem repeats. To investigate the function of these repeats we expressed three of them in an inducible gene expression system in HeLa cells. Surprisingly, addition of a nuclear localization sequence led to a complete absence of signal in the nuclei of MIB-1-stained cells. At the same time antibodies directed against different epitopes of pKi-67 did not fail to detect the protein. We conclude that the overexpression of the 'Ki-67 motif', which is present in the repeats, can lead to inability of MIB-1 to detect its antigen as demonstrated in adenocarcinoma tissue samples. Thereafter, in order to prevent the underestimation of Ki-67 proliferation indices in MIB-1-labeled preparations, additional antibodies (for example, MIB-21) should be used. Additionally, we could show in a mammalian two-hybrid assay that recombinant pKi-67 repeats are capable of self-associating with endogenous pKi-67. Speculating that the tandem repeats are intimately involved in its protein-protein interactions, this offers new insights in how access to these repeats is regulated by pKi-67 itself.
An improved genome assembly uncovers prolific tandem repeats in Atlantic cod.

PubMed

Tørresen, Ole K; Star, Bastiaan; Jentoft, Sissel; Reinar, William B; Grove, Harald; Miller, Jason R; Walenz, Brian P; Knight, James; Ekholm, Jenny M; Peluso, Paul; Edvardsen, Rolf B; Tooming-Klunderud, Ave; Skage, Morten; Lien, Sigbjørn; Jakobsen, Kjetill S; Nederbragt, Alexander J

2017-01-18

The first Atlantic cod (Gadus morhua) genome assembly published in 2011 was one of the early genome assemblies exclusively based on high-throughput 454 pyrosequencing. Since then, rapid advances in sequencing technologies have led to a multitude of assemblies generated for complex genomes, although many of these are of a fragmented nature with a significant fraction of bases in gaps. The development of long-read sequencing and improved software now enable the generation of more contiguous genome assemblies. By combining data from Illumina, 454 and the longer PacBio sequencing technologies, as well as integrating the results of multiple assembly programs, we have created a substantially improved version of the Atlantic cod genome assembly. The sequence contiguity of this assembly is increased fifty-fold and the proportion of gap-bases has been reduced fifteen-fold. Compared to other vertebrates, the assembly contains an unusually high density of tandem repeats (TRs). Indeed, retrospective analyses reveal that gaps in the first genome assembly were largely associated with these TRs. We show that 21% of the TRs across the assembly, 19% in the promoter regions and 12% in the coding sequences are heterozygous in the sequenced individual. The inclusion of PacBio reads combined with the use of multiple assembly programs drastically improved the Atlantic cod genome assembly by successfully resolving long TRs. The high frequency of heterozygous TRs within or in the vicinity of genes in the genome indicates a considerable standing genomic variation in Atlantic cod populations, which is likely of evolutionary importance.
Variability of CAG tandem repeats in exon 1 of the androgen receptor gene is not related with dog intersexuality.

PubMed

Nowacka-Woszuk, J; Switonski, M

2010-02-01

Numerous mutations of the human androgen receptor (AR) gene cause an intersexual phenotype, called the androgen insensitivity syndrome. The intersexual phenotype is also quite often diagnosed in dogs. The aim of this study was to conduct a comparative analysis of the entire coding sequence (eight exons) of the AR gene in healthy and four intersex dogs, as well as in three other canids (the red fox, arctic fox and Chinese raccoon dog). The coding sequence of the studied species appeared to be conserved (similarity above 97%) and polymorphism was found in exon 1 only. Altogether, 2 SNPs were identified in healthy dogs, 14 in red foxes, 16 in arctic foxes and 6 were found in Chinese raccoon dogs, respectively. Moreover, a variable number of tandem repeats (CAG and CAA), encoding an array of glutamines, was also observed in this exon. The CAA codon numbers were invariable within species, but the CAG repeats were polymorphic. The highest number of the CAG and CAA repeats was found in dogs (from 40 to 42) and the observed variability was similar in intersex and healthy dogs. In the other canids the variability fell within the following ranges: 29-37 (red fox), 37-39 (arctic fox) and 29-32 (Chinese raccoon dog). In addition, a polymorphic microsatellite marker in intron 2 was found in the dog, red fox and Chinese raccoon dog. It was concluded that the polymorphism level of the AR gene in the dog was lower than in the other canids and none of the detected polymorphisms, including variability of the CAG tandem repeats, could be related with the intersexual phenotype of the studied dogs.
The evolution and function of protein tandem repeats in plants.

PubMed

Schaper, Elke; Anisimova, Maria

2015-04-01

Sequence tandem repeats (TRs) are abundant in proteomes across all domains of life. For plants, little is known about their distribution or contribution to protein function. We exhaustively annotated TRs and studied the evolution of TR unit variations for all Ensembl plants. Using phylogenetic patterns of TR units, we detected conserved TRs with unit number and order preserved during evolution, and those TRs that have diverged via recent TR unit gains/losses. We correlated the mode of evolution of TRs to protein function. TR number was strongly correlated with proteome size, with about one-half of all TRs recognized as common protein domains. The majority of TRs have been highly conserved over long evolutionary distances, some since the separation of red algae and green plants c. 1.6 billion yr ago. Conversely, recurrent recent TR unit mutations were rare. Our results suggest that the first TRs by far predate the first plants, and that TR appearance is an ongoing process with similar rates across the plant kingdom. Interestingly, the few detected highly mutable TRs might provide a source of variation for rapid adaptation. In particular, such TRs are enriched in leucine-rich repeats (LRRs) commonly found in R genes, where TR unit gain/loss may facilitate resistance to emerging pathogens. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
Effect of Repeat Copy Number on Variable-Number Tandem Repeat Mutations in Escherichia coli O157:H7

PubMed Central

Vogler, Amy J.; Keys, Christine; Nemoto, Yoshimi; Colman, Rebecca E.; Jay, Zack; Keim, Paul

2006-01-01

Variable-number tandem repeat (VNTR) loci have shown a remarkable ability to discriminate among isolates of the recently emerged clonal pathogen Escherichia coli O157:H7, making them a very useful molecular epidemiological tool. However, little is known about the rates at which these sequences mutate, the factors that affect mutation rates, or the mechanisms by which mutations occur at these loci. Here, we measure mutation rates for 28 VNTR loci and investigate the effects of repeat copy number and mismatch repair on mutation rate using in vitro-generated populations for 10 E. coli O157:H7 strains. We find single-locus rates as high as 7.0 × 10(-4) mutations/generation and a combined 28-locus rate of 6.4 × 10(-4) mutations/generation. We observed single- and multirepeat mutations that were consistent with a slipped-strand mispairing mutation model, as well as a smaller number of large repeat copy number mutations that were consistent with recombination-mediated events. Repeat copy number within an array was strongly correlated with mutation rate both at the most mutable locus, O157-10 (r2 = 0.565, P = 0.0196), and across all mutating loci. The combined locus model was significant whether locus O157-10 was included (r2 = 0.833, P < 0.0001) or excluded (r2 = 0.452, P < 0.0001) from the analysis. Deficient mismatch repair did not affect mutation rate at any of the 28 VNTRs with repeat unit sizes of >5 bp, although a poly(G) homomeric tract was destabilized in the mutS strain. Finally, we describe a general model for VNTR mutations that encompasses insertions and deletions, single- and multiple-repeat mutations, and their relative frequencies based upon our empirical mutation rate data. PMID:16740932
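The r2 values quoted in this abstract summarize how much of the variation in mutation rate is explained by repeat copy number. As a reminder of what that statistic measures, the Python sketch below computes r2 for a simple linear fit, which equals the squared Pearson correlation. The function name and the input values are hypothetical; the study's actual model and data are not reproduced here.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical repeat copy numbers and mutation rates (arbitrary units).
copy_number = [1, 2, 3, 4]
mutation_rate = [1, 3, 2, 4]

print(r_squared(copy_number, mutation_rate))  # 0.64
```

An r2 of 0.64 in the toy data would mean that copy number accounts for 64% of the variance in mutation rate under a straight-line model.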
De novo protein sequencing by combining top-down and bottom-up tandem mass spectra.

PubMed

Liu, Xiaowen; Dekker, Lennard J M; Wu, Si; Vanduijn, Martijn M; Luider, Theo M; Tolić, Nikola; Kou, Qiang; Dvorkin, Mikhail; Alexandrova, Sonya; Vyatkina, Kira; Paša-Tolić, Ljiljana; Pevzner, Pavel A

2014-07-03

There are two approaches for de novo protein sequencing: Edman degradation and mass spectrometry (MS). Existing MS-based methods characterize a novel protein by assembling tandem mass spectra of overlapping peptides generated from multiple proteolytic digestions of the protein. Because each tandem mass spectrum covers only a short peptide of the target protein, the key to high coverage protein sequencing is to find spectral pairs from overlapping peptides in order to assemble tandem mass spectra to long ones. However, overlapping regions of peptides may be too short to be confidently identified. High-resolution mass spectrometers have become accessible to many laboratories. These mass spectrometers are capable of analyzing molecules of large mass values, boosting the development of top-down MS. Top-down tandem mass spectra cover whole proteins. However, top-down tandem mass spectra, even combined, rarely provide full ion fragmentation coverage of a protein. We propose an algorithm, TBNovo, for de novo protein sequencing by combining top-down and bottom-up MS. In TBNovo, a top-down tandem mass spectrum is utilized as a scaffold, and bottom-up tandem mass spectra are aligned to the scaffold to increase sequence coverage. Experiments on data sets of two proteins showed that TBNovo achieved high sequence coverage and high sequence accuracy.
Whole genome sequencing of Salmonella Typhimurium illuminates distinct outbreaks caused by an endemic multi-locus variable number tandem repeat analysis type in Australia, 2014.

PubMed

Phillips, Anastasia; Sotomayor, Cristina; Wang, Qinning; Holmes, Nadine; Furlong, Catriona; Ward, Kate; Howard, Peter; Octavia, Sophie; Lan, Ruiting; Sintchenko, Vitali

2016-09-15

Salmonella Typhimurium (STM) is an important cause of foodborne outbreaks worldwide. Subtyping of STM remains critical to outbreak investigation, yet current techniques (e.g. multilocus variable number tandem repeat analysis, MLVA) may provide insufficient discrimination. Whole genome sequencing (WGS) offers potentially greater discriminatory power to support infectious disease surveillance. We performed WGS on 62 STM isolates of a single, endemic MLVA type associated with two epidemiologically independent, food-borne outbreaks along with sporadic cases in New South Wales, Australia, during 2014. Genomes of case and environmental isolates were sequenced using HiSeq (Illumina) and the genetic distance between them was assessed by single nucleotide polymorphism (SNP) analysis. SNP analysis was compared to the epidemiological context. The WGS analysis supported epidemiological evidence and genomes of within-outbreak isolates were nearly identical. Sporadic cases differed from outbreak cases by a small number of SNPs, although their close relationship to outbreak cases may represent an unidentified common food source that may warrant further public health follow up. Previously unrecognised mini-clusters were detected. WGS of STM can discriminate foodborne community outbreaks within a single endemic MLVA clone. Our findings support the translation of WGS into public health laboratory surveillance of salmonellosis.
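The SNP-based genetic distance described above can be illustrated with a minimal sketch. It assumes the genomes have already been aligned to equal length and simply counts differing, unambiguous bases; the isolate names and sequences are hypothetical, and this is not the pipeline used in the study.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different A/C/G/T bases.

    Positions with an ambiguous or missing call (anything other than
    A, C, G or T in either sequence) are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in valid and y in valid and x != y
    )


def pairwise_snp_distances(isolates):
    """Return {(name_1, name_2): distance} for every pair of isolates."""
    return {
        (p, q): snp_distance(isolates[p], isolates[q])
        for p, q in combinations(sorted(isolates), 2)
    }


# Hypothetical toy alignment; real analyses use genome-wide SNP calls.
example_isolates = {
    "outbreak_A1": "ACGTACGTAC",
    "outbreak_A2": "ACGTACGTAC",
    "sporadic_1": "ACGTACGAAC",
    "other": "TCGAACGANC",
}

for pair, dist in pairwise_snp_distances(example_isolates).items():
    print(pair, dist)
```

In this toy data the two outbreak isolates are identical and the sporadic isolate differs from them by a single SNP, which mirrors the qualitative pattern the abstract reports.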
Reverse Transcription Errors and RNA-DNA Differences at Short Tandem Repeats.

PubMed

Fungtammasan, Arkarachai; Tomaszkiewicz, Marta; Campos-Sánchez, Rebeca; Eckert, Kristin A; DeGiorgio, Michael; Makova, Kateryna D

2016-10-01

Transcript variation has important implications for organismal function in health and disease. Most transcriptome studies focus on assessing variation in gene expression levels and isoform representation. Variation at the level of transcript sequence is caused by RNA editing and transcription errors, and leads to nongenetically encoded transcript variants, or RNA-DNA differences (RDDs). Such variation has been understudied, in part because its detection is obscured by reverse transcription (RT) and sequencing errors. It has only been evaluated for intertranscript base substitution differences. Here, we investigated transcript sequence variation for short tandem repeats (STRs). We developed the first maximum-likelihood estimator (MLE) to infer RT error and RDD rates, taking next generation sequencing error rates into account. Using the MLE, we empirically evaluated RT error and RDD rates for STRs in a large-scale DNA and RNA replicated sequencing experiment conducted in a primate species. The RT error rates increased exponentially with STR length and were biased toward expansions. The RDD rates were approximately 1 order of magnitude lower than the RT error rates. The RT error rates estimated with the MLE from a primate data set were concordant with those estimated with an independent method, barcoded RNA sequencing, from a Caenorhabditis elegans data set. Our results have important implications for medical genomics, as STR allelic variation is associated with >40 diseases. STR nonallelic transcript variation can also contribute to disease phenotype. The MLE and empirical rates presented here can be used to evaluate the probability of disease-associated transcripts arising due to RDD. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain a similar overall fold. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single- and two-domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Short tandem repeat analysis in Japanese population.

PubMed

Hashiyada, M

2000-01-01

Short tandem repeats (STRs), also known as microsatellites, are among the most informative genetic markers for characterizing biological materials. Because of the relatively small size of STR alleles (generally 100-350 nucleotides), amplification by polymerase chain reaction (PCR) is relatively easy, affording a high sensitivity of detection. In addition, STR loci can be amplified simultaneously in a multiplex PCR. Thus, substantial information can be obtained in a single analysis, with the benefits of using less template DNA, reducing labor, and reducing contamination. We investigated 14 STR loci in a Japanese population living in Sendai using three multiplex PCR kits: the GenePrint PowerPlex 1.1 and 2.2 Fluorescent STR Systems (Promega, Madison, WI, USA) and the AmpFlSTR Profiler (Perkin-Elmer, Norwalk, CT, USA). Genomic DNA was extracted using sodium dodecyl sulfate (SDS) proteinase K or Chelex 100 treatment followed by phenol/chloroform extraction. PCR was performed according to the manufacturers' protocols. Electrophoresis was carried out on an ABI 377 sequencer and the alleles were determined by GeneScan 2.0.2 software (Perkin-Elmer). For the 14 STR loci, statistical parameters indicated a relatively high rate, and no significant deviation from Hardy-Weinberg equilibrium was detected. We apply this STR system to paternity testing and forensic casework, e.g., personal identification in rape cases. This system is an effective tool in the forensic sciences for obtaining information on individual identification.
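As an illustration of what a Hardy-Weinberg check at a single STR locus involves, the sketch below computes a chi-square goodness-of-fit statistic from observed genotypes. The abstract does not say which test was used, so the chi-square approach, the function name, and the genotype data are all assumptions made for the example; forensic studies often prefer exact tests for multi-allelic loci.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hwe_chi_square(genotypes):
    """Chi-square statistic comparing observed genotype counts at one locus
    with the counts expected under Hardy-Weinberg equilibrium.

    `genotypes` is a list of (allele_1, allele_2) pairs, one per individual.
    """
    n = len(genotypes)
    # Allele frequencies estimated from the sample (2n allele copies in total).
    allele_counts = Counter(allele for g in genotypes for allele in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}
    # Order alleles within each genotype so (9, 10) and (10, 9) are pooled.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            expected = n * freqs[a] ** 2          # homozygote: n * p^2
        else:
            expected = 2 * n * freqs[a] * freqs[b]  # heterozygote: n * 2pq
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    return chi2


# Hypothetical sample: alleles are labelled by repeat number.
in_equilibrium = [(9, 9)] * 25 + [(9, 10)] * 50 + [(10, 10)] * 25
no_heterozygotes = [(9, 9)] * 50 + [(10, 10)] * 50

print(hwe_chi_square(in_equilibrium))
print(hwe_chi_square(no_heterozygotes))
```

The statistic would then be compared with a chi-square distribution whose degrees of freedom depend on the number of alleles; that step is omitted here to keep the sketch dependency-free.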
Tandem repeats analysis for the high resolution phylogenetic analysis of Yersinia pestis

PubMed Central

Pourcel, C; André-Mazeaud, F; Neubauer, H; Ramisse, F; Vergnaud, G

2004-01-01

Background Yersinia pestis, the agent of plague, is a young and highly monomorphic species. Three biovars, each one thought to be associated with the last three Y. pestis pandemics, have been defined based on biochemical assays. More recently, DNA based assays, including DNA sequencing, IS typing, DNA arrays, have significantly improved current knowledge on the origin and phylogenetic evolution of Y. pestis. However, these methods suffer either from a lack of resolution or from the difficulty to compare data. Variable number of tandem repeats (VNTRs) provides valuable polymorphic markers for genotyping and performing phylogenetic analyses in a growing number of pathogens and have given promising results for Y. pestis as well. Results In this study we have genotyped 180 Y. pestis isolates by multiple locus VNTR analysis (MLVA) using 25 markers. Sixty-one different genotypes were observed. The three biovars were distributed into three main branches, with some exceptions. In particular, the Medievalis phenotype is clearly heterogeneous, resulting from different mutation events in the napA gene. Antiqua strains from Asia appear to hold a central position compared to Antiqua strains from Africa. A subset of 7 markers is proposed for the quick comparison of a new strain with the collection typed here. This can be easily achieved using a Web-based facility, specifically set-up for running such identifications. Conclusion Tandem-repeat typing may prove to be a powerful complement to the existing phylogenetic tools for Y. pestis. Typing can be achieved quickly at a low cost in terms of consumables, technical expertise and equipment. The resulting data can be easily compared between different laboratories. The number and selection of markers will eventually depend upon the type and aim of investigations. PMID:15186506
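The quick comparison of a new strain against a typed collection can be sketched as a nearest-profile lookup. The example treats each MLVA profile as a tuple of repeat copy numbers and counts the loci at which two profiles differ; the strain names, the seven copy numbers, and the choice of a simple categorical distance are assumptions for illustration, not the method behind the Web-based facility mentioned above.

```python
def mlva_distance(profile_a, profile_b):
    """Number of loci at which two MLVA profiles have different copy numbers."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def closest_strains(query, collection):
    """Return (smallest distance, sorted names of strains at that distance)."""
    distances = {
        name: mlva_distance(query, profile)
        for name, profile in collection.items()
    }
    best = min(distances.values())
    return best, sorted(name for name, d in distances.items() if d == best)


# Hypothetical 7-marker profiles (repeat copy number at each locus).
collection = {
    "strain_1": (3, 5, 7, 2, 4, 6, 8),
    "strain_2": (3, 5, 7, 2, 4, 6, 9),
    "strain_3": (1, 5, 6, 2, 4, 3, 8),
}

new_isolate = (3, 5, 7, 2, 5, 6, 8)
print(closest_strains(new_isolate, collection))  # (1, ['strain_1'])
```

A categorical distance ignores how far apart two copy numbers are; a design that weights the size of the difference would be an equally plausible choice.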
Characterization of the patterns of polymorphism in a "cryptic repeat" reveals a novel type of hypervariable sequence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacobson, D.P.; Schmeling, P.; Sommer, S.S.

Alternating purine and pyrimidine repeats (RY(i)) are an abundant source of polymorphism. The subset with long tandem repeats of GT or AC (GT(i)) have been studied extensively, but cryptic RY(i) (i.e., no single tandem repeat predominates) have received little attention. The factor IX gene has a polymorphic cryptic RY(i) of 142-216 bp. Previously, there were four known polymorphic alleles, of the form AB, A_2B, A_2B_2, and A_3B_2, where A = (GT)(AC)_3(AT)_3(GT)(AT)_4 and B = A with an additional 3' AT dinucleotide. To further characterize this locus, the authors examined more than 1,700 additional human chromosomes and determined the sequences of the homologous sites in orangutans and chimpanzees. The novel alleles found in humans expand the repertoire of A/B alleles to A_(0-4)B_1 and A_(1-3)B_2. The A_nB_2 series are abundant in Caucasians but are absent in blacks and Asians. Conversely, the A_0B_1 allele is common in blacks but is not found in more than 1,700 Caucasian chromosomes. The data are compatible with a model in which recombination is more frequent than polymerase slippage at this locus. In orangutans, the RY(i) is present, but the sequence is markedly different. An A/B-type of pattern was discerned in which B differs from A by an additional six (AT) dinucleotides at the 3' end. In chimpanzees, the size of the RY(i) locus was greatly expanded, and the sequence showed a novel pattern of hypervariability in which there are many tandem repeats of the form (GT)_n(AC)_o(AT)_p(GT)_q(AT)_s, where n, o, p, q, and s are different integers. The sequences of the factor IX intron 1 cryptic RY(i) in three primates provide perspective on the range of possible patterns of polymorphism. Analysis of the patterns suggests how the RY(i) can be conserved during evolution, while the precise sequence varies. 25 refs., 5 figs., 3 tabs.
GENETIC DIVERSITY OF TYPHA LATIFOLIA (TYPHACEAE) AND THE IMPACT OF POLLUTANTS EXAMINED WITH TANDEM-REPETITIVE DNA PROBES

EPA Science Inventory

Genetic diversity at variable-number-tandem-repeat (VNTR) loci was examined in the common cattail, Typha latifolia (Typhaceae), using three synthetic DNA probes composed of tandemly repeated "core" sequences (GACA, GATA, and GCAC). The principal objectives of this investigation w...
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
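A scan for di- to penta-nucleotide repeats of the kind described above can be sketched with a regular expression. The sketch finds only perfect tandem repeats and is not a reimplementation of SPUTNIK, which scores imperfect repeats as well; the function name, the minimum of three copies, and the test sequence are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=5, min_copies=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    Units shorter than `min_unit` or longer than `max_unit` are ignored, and a
    unit that is itself a repeat of a shorter motif (e.g. "ATAT") is skipped so
    the same tract is not also reported under a longer unit length.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A k-mer followed by at least (min_copies - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if any(
                unit == unit[:d] * (k // d)
                for d in range(1, k)
                if k % d == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence with one dinucleotide and one trinucleotide tract.
example = "GGCACACACATTTGCAGCAGCAGCAGTT"
print(find_ssrs(example))  # [(2, 'CA', 4), (13, 'GCA', 4)]
```

Running the same function separately over coding and non-coding intervals would give the per-region counts that the comparison in the abstract relies on.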
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
Submegabase Clusters of Unstable Tandem Repeats Unique to the Tla Region of Mouse T Haplotypes

PubMed Central

Uehara, H.; Ebersole, T.; Bennett, D.; Artzt, K.

1990-01-01

We describe here the identification and genomic organization of mouse t haplotype-specific elements (TSEs) 7.8 and 5.8 kb in length. The TSEs exist as submegabase-long clusters of tandem repeats localized in the Tla region of the major histocompatibility complex of all t haplotype chromosomes examined. In contrast, no such clusters were detected among 12 inbred strains of Mus musculus and other Mus species; thus, clusters of TSEs represent the first absolutely qualitative difference between t haplotypes and wild-type chromosomes. Pulsed field gel electrophoresis shows that the number of clusters, and the number of repeats in each cluster are extremely variable. Dramatic quantitative differences of TSEs uniquely distinguish every independent t haplotype from any other. The complete nucleotide sequence of one 7.8-kb TSE reveals significant homology to the ETn (a major transcript in the early embryo of the mouse), and some homologies to intracisternal A-particles and the mammary tumor virus env gene. Apart from the diagnostic relevance to t haplotypes, evolutionary and functional significances are discussed with respect to chromosome structure and genetic recombination. PMID:2076812
Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

PubMed

Waye, J S; Willard, H F

1986-09-01

The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.
A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
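The idea of iteratively collapsing repeats of increasing length can be illustrated with a short sketch. It makes a single ascending pass over unit lengths and reduces every run of adjacent identical units to one copy; the function name, the `max_unit` parameter, and the use of regular expressions are assumptions for illustration and do not reproduce the authors' compression scheme.

```python
import re


def collapse_tandem_repeats(seq, max_unit=4):
    """Lossy compression: replace each run of adjacent identical units with a
    single copy, for unit lengths 1, 2, ..., max_unit in that order."""
    for k in range(1, max_unit + 1):
        # A unit of exactly k characters followed by one or more copies of itself.
        pattern = re.compile(r"(.{%d})\1+" % k)
        previous = None
        # Repeat until this unit length yields no further collapses.
        while previous != seq:
            previous = seq
            seq = pattern.sub(r"\1", seq)
    return seq


# Hypothetical sequence containing repeats with unit lengths 1, 2 and 4.
original = "AAACGCGCGTTACGTACGT"
print(collapse_tandem_repeats(original, max_unit=1))  # ACGCGCGTACGTACGT
print(collapse_tandem_repeats(original))              # ACGT
```

Because copy numbers are discarded, two sequences that differ only in how many times a unit is repeated collapse to the same string, which is what lets a conventional alignment algorithm be applied to the compressed sequences.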
Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species.

PubMed

Kuipers, A G J; Kamstra, S A; de Jeu, M J; Visser, R G F

2002-01-01

Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragments with sizes varying from 68-127 bp, and constituted a larger HinfI repeat of approximately 400 bp. Southern hybridization showed a similar molecular organization of the tandem repeats in each of the Brazilian Alstroemeria species tested. None of the repeats hybridized with DNA from Chilean Alstroemeria species, which indicates that they are specific for the Brazilian species. In-situ localization studies revealed the tandem repeats to be localized in clusters on the chromosomes of A. inodora and A. psittacina: distal hybridization sites were found on chromosome arms 2PS, 6PL, 7PS, 7PL and 8PL, interstitial sites on chromosome arms 2PL, 3PL, 4PL and 5PL. The applicability of the tandem repeats for cytogenetic analysis of interspecific hybrids and their role in heterochromatin organization are discussed.

Comparative and functional characterization of intragenic tandem repeats in 10 Aspergillus genomes.

PubMed

Gibbons, John G; Rokas, Antonis

2009-03-01

Intragenic tandem repeats (ITRs) are consecutive repeats of three or more nucleotides found in coding regions. ITRs are the underlying cause of several human genetic diseases and have been associated with phenotypic variation, including pathogenesis, in several clades of the tree of life. We have examined the evolution and functional role of ITRs in 10 genomes spanning the fungal genus Aspergillus, a clade of relevance to medicine, agriculture, and industry. We identified several hundred ITRs in each of the species examined. ITR content varied extensively between species, with an average 79% of ITRs unique to a given species. For the fraction of conserved ITR regions, sequence comparisons within species and between close relatives revealed that they were highly variable. ITR-containing proteins were evolutionarily less conserved, compositionally distinct, and overrepresented for domains associated with cell-surface localization and function relative to the rest of the proteome. Furthermore, ITRs were preferentially found in proteins involved in transcription, cellular communication, and cell-type differentiation but were underrepresented in proteins involved in metabolism and energy. Importantly, although ITRs were evolutionarily labile, their functional associations appeared to be remarkably conserved across eukaryotes. Fungal ITRs likely participate in a variety of developmental processes and cell-surface-associated functions, suggesting that their contribution to fungal lifestyle and evolution may be more general than previously assumed.
Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.

PubMed

Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y

2000-01-01

Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.
MULTIPLE-LOCUS VARIABLE-NUMBER TANDEM REPEAT ANALYSIS OF BRUCELLA ISOLATES FROM THAILAND.

PubMed

Kumkrong, Khurawan; Chankate, Phanita; Tonyoung, Wittawat; Intarapuk, Apiradee; Kerdsin, Anusak; Kalambaheti, Thareerat

2017-01-01

Brucellosis-induced abortion in farm animals can result in significant economic loss. Brucellosis can be transmitted to humans during slaughter of infected animals or via consumption of contaminated food products. Strain identification of Brucella isolates can reveal the route of transmission. Brucella strains were isolated from vaginal swabs of farm animals, from cow milk and from human blood cultures. Multiplex PCR was used to identify Brucella species, and owing to high DNA homology among Brucella isolates, multiple-locus variable-number tandem repeat analysis (MLVA) based on the number of tandem repeats at 16 different genomic loci was used for strain identification. Multiplex PCR categorized the isolates into B. abortus (n = 7), B. melitensis (n = 37), B. suis (n = 3), and 5 of unknown Brucella spp. MLVA-16 clustering analysis differentiated the strains into various genotypes, with Brucella isolates from the same geographic region being closely related, and revealed that the Thai isolates were phylogenetically distinct from those in other countries, including within the Southeast Asian region. Thus, MLVA-16 typing has utility in epidemiological studies.
A novel typing method for Listeria monocytogenes using high-resolution melting analysis (HRMA) of tandem repeat regions.

PubMed

Ohshima, Chihiro; Takahashi, Hajime; Iwakawa, Ai; Kuda, Takashi; Kimura, Bon

2017-07-17

Listeria monocytogenes, which causes the food poisoning known as listeriosis, infects humans and animals. Widely distributed in the environment, this bacterium is known to contaminate food products after being transmitted to factories via raw materials. To minimize the contamination of products by food pathogens, it is critical to identify and eliminate factory entry routes and pathways for the causative bacteria. High resolution melting analysis (HRMA) is a method that takes advantage of differences in DNA sequences and PCR product lengths that are reflected by the dissociation temperature. Through our research, we have developed a multiple locus variable-number tandem repeat analysis (MLVA) using HRMA as a simple and rapid method to differentiate L. monocytogenes isolates. In evaluating the method, we compared the ability of MLVA-HRMA, MLVA using capillary electrophoresis, and multilocus sequence typing (MLST) to discriminate between strains. The MLVA-HRMA method displayed greater discriminatory ability than MLST and MLVA using capillary electrophoresis, suggesting that the variation in the number of repeat units, along with mutations within the DNA sequence, was accurately reflected by the melting curve of HRMA. Rather than relying on DNA sequence analysis or high-resolution electrophoresis, the MLVA-HRMA method follows the same process as PCR up to the analysis step, which suggests that it combines speed and simplicity. The results of the MLVA-HRMA method can be shared between different laboratories. There are high expectations that this method will be adopted for regular inspections at food processing facilities in the near future. Copyright © 2017. Published by Elsevier B.V.
Protein Sequencing with Tandem Mass Spectrometry

NASA Astrophysics Data System (ADS)

Ziady, Assem G.; Kinter, Michael

The recent introduction of electrospray ionization techniques that are suitable for peptides and whole proteins has allowed for the design of mass spectrometric protocols that provide accurate sequence information for proteins. The advantages gained by these approaches over traditional Edman Degradation sequencing include faster analysis and femtomole, sometimes attomole, sensitivity. The ability to efficiently identify proteins has allowed investigators to conduct studies on their differential expression or modification in response to various treatments or disease states. In this chapter, we discuss the use of electrospray tandem mass spectrometry, a technique whereby protein-derived peptides are subjected to fragmentation in the gas phase, revealing sequence information for the protein. This powerful technique has been instrumental for the study of proteins and markers associated with various disorders, including heart disease, cancer, and cystic fibrosis. We use the study of protein expression in cystic fibrosis as an example.
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae).

PubMed

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-07-01

The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5.8S-25S rRNA genes, which enable the unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae)

PubMed Central

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-01-01

Background and Aims The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. Methods A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100–500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Key Results Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S–5·8S–25S rRNA genes which enable the unequivocal chromosome discrimination in both jute species. Conclusions The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species. PMID:23666888
Effective application of multiple locus variable number of tandem repeats analysis to tracing Staphylococcus aureus in food-processing environment.

PubMed

Rešková, Z; Koreňová, J; Kuchta, T

2014-04-01

A total of 256 isolates of Staphylococcus aureus were isolated from 98 samples (34 swabs and 64 food samples) obtained from small or medium meat- and cheese-processing plants in Slovakia. The strains were genotypically characterized by multiple locus variable number of tandem repeats analysis (MLVA), involving multiplex polymerase chain reaction (PCR) with subsequent separation of the amplified DNA fragments by an automated flow-through gel electrophoresis. With the panel of isolates, MLVA produced 31 profile types, which was a sufficient discrimination to facilitate the description of spatial and temporal aspects of contamination. Further data on MLVA discrimination were obtained by typing a subpanel of strains by multiple locus sequence typing (MLST). MLVA coupled to automated electrophoresis proved to be an effective, comparatively fast and inexpensive method for tracing S. aureus contamination of food-processing factories. Subspecies genotyping of microbial contaminants in food-processing factories may facilitate identification of spatial and temporal aspects of the contamination. This may help to properly manage the process hygiene. With S. aureus, multiple locus variable number of tandem repeats analysis (MLVA) proved to be an effective method for the purpose, being sufficiently discriminative, yet comparatively fast and inexpensive. The application of automated flow-through gel electrophoresis to separation of DNA fragments produced by multiplex PCR helped to improve the accuracy and speed of the method. © 2013 The Society for Applied Microbiology.
Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism

PubMed Central

Gur-Arie, Riva; Cohen, Cyril J.; Eitan, Yuval; Shelef, Leora; Hallerman, Eric M.; Kashi, Yechezkel

2000-01-01

Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.[The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF209020–209030 and AF209508–209518.] PMID:10645951
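A genome-wide SSR screen of the kind described in this record can be illustrated with a short Python sketch. This is not the authors' program: the function name `find_ssrs`, the minimum copy number of 3, and the restriction to perfect (uninterrupted) repeats are all assumptions made for illustration. Only the motif range of 1 to 6 nucleotides comes from the abstract.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_copies is a hypothetical threshold; the abstract does not state one.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of length `unit` followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" reported as "A").
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)

if __name__ == "__main__":
    for start, motif, copies in find_ssrs("GGATATATATCCAAAAAG"):
        print(start, motif, copies)
```

Real screens usually also classify each tract by genomic context (ORF versus noncoding), which would require annotation coordinates that this sketch does not model.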
Filipino DNA variation at 12 X-chromosome short tandem repeat markers.

PubMed

Salvador, Jazelyn M; Apaga, Dame Loveliness T; Delfin, Frederick C; Calacal, Gayvelline C; Dennis, Sheila Estacio; De Ungria, Maria Corazon A

2018-06-08

Demands for solving complex kinship scenarios where only distant relatives are available for testing have risen in the past years. In these instances, other genetic markers such as X-chromosome short tandem repeat (X-STR) markers are employed to supplement autosomal and Y-chromosomal STR DNA typing. However, prior to use, the degree of STR polymorphism in the population requires evaluation through generation of an allele or haplotype frequency population database. This population database is also used for statistical evaluation of DNA typing results. Here, we report X-STR data from 143 unrelated Filipino male individuals who were genotyped via conventional polymerase chain reaction-capillary electrophoresis (PCR-CE) using the 12 X-STR loci included in the Investigator® Argus X-12 kit (Qiagen) and via massively parallel sequencing (MPS) of seven X-STR loci included in the ForenSeq™ DNA Signature Prep kit of the MiSeq® FGx™ Forensic Genomics System (Illumina). Allele calls between PCR-CE and MPS systems were consistent (100% concordance) across seven overlapping X-STRs. Allele and haplotype frequencies and other parameters of forensic interest were calculated based on length (PCR-CE, 12 X-STRs) and sequence (MPS, seven X-STRs) variations observed in the population. Results of our study indicate that the 12 X-STRs in the PCR-CE system are highly informative for the Filipino population. MPS of seven X-STR loci identified 73 X-STR alleles compared with 55 X-STR alleles that were identified solely by length via PCR-CE. Of the 73 sequence-based alleles observed, six alleles have not been reported in the literature. The population data presented here may serve as a reference Philippine frequency database of X-STRs for forensic casework applications. Copyright © 2018 Elsevier B.V. All rights reserved.
Tandem repeated application of organic solvents and sodium lauryl sulphate enhances cumulative skin irritation.

PubMed

Schliemann, Sibylle; Schmidt, Christina; Elsner, Peter

2014-01-01

The objective of our study was to investigate the tandem irritation potential of two organic solvents with concurrent exposure to the hydrophilic detergent irritant sodium lauryl sulphate (SLS). A tandem repeated irritation test was performed with two undiluted organic solvents, cumene (C) and octane (O), with either alternating application with SLS 0.5% or twice daily application of each irritant alone in 27 volunteers on the skin of the back. The cumulative irritation induced over 4 days was quantified using visual scoring and non-invasive bioengineering measurements (skin colour reflectance, skin hydration and transepidermal water loss). Repeated application of C/SLS and O/SLS induced more decline of stratum corneum hydration and higher degrees of clinical irritation and erythema compared to each irritant alone. Our results demonstrate a further example of additive harmful skin effects induced by particular skin irritants and indicate that exposure to organic solvents together with detergents may increase the risk of acquiring occupational contact dermatitis. © 2014 S. Karger AG, Basel.
Multi-locus variable number tandem repeat analysis for Escherichia coli causing extraintestinal infections.

PubMed

Manges, Amee R; Tellis, Patricia A; Vincent, Caroline; Lifeso, Kimberley; Geneau, Geneviève; Reid-Smith, Richard J; Boerlin, Patrick

2009-11-01

Discriminatory genotyping methods for the analysis of Escherichia coli other than O157:H7 are necessary for public health-related activities. A new multi-locus variable number tandem repeat analysis protocol is presented; this method achieves an index of discrimination of 99.5% and is reproducible and valid when tested on a collection of 836 diverse E. coli.
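The abstract reports an index of discrimination but does not say how it was computed. A common choice for typing methods is the Hunter-Gaston index (Simpson's index of diversity), D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of isolates and n_j the number assigned to type j. The sketch below assumes that definition; the function name and the toy type labels are hypothetical.

```python
from collections import Counter

def discrimination_index(types):
    """Hunter-Gaston discriminatory index for a list of type labels.

    Assumption: this is the index meant by the abstract, which does not say.
    """
    n = len(types)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(types)
    # Number of ordered isolate pairs that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))

if __name__ == "__main__":
    # Toy data: four isolates, two of which share a profile.
    print(round(discrimination_index(["A", "A", "B", "C"]), 4))
```

The value is the probability that two isolates drawn at random without replacement fall into different types, so 1.0 means every isolate has a unique profile.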
Expanded complexity of unstable repeat diseases

PubMed Central

Polak, Urszula; McIvor, Elizabeth; Dent, Sharon Y.R.; Wells, Robert D.; Napierala, Marek

2015-01-01

Unstable Repeat Diseases (URDs) share a common mutational phenomenon of changes in the copy number of short, tandemly repeated DNA sequences. More than 20 human neurological diseases are caused by instability, predominantly expansion, of microsatellite sequences. Changes in the repeat size initiate a cascade of pathological processes, frequently characteristic of a unique disease or a small subgroup of the URDs. Understanding of both the mechanism of repeat instability and molecular consequences of the repeat expansions is critical to developing successful therapies for these diseases. Recent technological breakthroughs in whole genome, transcriptome and proteome analyses will almost certainly lead to new discoveries regarding the mechanisms of repeat instability, the pathogenesis of URDs, and will facilitate development of novel therapeutic approaches. The aim of this review is to give a general overview of unstable repeats diseases, highlight the complexities of these diseases, and feature the emerging discoveries in the field. PMID:23233240
Identification of presumed ancestral DNA sequences of phaseolin in Phaseolus vulgaris.

PubMed Central

Kami, J; Velásquez, V B; Debouck, D G; Gepts, P

1995-01-01

Common bean (Phaseolus vulgaris) consists of two major geographic gene pools, one distributed in Mexico, Central America, and Colombia and the other in the southern Andes (southern Peru, Bolivia, and Argentina). Amplification and sequencing of members of the multigene family coding for phaseolin, the major seed storage protein of the common bean, provide evidence for accumulation of tandem direct repeats in both introns and exons during evolution of the multigene family in this species. The presumed ancestral phaseolin sequences, without tandem repeats, were found in recently discovered but nearly extinct wild common bean populations of Ecuador and northern Peru that are intermediate between the two major gene pools of the species based on geographical and molecular arguments. Our results illustrate the usefulness of tandem direct repeats in establishing the polarity of DNA sequence divergence and therefore in proposing phylogenies. Images Fig. 1 Fig. 3 PMID:7862642
Use of Variable-Number Tandem Repeats To Examine Genetic Diversity of Neisseria meningitidis

PubMed Central

Yazdankhah, Siamak P.; Lindstedt, Bjørn-Arne; Caugant, Dominique A.

2005-01-01

Repetitive DNA motifs with potential variable-number tandem repeats (VNTR) were identified in the genome of Neisseria meningitidis and used to develop a typing method. A total of 146 meningococcal isolates recovered from carriers and patients were studied. These included 82 of the 107 N. meningitidis isolates previously used in the development of multilocus sequence typing (MLST), 45 isolates recovered from different counties in Norway in connection with local outbreaks, and 19 serogroup W135 isolates of sequence type 11 (ST-11), which were recovered in several parts of the world. The latter group comprised isolates related to the Hajj outbreak of 2000 and isolates recovered from outbreaks in Burkina Faso in 2001 and 2002. All isolates had been characterized previously by MLST or multilocus enzyme electrophoresis (MLEE). VNTR analysis showed that meningococcal isolates with similar MLST or MLEE types recovered from epidemiologically linked cases in a defined geographical area often presented similar VNTR patterns while isolates of the same MLST or MLEE types without an obvious epidemiological link showed variable VNTR patterns. Thus, VNTR analysis may be used for fine typing of meningococcal isolates after MLST or MLEE typing. The method might be especially valuable for differentiating among ST-11 strains, as shown by the VNTR analyses of serogroup W135 ST-11 meningococcal isolates recovered since the mid-1990s. PMID:15814988
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools.

PubMed

Guizard, Sébastien; Piégu, Benoît; Arensburger, Peter; Guillou, Florian; Bigot, Yves

2016-08-19

The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12 %) than the sequenced genomes of many vertebrate species (30-55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSRs and TEs. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.
Analysis of short tandem repeat polymorphisms using infrared fluorescence with M13 tailed primers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oetting, W.S.; Wiesner, G.; Laken, S.

The use of short tandem repeat polymorphisms (STRPs) as markers for linkage analysis is becoming increasingly important due to their large numbers in the human genome and their high degree of polymorphism. Fluorescence-based detection of the STRP pattern using the LI-COR model 4000S automated DNA sequencer eliminates the need for radioactivity and produces a digitized image that can be used for the analysis of the polymorphisms. In an effort to reduce the cost of STRP analysis, we have synthesized primers with a 19 bp extension complementary to the sequence of the M13 primer on the 5′ end of one of the two primers used in the amplification of the STRP instead of using primers with direct conjugation of the infrared fluorescent dye. Up to 5 primer pairs can be multiplexed together with the M13 primer-dye conjugate as the sole primer conjugated to the fluorescent dye. Comparisons between primers that have been directly conjugated to the fluor and those having the M13 sequence extension show no difference in the ability to determine the STRP pattern. At present, the entire Weber 4A set of STRP markers is available with the M13 5′ extension. We are currently using this technique for linkage analysis of familial breast cancer and asthma. The combination of STRP analysis using fluorescence detection will allow this technique to be fully automated for allele scoring and linkage analysis.
Transcription of highly repetitive tandemly organized DNA in amphibians and birds: A historical overview and modern concepts.

PubMed

Trofimova, Irina; Krasikova, Alla

2016-12-01

Tandemly organized highly repetitive DNA sequences are crucial structural and functional elements of eukaryotic genomes. Despite extensive evidence, satellite DNA remains an enigmatic part of the eukaryotic genome, with the biological role and significance of tandem repeat transcripts remaining rather obscure. Data on tandem repeat transcription in amphibian and avian model organisms are fragmentary despite their genomes being thoroughly characterized. This review systematically covers historical and modern data on transcription of amphibian and avian satellite DNA in somatic cells and during meiosis, when chromosomes acquire a special lampbrush form. We highlight how transcription of tandemly repetitive DNA sequences is organized in the interphase nucleus and on lampbrush chromosomes. We offer LTR-activation hypotheses of widespread satellite DNA transcription initiation during oogenesis. Recent explanations are provided for the significance of high-yield production of non-coding RNA derived from tandemly organized highly repetitive DNA. In many cases the data on the transcription of satellite DNA can be extrapolated from lampbrush chromosomes to interphase chromosomes. Lampbrush chromosomes, combined with novel technical approaches such as superresolution imaging, chromosome microdissection followed by high-throughput sequencing, and dynamic observation in life-like conditions, provide amazing opportunities for investigating the mechanisms of satellite DNA transcription.
Transcription of highly repetitive tandemly organized DNA in amphibians and birds: A historical overview and modern concepts

PubMed Central

Krasikova, Alla

2016-01-01

ABSTRACT Tandemly organized highly repetitive DNA sequences are crucial structural and functional elements of eukaryotic genomes. Despite extensive evidence, satellite DNA remains an enigmatic part of the eukaryotic genome, with the biological role and significance of tandem repeat transcripts remaining rather obscure. Data on tandem repeat transcription in amphibian and avian model organisms are fragmentary despite their genomes being thoroughly characterized. This review systematically covers historical and modern data on transcription of amphibian and avian satellite DNA in somatic cells and during meiosis, when chromosomes acquire a special lampbrush form. We highlight how transcription of tandemly repetitive DNA sequences is organized in the interphase nucleus and on lampbrush chromosomes. We offer LTR-activation hypotheses of widespread satellite DNA transcription initiation during oogenesis. Recent explanations are provided for the significance of high-yield production of non-coding RNA derived from tandemly organized highly repetitive DNA. In many cases the data on the transcription of satellite DNA can be extrapolated from lampbrush chromosomes to interphase chromosomes. Lampbrush chromosomes, combined with novel technical approaches such as superresolution imaging, chromosome microdissection followed by high-throughput sequencing, and dynamic observation in life-like conditions, provide amazing opportunities for investigating the mechanisms of satellite DNA transcription. PMID:27763817
Short-Sequence DNA Repeats in Prokaryotic Genomes

PubMed Central

van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri

1998-01-01

Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10⁻⁴ per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442

Intratypic variability of a tandem repeat locus within the DNA polymerase gene of human herpes simplex virus type 2.

PubMed

Sun, Yongjiang; Chan, Roy Kum Wah; Tan, Suat Hoon

2004-01-01

In this study, the intratypic variability of a tandem repeat locus within the DNA polymerase (pol) gene of human herpes simplex virus type 2 (HSV2) was uncovered. The locus contained variable numbers of tandem dodecanucleotide (5'-GAC GAG GAC GGG-3') repetitive units. Our results showed that approximately 95% of analyzed HSV2 clinical isolates and the current GenBank HSV2 strains contained two copies of the repetitive units. From genital herpes specimens, three new HSV2 strains, which respectively contained 1, 3, and 4 copies of the repetitive units, were identified. This variable number of tandem repeat (VNTR) locus is absent in HSV1, and thus it also contributes to the intertypic variability of HSV1 and HSV2. The intratypic variability of the locus may be useful for HSV2 strain genotyping and this application is discussed.
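Counting the copies of a known repeat unit in a sequenced amplicon, as this record describes for the dodecanucleotide GACGAGGACGGG, can be sketched in a few lines of Python. This is an illustrative helper, not the authors' method: the function name `max_tandem_copies` and the flanking bases in the example are hypothetical, and only exact, uninterrupted copies are counted.

```python
import re

# Repeat unit quoted in the abstract (5'-GAC GAG GAC GGG-3'), spaces removed.
UNIT = "GACGAGGACGGG"

def max_tandem_copies(seq, unit=UNIT):
    """Return the largest number of back-to-back exact copies of `unit` in `seq`."""
    seq = re.sub(r"\s+", "", seq).upper()
    # Each match is one uninterrupted run of the unit.
    runs = re.findall("(?:%s)+" % re.escape(unit), seq)
    return max((len(run) // len(unit) for run in runs), default=0)

if __name__ == "__main__":
    # Made-up flanking sequence around three copies of the unit.
    example = "TTT" + UNIT * 3 + "ACGT"
    print(max_tandem_copies(example))
```

A sequence without the unit returns 0, which would correspond to the abstract's observation that the locus is absent in HSV1.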
Altered Methylation in Tandem Repeat Element and Elemental Component Levels in Inhalable Air Particles

PubMed Central

Hou, Lifang; Zhang, Xiao; Zheng, Yinan; Wang, Sheng; Dou, Chang; Guo, Liqiong; Byun, Hyang-Min; Motta, Valeria; McCracken, John; Díaz, Anaité; Kang, Choong-Min; Koutrakis, Petros; Bertazzi, Pier Alberto; Li, Jingyun; Schwartz, Joel; Baccarelli, Andrea A.

2014-01-01

Exposure to particulate matter (PM) has been associated with lung cancer risk in epidemiology investigations. Elemental components of PM have been suggested to have critical roles in PM toxicity, but the molecular mechanisms underlying their association with cancer risks remain poorly understood. DNA methylation has emerged as a promising biomarker for environmental-related diseases, including lung cancer. In this study, we evaluated the effects of PM elemental components on methylation of three tandem repeats in a highly-exposed population in Beijing, China. The Beijing Truck Driver Air Pollution Study was conducted shortly before the 2008 Beijing Olympic Games (June 15-July 27, 2008) and included 60 truck drivers and 60 office workers. On two days separated by 1-2 weeks, we measured blood DNA methylation of SATα, NBL2, D4Z4, and personal exposure to eight elemental components in PM2.5, including aluminum (Al), silicon (Si), sulfur (S), potassium (K), calcium (Ca), titanium (Ti), iron (Fe), and zinc (Zn). We estimated the associations of individual elemental component with each tandem repeat methylation in generalized estimating equations (GEE) models adjusted for PM2.5 mass and other covariates. Out of the eight examined elements, NBL2 methylation was positively associated with concentrations of Si (0.121, 95%CI: 0.030; 0.212, FDR=0.047) and Ca (0.065, 95%CI: 0.014; 0.115, FDR=0.047) in truck drivers. In office workers, SATα methylation was positively associated with concentrations of S (0.115, 95%CI: 0.034; 0.196, FDR=0.042). PM-associated differences in blood tandem-repeat methylation may help detect biological effects of the exposure and identify individuals who may eventually experience higher lung cancer risk. PMID:24273195
Variable-number tandem repeats as molecular markers for biotypes of Pasteuria ramosa in Daphnia spp.

PubMed

Mouton, Laurence; Nong, Guang; Preston, James F; Ebert, Dieter

2007-06-01

Variable-number tandem repeats (VNTRs) have been identified in populations of Pasteuria ramosa, a castrating endobacterium of Daphnia species. The allelic polymorphisms at 14 loci in laboratory and geographically diverse soil samples showed that VNTRs may serve as biomarkers for the genetic characterization of P. ramosa isolates.
Identification and characterization of short tandem repeats in the Tibetan macaque genome based on resequencing data.

PubMed

Liu, San-Xu; Hou, Wei; Zhang, Xue-Yan; Peng, Chang-Jun; Yue, Bi-Song; Fan, Zhen-Xin; Li, Jing

2018-07-18

The Tibetan macaque, which is endemic to China, is currently listed as a Near Endangered primate species by the International Union for Conservation of Nature (IUCN). Short tandem repeats (STRs) refer to repetitive elements of genome sequence that range in length from 1-6 bp. They are found in many organisms and are widely applied in population genetic studies. To clarify the distribution characteristics of genome-wide STRs and understand their variation among Tibetan macaques, we conducted a genome-wide survey of STRs with next-generation sequencing of five macaque samples. A total of 1 077 790 perfect STRs were mined from our assembly, with an N50 of 4 966 bp. Mono-nucleotide repeats were the most abundant, followed by tetra- and di-nucleotide repeats. Analysis of GC content and repeats showed consistent results with other macaques. Furthermore, using STR analysis software (lobSTR), we found that the proportion of base pair deletions in the STRs was greater than that of insertions in the five Tibetan macaque individuals (P<0.05, t-test). We also found a greater number of homozygous STRs than heterozygous STRs (P<0.05, t-test), with the Emei and Jianyang Tibetan macaques showing more heterozygous loci than Huangshan Tibetan macaques. The proportion of insertions and mean variation of alleles in the Emei and Jianyang individuals were slightly higher than those in the Huangshan individuals, thus revealing differences in STR allele size between the two populations. The polymorphic STR loci identified based on the reference genome showed good amplification efficiency and could be used to study population genetics in Tibetan macaques. The neighbor-joining tree classified the five macaques into two different branches according to their geographical origin, indicating high genetic differentiation between the Huangshan and Sichuan populations. We elucidated the distribution characteristics of STRs in the Tibetan macaque genome and provided an effective method for
Isolation of centromeric-tandem repetitive DNA sequences by chromatin affinity purification using a HaloTag7-fused centromere-specific histone H3 in tobacco.

PubMed

Nagaki, Kiyotaka; Shibata, Fukashi; Kanatani, Asaka; Kashihara, Kazunari; Murata, Minoru

2012-04-01

The centromere is a multi-functional complex comprising centromeric DNA and a number of proteins. To isolate unidentified centromeric DNA sequences, centromere-specific histone H3 variants (CENH3) and chromatin immunoprecipitation (ChIP) have been utilized in some plant species. However, anti-CENH3 antibody for ChIP must be raised in each species because of its species specificity. Production of the antibodies is time-consuming and costly, and it is not easy to produce ChIP-grade antibodies. In this study, we applied a HaloTag7-based chromatin affinity purification system to isolate centromeric DNA sequences in tobacco. This system required no specific antibody, and made it possible to apply a highly stringent wash to remove contaminated DNA. As a result, we succeeded in isolating five tandem repetitive DNA sequences in addition to the centromeric retrotransposons that were previously identified by ChIP. Three of the tandem repeats were centromere-specific sequences located on different chromosomes. These results confirm the validity of the HaloTag7-based chromatin affinity purification system as an alternative method to ChIP for isolating unknown centromeric DNA sequences. The discovery of more than two chromosome-specific centromeric DNA sequences indicates the mosaic structure of tobacco centromeres. © Springer-Verlag 2011
The evolution of filamin: a protein domain repeat perspective.

PubMed

Light, Sara; Sagit, Rauan; Ithychanda, Sujay S; Qin, Jun; Elofsson, Arne

2012-09-01

Particularly in higher eukaryotes, some protein domains are found in tandem repeats, performing broad functions often related to cellular organization. For instance, the eukaryotic protein filamin interacts with many proteins and is crucial for the cytoskeleton. The functional properties of long repeat domains are governed by the specific properties of each individual domain as well as by the repeat copy number. To provide better understanding of the evolutionary and functional history of repeating domains, we investigated the mode of evolution of the filamin domain in some detail. Among the domains that are common in long repeat proteins, sushi and spectrin domains evolve primarily through cassette tandem duplications while scavenger and immunoglobulin repeats appear to evolve through clustered tandem duplications. Additionally, immunoglobulin and filamin repeats exhibit a unique pattern where every other domain shows high sequence similarity. This pattern may be the result of tandem duplications, serve to avert aggregation between adjacent domains or it is the result of functional constraints. In filamin, our studies confirm the presence of interspersed integrin binding domains in vertebrates, while invertebrates exhibit more varied patterns, including more clustered integrin binding domains. The most notable case is leech filamin, which contains a 20 repeat expansion and exhibits unique dimerization topology. Clearly, invertebrate filamins are varied and contain examples of similar adjacent integrin-binding domains. Given that invertebrate integrin shows more similarity to the weaker filamin binder, integrin β3, it is possible that the distance between integrin-binding domains is not as crucial for invertebrate filamins as for vertebrates. Copyright © 2012 Elsevier Inc. All rights reserved.
[Reticulate evolution of parthenogenetic species of the Lacertidae rock lizards: inheritance of CLsat tandem repeats and anonymous RAPD markers].

PubMed

Chobanu, D; Rudykh, I A; Riabinina, N L; Grechko, V V; Kramerov, D A; Darevskiĭ, I S

2002-01-01

The genetic relatedness of several bisexual and of four unisexual "Lacerta saxicola complex" lizards was studied, using monomer sequences of the complex-specific CLsat tandem repeats and anonymous RAPD markers. Genomes of parthenospecies were shown to include different satellite monomers. The structure of each such monomer is specific for a certain pair of bisexual species. This fact might be interpreted in favor of co-dominant inheritance of these markers during hybridogenesis between bisexual species. This idea is supported by the results obtained with RAPD markers; i.e., unisexual species genomes include only the loci characteristic of certain bisexual species. At the same time, in neither case do the parthenospecies possess specific, autoapomorphic loci that are absent from the corresponding bisexual species.
Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

NASA Astrophysics Data System (ADS)

Tepp, G.; Haney, M. M.; Wech, A.

2017-12-01

A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. During the Redoubt
Diversity and evolution of centromere repeats in the maize genome.

PubMed

Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

2015-03-01

Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
The Repeat Sequences and Elevated Substitution Rates of the Chloroplast accD Gene in Cupressophytes

PubMed Central

Li, Jia; Su, Yingjuan; Wang, Ting

2018-01-01

The plastid accD gene encodes a subunit of the acetyl-CoA carboxylase (ACCase) enzyme. The accD gene is thought to have expanded in length in Cryptomeria japonica, Taiwania cryptomerioides, Cephalotaxus, Taxus chinensis, and Podocarpus lambertii, and the main reason for this phenomenon was the presence of tandemly repeated sequences. However, it is still unknown whether the accD gene has expanded in other cupressophytes. Here, in order to investigate how widespread this phenomenon is, accD and its surrounding regions were sequenced and analyzed in 18 cupressophytes. Together with 39 GenBank sequences, our taxon sampling covered all the extant gymnosperm orders. The repetitive elements and substitution rates of accD among 57 gymnosperm species were analyzed. The results show that: (1) the reading frame length of accD in the 18 cupressophyte species has also expanded; (2) many repetitive elements were identified in the accD gene of cupressophyte lineages; (3) the synonymous and non-synonymous substitution rates of accD were accelerated in cupressophytes; and (4) accD was located at rearrangement endpoints. These results suggest that repetitive elements may mediate chloroplast genome rearrangement and accelerate substitution rates. PMID:29731764
Exploring the repeat protein universe through computational protein design

DOE PAGES

Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...

2015-12-16

A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
GENETIC VARIATION IN RED RASPBERRIES (RUBUS IDAEUS L.; ROSACEAE) FROM SITES DIFFERING IN ORGANIC POLLUTANTS COMPARED WITH SYNTHETIC TANDEM REPEAT DNA PROBES

EPA Science Inventory

Two synthetic tandem repetitive DNA probes were used to compare genetic variation at variable-number-tandem-repeat (VNTR) loci among Rubus idaeus L. var. strigosus (Michx.) Maxim. (Rosaceae) individuals sampled at eight sites contaminated by pollutants (N = 39) and eight adjacent...
De novo transcriptome sequencing reveals a considerable bias in the incidence of simple sequence repeats towards the downstream of 'Pre-miRNAs' of black pepper.

PubMed

Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

2013-01-01

Next generation sequencing is of particular advantage for species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to the lack of sufficient sequence information and the cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of 'tandem repeats' in miRNAs.
New Multilocus Variable-Number Tandem-Repeat Analysis Tool for Surveillance and Local Epidemiology of Bacterial Leaf Blight and Bacterial Leaf Streak of Rice Caused by Xanthomonas oryzae

PubMed Central

Poulin, L.; Grygiel, P.; Magne, M.; Rodriguez-R, L. M.; Forero Serna, N.; Zhao, S.; El Rafii, M.; Dao, S.; Tekete, C.; Wonni, I.; Koita, O.; Pruvost, O.; Verdier, V.; Vernière, C.

2014-01-01

Multilocus variable-number tandem-repeat analysis (MLVA) is efficient for routine typing and for investigating the genetic structures of natural microbial populations. Two distinct pathovars of Xanthomonas oryzae can cause significant crop losses in tropical and temperate rice-growing countries. Bacterial leaf streak is caused by X. oryzae pv. oryzicola, and bacterial leaf blight is caused by X. oryzae pv. oryzae. For the latter, two genetic lineages have been described in the literature. We developed a universal MLVA typing tool both for the identification of the three X. oryzae genetic lineages and for epidemiological analyses. Sixteen candidate variable-number tandem-repeat (VNTR) loci were selected according to their presence and polymorphism in 10 draft or complete genome sequences of the three X. oryzae lineages and by VNTR sequencing of a subset of loci of interest in 20 strains per lineage. The MLVA-16 scheme was then applied to 338 strains of X. oryzae representing different pathovars and geographical locations. Linkage disequilibrium between MLVA loci was calculated by index association on different scales, and the 16 loci showed linear Mantel correlation with MLSA data on 56 X. oryzae strains, suggesting that they provide a good phylogenetic signal. Furthermore, analyses of sets of strains for different lineages indicated the possibility of using the scheme for deeper epidemiological investigation on small spatial scales. PMID:25398857
Sequencing of Oligourea Foldamers by Tandem Mass Spectrometry

NASA Astrophysics Data System (ADS)

Bathany, Katell; Owens, Neil W.; Guichard, Gilles; Schmitter, Jean-Marie

2013-03-01

This study is focused on sequence analysis of peptidomimetic helical oligoureas by means of tandem mass spectrometry, to build a basis for de novo sequencing for future high-throughput combinatorial library screening of oligourea foldamers. After the evaluation of MS/MS spectra obtained for model compounds with either MALDI or ESI sources, we found that the MALDI-TOF-TOF instrument gave more satisfactory results. MS/MS spectra of oligoureas generated by decay of singly charged precursor ions show major ion series corresponding to fragmentation across both CO-NH and N'H-CO urea bonds. Oligourea backbones fragment to produce a pattern of a, x, b, and y type fragment ions. De novo decoding of spectral information is facilitated by the occurrence of low mass reporter ions, representative of constitutive monomers, in an analogous manner to the use of immonium ions for peptide sequencing.
The evolution of filamin – A protein domain repeat perspective

PubMed Central

Light, Sara; Sagit, Rauan; Ithychanda, Sujay S.; Qin, Jun; Elofsson, Arne

2013-01-01

Particularly in higher eukaryotes, some protein domains are found in tandem repeats, performing broad functions often related to cellular organization. For instance, the eukaryotic protein filamin interacts with many proteins and is crucial for the cytoskeleton. The functional properties of long repeat domains are governed by the specific properties of each individual domain as well as by the repeat copy number. To provide better understanding of the evolutionary and functional history of repeating domains, we investigated the mode of evolution of the filamin domain in some detail. Among the domains that are common in long repeat proteins, sushi and spectrin domains evolve primarily through cassette tandem duplications while scavenger and immunoglobulin repeats appear to evolve through clustered tandem duplications. Additionally, immunoglobulin and filamin repeats exhibit a unique pattern where every other domain shows high sequence similarity. This pattern may result from tandem duplications, may serve to avert aggregation between adjacent domains, or may reflect functional constraints. In filamin, our studies confirm the presence of interspersed integrin-binding domains in vertebrates, while invertebrates exhibit more varied patterns, including more clustered integrin-binding domains. The most notable case is leech filamin, which contains a 20-repeat expansion and exhibits unique dimerization topology. Clearly, invertebrate filamins are varied and contain examples of similar adjacent integrin-binding domains. Given that invertebrate integrin shows more similarity to the weaker filamin binder, integrin β3, it is possible that the distance between integrin-binding domains is not as crucial for invertebrate filamins as for vertebrates. PMID:22414427
Thermal denaturation of the BRCT tandem repeat region of human tumour suppressor gene product BRCA1.

PubMed

Pyrpassopoulos, Serapion; Ladopoulou, Angela; Vlassi, Metaxia; Papanikolau, Yannis; Vorgias, Constantinos E; Yannoukakos, Drakoulis; Nounesis, George

2005-04-01

Reduced stability of the tandem BRCT domains of human BReast CAncer 1 (BRCA1) due to missense mutations may be critical for loss of function in DNA repair and damage-induced checkpoint control. In the present thermal denaturation study of the BRCA1 BRCT region, high-precision differential scanning calorimetry (DSC) and circular dichroism (CD) spectroscopy provide evidence for the existence of a denatured state that is structurally very similar to the native state. Consistency between theoretical structure-based estimates of the enthalpy (ΔH) and heat capacity change (ΔCp) and the calorimetric results is obtained when considering partial thermal unfolding contained in the region of the conserved hydrophobic pocket formed at the interface of the two BRCT repeats. The structural integrity of this region has been shown to be crucial for the interaction of BRCA1 with phosphorylated peptides. In addition, cancer-causing missense mutations located at the inter-BRCT-repeat interface have been linked to the destabilization of the tandem BRCT structure.
Intergenic Variable-Number Tandem-Repeat Polymorphism Upstream of rocA Alters Toxin Production and Enhances Virulence in Streptococcus pyogenes.

PubMed

Zhu, Luchang; Olsen, Randall J; Horstmann, Nicola; Shelburne, Samuel A; Fan, Jia; Hu, Ye; Musser, James M

2016-07-01

Variable-number tandem-repeat (VNTR) polymorphisms are ubiquitous in bacteria. However, only a small fraction of them has been functionally studied. Here, we report an intergenic VNTR polymorphism that confers an altered level of toxin production and increased virulence in Streptococcus pyogenes. The nature of the polymorphism is a one-unit deletion in a three-tandem-repeat locus upstream of the rocA gene encoding a sensor kinase. S. pyogenes strains with this type of polymorphism cause human infection and produce significantly larger amounts of the secreted cytotoxins S. pyogenes NADase (SPN) and streptolysin O (SLO). Using isogenic mutant strains, we demonstrate that deleting one or more units of the tandem repeats abolished RocA production, reduced CovR phosphorylation, derepressed multiple CovR-regulated virulence factors (such as SPN and SLO), and increased virulence in a mouse model of necrotizing fasciitis. The phenotypic effect of the VNTR polymorphism was nearly the same as that of inactivating the rocA gene. In summary, we identified and characterized an intergenic VNTR polymorphism in S. pyogenes that affects toxin production and virulence. These new findings enhance understanding of rocA biology and the function of VNTR polymorphisms in S. pyogenes. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Concerted evolution of the tandem array encoding primate U2 snRNA occurs in situ, without changing the cytological context of the RNU2 locus.

PubMed Central

Pavelitz, T; Rusché, L; Matera, A G; Scharf, J M; Weiner, A M

1995-01-01

In primates, the tandemly repeated genes encoding U2 small nuclear RNA evolve concertedly, i.e. the sequence of the U2 repeat unit is essentially homogeneous within each species but differs somewhat between species. Using chromosome painting and the NGFR gene as an outside marker, we show that the U2 tandem array (RNU2) has remained at the same chromosomal locus (equivalent to human 17q21) through multiple speciation events over >35 million years leading to the Old World monkey and hominoid lineages. The data suggest that the U2 tandem repeat, once established in the primate lineage, contained sequence elements favoring perpetuation and concerted evolution of the array in situ, despite a pericentric inversion in chimpanzee, a reciprocal translocation in gorilla and a paracentric inversion in orang utan. Comparison of the 11 kb U2 repeat unit found in baboon and other Old World monkeys with the 6 kb U2 repeat unit in humans and other hominids revealed that an ancestral U2 repeat unit was expanded by insertion of a 5 kb retrovirus bearing 1 kb long terminal repeats (LTRs). Subsequent excision of the provirus by homologous recombination between the LTRs generated a 6 kb U2 repeat unit containing a solo LTR. Remarkably, both junctions between the human U2 tandem array and flanking chromosomal DNA at 17q21 fall within the solo LTR sequence, suggesting a role for the LTR in the origin or maintenance of the primate U2 array. PMID:7828589
Development of a massively parallel sequencing assay for investigating sequence polymorphisms of 15 short tandem repeats in a Chinese Northern Han population.

PubMed

Zhang, Qing-Xia; Yang, Meng; Pan, Ya-Jiao; Zhao, Jing; Qu, Bao-Wang; Cheng, Feng; Yang, Ya-Ran; Jiao, Zhang-Ping; Liu, Li; Yan, Jiang-Wei

2018-05-17

Massively parallel sequencing (MPS) has been used in forensic genetics in recent years owing to several advantages, e.g. MPS can provide precise descriptions of the repeat allele structure and variation in the repeat-flanking regions, increasing the discriminating power among loci and individuals. However, it cannot be fully utilized unless sufficient population data are available for all loci. Thus, there is a pressing need to perform population studies providing a basis for the introduction of MPS into forensic practice. Here, we constructed a multiplex PCR system with fusion primers for one-directional PCR for MPS of 15 commonly used forensic autosomal STRs and amelogenin. Samples from 554 unrelated Chinese Northern Han individuals were typed using this MPS assay. In total, 313 alleles obtained by MPS for all 15 STRs were observed, and the corresponding allele frequencies ranged between 0.0009 and 0.5162. Of all 15 loci, the number of alleles identified for 12 loci increased compared to capillary electrophoresis approaches, and for the following six loci more than double the number of alleles was found: D2S1338, D5S818, D21S11, D13S317, vWA, and D3S1358. Forensic parameters were calculated based on length and sequence-based alleles. D21S11 showed the highest heterozygosity (0.8791), discrimination power (0.9865), and paternity exclusion probability in trios (0.7529). The cumulative match probability for MPS was approximately 2.3157 × 10^-20. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
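The forensic parameters named in this abstract can be illustrated with a minimal Python sketch. It assumes Hardy-Weinberg proportions and computes everything from allele frequencies alone; the abstract does not say how the authors computed their values, so the function names and the toy frequencies below are illustrative assumptions, not the study's method or data.

```python
from itertools import combinations
from math import prod


def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def match_probability(freqs):
    """Chance that two unrelated people share a genotype at one locus.

    Assumes Hardy-Weinberg genotype frequencies (an assumption of this
    sketch): p_i^2 for homozygotes and 2*p_i*p_j for heterozygotes.
    """
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def power_of_discrimination(freqs):
    """Discrimination power is the complement of the match probability."""
    return 1.0 - match_probability(freqs)


def cumulative_match_probability(loci):
    """Product of per-locus match probabilities, assuming independent loci."""
    return prod(match_probability(freqs) for freqs in loci)


if __name__ == "__main__":
    toy_locus = [0.5, 0.3, 0.2]  # hypothetical allele frequencies
    print(expected_heterozygosity(toy_locus))
    print(power_of_discrimination(toy_locus))
    print(cumulative_match_probability([toy_locus, toy_locus]))
```

Because the cumulative value is a product over loci, splitting a length-based allele into several sequence-based alleles lowers the per-locus match probability and therefore the cumulative one.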

Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

PubMed

Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

1992-02-01

The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 × 10^5 copies) and yields prominent ethidium bromide stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.
Copy Number Heterogeneity, Large Origin Tandem Repeats, and Interspecies Recombination in Human Herpesvirus 6A (HHV-6A) and HHV-6B Reference Strains

PubMed Central

Roychoudhury, Pavitra; Makhsous, Negar; Hanson, Derek; Chase, Jill; Krueger, Gerhard; Xie, Hong; Huang, Meei-Li; Saunders, Lindsay; Ablashi, Dharam; Koelle, David M.; Cook, Linda; Jerome, Keith R.

2018-01-01

ABSTRACT Quantitative PCR is a diagnostic pillar for clinical virology testing, and reference materials are necessary for accurate, comparable quantitation between clinical laboratories. Accurate quantitation of human herpesvirus 6A/B (HHV-6A/B) is important for detection of viral reactivation and inherited chromosomally integrated HHV-6A/B in immunocompromised patients. Reference materials in clinical virology commonly consist of laboratory-adapted viral strains that may be affected by the culture process. We performed next-generation sequencing to make relative copy number measurements at single nucleotide resolution of eight candidate HHV-6A and seven HHV-6B reference strains and DNA materials from the HHV-6 Foundation and Advanced Biotechnologies Inc. Eleven of 17 (65%) HHV-6A/B candidate reference materials showed multiple copies of the origin of replication upstream of the U41 gene by next-generation sequencing. These large tandem repeats arose independently in culture-adapted HHV-6A and HHV-6B strains, measuring 1,254 bp and 983 bp, respectively. The average copy number measured was between 5 and 10 times the number of copies of the rest of the genome. We also report the first interspecies recombinant HHV-6A/B strain with a HHV-6A backbone and a >5.5-kb region from HHV-6B, from U41 to U43, that covered the origin tandem repeat. Specific HHV-6A reference strains demonstrated duplication of regions at U1/U2, U87, and U89, as well as deletion in the U12-to-U24 region and the U94/U95 genes. HHV-6A/B strains derived from cord blood mononuclear cells from different laboratories on different continents with fewer passages revealed no copy number differences throughout the viral genome. These data indicate that large origin tandem duplications are an adaptation of both HHV-6A and HHV-6B in culture and show interspecies recombination is possible within the Betaherpesvirinae. IMPORTANCE Anything in science that needs to be quantitated requires a standard unit of
A Legionella pneumophila collagen-like protein encoded by a gene with a variable number of tandem repeats is involved in the adherence and invasion of host cells.

PubMed

Vandersmissen, Liesbeth; De Buck, Emmy; Saels, Veerle; Coil, David A; Anné, Jozef

2010-05-01

Legionella pneumophila is a Gram-negative, facultative intracellular pathogen and the causative agent of Legionnaires' disease, a severe pneumonia in humans. Analysis of the Legionella sequenced genomes revealed a gene with a variable number of tandem repeats (VNTRs), whose number varies between strains. We examined the strain distribution of this gene among a collection of 108 clinical, environmental and hot spring serotype I strains. Twelve variants were identified, but no correlation was observed between the number of repeat units and clinical and environmental strains. The encoded protein contains the C-terminal consensus motif of outer membrane proteins and has a large region of collagen-like repeats that is encoded by the VNTR region. We have therefore annotated this protein Lcl for Legionella collagen-like protein. Lcl was shown to contribute to the adherence and invasion of host cells and it was demonstrated that the number of repeat units present in lcl had an influence on these adhesion characteristics.
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units in genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those with motifs of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii showed no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and with the increase of repeat number in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had a general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
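As an illustration of how mono-, di- and trinucleotide SSRs of the kind surveyed above can be located, the following Python sketch scans a sequence with a back-referencing regular expression. The function name, the minimum copy number of 4, and the example sequence are assumptions made for this sketch; the abstract does not state the detection criteria the authors used.

```python
import re


def find_ssrs(seq, min_repeats=4, max_motif=3):
    """Return (start, motif, copies) for perfect tandem runs of short motifs.

    The lazy quantifier {1,max_motif}? tries the shortest motif first, so a
    run such as AAAAAA is reported as (A)6 rather than (AA)3.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence containing one (AT)4 dinucleotide repeat.
    print(find_ssrs("GGATATATATCC"))
```

Only perfect repeats are found; interrupted or compound SSRs would need a more tolerant search than this regular expression provides.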
Inter-laboratory comparison of multi-locus variable-number tandem repeat analysis (MLVA) for verocytotoxin-producing Escherichia coli O157 to facilitate data sharing.

PubMed

Holmes, A; Perry, N; Willshaw, G; Hanson, M; Allison, L

2015-01-01

Multi-locus variable number tandem repeat analysis (MLVA) is used in clinical and reference laboratories for subtyping verocytotoxin-producing Escherichia coli O157 (VTEC O157). However, as yet there is no common allelic or profile nomenclature to enable laboratories to easily compare data. In this study, we carried out an inter-laboratory comparison of an eight-loci MLVA scheme using a set of 67 isolates of VTEC O157. We found all but two isolates were identical in profile in the two laboratories, and repeat units were homogeneous in size but some were incomplete. A subset of the isolates (n = 17) were sequenced to determine the actual copy number of representative alleles, thereby enabling alleles to be named according to international consensus guidelines. This work has enabled us to realize the potential of MLVA as a portable, highly discriminatory and convenient subtyping method.
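The inter-laboratory comparison described above amounts to checking, isolate by isolate, whether two laboratories report the same repeat copy number at every locus. A minimal Python sketch of that check follows; the data layout (a dict mapping isolate names to tuples of copy numbers in a fixed locus order), the function names and the example isolates are assumptions for illustration, not the study's data or software.

```python
def differing_loci(profile_a, profile_b):
    """Indices of loci at which two MLVA profiles disagree."""
    return [
        index
        for index, (a, b) in enumerate(zip(profile_a, profile_b))
        if a != b
    ]


def discordant_isolates(lab_a, lab_b):
    """Map each isolate typed by both labs to its differing locus indices.

    Isolates with identical profiles in both laboratories are omitted.
    """
    shared = sorted(lab_a.keys() & lab_b.keys())
    return {
        isolate: differing_loci(lab_a[isolate], lab_b[isolate])
        for isolate in shared
        if lab_a[isolate] != lab_b[isolate]
    }


if __name__ == "__main__":
    # Hypothetical three-locus profiles (copy numbers per locus).
    lab_a = {"iso1": (10, 7, 5), "iso2": (9, 7, 5)}
    lab_b = {"iso1": (10, 7, 5), "iso2": (9, 8, 5)}
    print(discordant_isolates(lab_a, lab_b))
```

A comparison like this only works once both laboratories name alleles the same way, which is the nomenclature problem the abstract addresses.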
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would help establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. It would also shed light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid substitutions while detecting the distant repeats. Given the biological importance of distant repeats, the proposed algorithm will be of use to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
ScanRanker: Quality Assessment of Tandem Mass Spectra via Sequence Tagging

PubMed Central

Ma, Ze-Qiang; Chambers, Matthew C.; Ham, Amy-Joan L.; Cheek, Kristin L.; Whitwell, Corbin W.; Aerni, Hans-Rudolf; Schilling, Birgit; Miller, Aaron W.; Caprioli, Richard M.; Tabb, David L.

2011-01-01

In shotgun proteomics, protein identification by tandem mass spectrometry relies on bioinformatics tools. Despite recent improvements in identification algorithms, a significant number of high quality spectra remain unidentified for various reasons. Here we present ScanRanker, an open-source tool that evaluates the quality of tandem mass spectra via sequence tagging with reliable performance in data from different instruments. The superior performance of ScanRanker enables it not only to find unassigned high quality spectra that evade identification through database search, but also to select spectra for de novo sequencing and cross-linking analysis. In addition, we demonstrate that the distribution of ScanRanker scores predicts the richness of identifiable spectra among multiple LC-MS/MS runs in an experiment, and ScanRanker scores assist the process of peptide assignment validation to increase confident spectrum identifications. The source code and executable versions of ScanRanker are available from http://fenchurch.mc.vanderbilt.edu. PMID:21520941
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Diversity and Plasticity of the Intracellular Plant Pathogen and Insect Symbiont “Candidatus Liberibacter asiaticus” as Revealed by Hypervariable Prophage Genes with Intragenic Tandem Repeats

PubMed Central

Zhou, Lijuan; Powell, Charles A.; Hoffman, Michele T.; Li, Wenbin; Fan, Guocheng; Liu, Bo; Lin, Hong; Duan, Yongping

2011-01-01

“Candidatus Liberibacter asiaticus” is a psyllid-transmitted, phloem-limited alphaproteobacterium and the most prevalent species of “Ca. Liberibacter” associated with a devastating worldwide citrus disease known as huanglongbing (HLB). Two related and hypervariable genes (hyvI and hyvII) were identified in the prophage regions of the Psy62 “Ca. Liberibacter asiaticus” genome. Sequence analyses of the hyvI and hyvII genes in 35 “Ca. Liberibacter asiaticus” DNA isolates collected globally revealed that the hyvI gene contains up to 12 nearly identical tandem repeats (NITRs, 132 bp) and 4 partial repeats, while hyvII contains up to 2 NITRs and 4 partial repeats and shares homology with hyvI. Frequent deletions or insertions of these repeats within the hyvI and hyvII genes were observed, none of which disrupted the open reading frames. Sequence conservation within the individual repeats but an extensive variation in repeat numbers, rearrangement, and the sequences flanking the repeat region indicate the diversity and plasticity of “Ca. Liberibacter asiaticus” bacterial populations in the world. These differences were found not only in samples of distinct geographical origins but also in samples from a single origin and even from a single “Ca. Liberibacter asiaticus”-infected sample. This is the first evidence of different “Ca. Liberibacter asiaticus” populations coexisting in a single HLB-affected sample. The Florida “Ca. Liberibacter asiaticus” isolates contain both hyvI and hyvII, while all other global “Ca. Liberibacter asiaticus” isolates contain either one or the other. Interclade assignments of the putative HyvI and HyvII proteins from Florida isolates with other global isolates in phylogenetic trees imply multiple “Ca. Liberibacter asiaticus” populations in the world and a multisource introduction of the “Ca. Liberibacter asiaticus” bacterium into Florida. PMID:21784907
Concerted evolution of the tandemly repeated genes encoding primate U2 small nuclear RNA (the RNU2 locus) does not prevent rapid diversification of the (CT)n·(GA)n microsatellite embedded within the U2 repeat unit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liao, D.; Weiner, A.M.

1995-12-10

The RNU2 locus encoding human U2 small nuclear RNA (snRNA) is organized as a nearly perfect tandem array containing 5 to 22 copies of a 5.8-kb repeat unit. Just downstream of the U2 snRNA gene in each 5.8-kb repeat unit lies a large (CT)n·(GA)n dinucleotide repeat (n ≈ 70). This form of genomic organization, in which one repeat is embedded within another, provides an unusual opportunity to study the balance of forces maintaining the homogeneity of both kinds of repeats. Using a combination of field inversion gel electrophoresis and polymerase chain reaction, we have been able to study the CT microsatellites within individual U2 tandem arrays. We find that the CT microsatellites within an RNU2 allele exhibit significant length polymorphism, despite the remarkable homogeneity of the surrounding U2 repeat units. Length polymorphism is due primarily to loss or gain of CT dinucleotide repeats, but other types of deletions, insertions, and substitutions are also frequent. Polymorphism is greatly reduced in regions where pure (CT)n tracts are interrupted by occasional G residues, suggesting that irregularities stabilize both the length and the sequence of the dinucleotide repeat. We further show that the RNU2 loci of other catarrhine primates (gorilla, chimpanzee, orangutan, and baboon) contain orthologous CT microsatellites; these also exhibit length polymorphism, but are highly divergent from each other. Thus, although the CT microsatellite is evolving far more rapidly than the rest of the U2 repeat unit, it has persisted through multiple speciation events spanning >35 Myr. The persistence of the CT microsatellite, despite polymorphism and rapid evolution, suggests that it might play a functional role in concerted evolution of the RNU2 loci, perhaps as an initiation site for recombination and/or gene conversion. 70 refs., 5 figs.
ACCA phosphopeptide recognition by the BRCT repeats of BRCA1.

PubMed

Ray, Hind; Moreau, Karen; Dizin, Eva; Callebaut, Isabelle; Venezia, Nicole Dalla

2006-06-16

The tumour suppressor gene BRCA1 encodes a 220 kDa protein that participates in multiple cellular processes. The BRCA1 protein contains a tandem of two BRCT repeats at its carboxy-terminal region. The majority of disease-associated BRCA1 mutations affect this region, which gives the BRCT repeats a central role in the BRCA1 tumour suppressor function. The BRCT repeats have been shown to mediate phospho-dependent protein-protein interactions. They recognize phosphorylated peptides using a recognition groove that spans both BRCT repeats. We previously identified an interaction between the tandem of BRCA1 BRCT repeats and ACCA, which was disrupted by germ line BRCA1 mutations that affect the BRCT repeats. We recently showed that BRCA1 modulates ACCA activity through its phospho-dependent binding to ACCA. To delineate the region of ACCA that is crucial for the regulation of its activity by BRCA1, we searched for potential phosphorylation sites in the ACCA sequence that might be recognized by the BRCA1 BRCT repeats. Using sequence analysis and structure modelling, we proposed the Ser1263 residue as the most favourable candidate among six residues for recognition by the BRCA1 BRCT repeats. Using experimental approaches, such as GST pull-down assay with Bosc cells, we clearly showed that phosphorylation of only Ser1263 was essential for the interaction of ACCA with the BRCT repeats. We finally demonstrated by immunoprecipitation of ACCA in cells that the whole BRCA1 protein interacts with ACCA when phosphorylated on Ser1263.
Mimosoid legume plastome evolution: IR expansion, tandem repeat expansions, and accelerated rate of evolution in clpP.

PubMed

Dugas, Diana V; Hernandez, David; Koenen, Erik J M; Schwarz, Erika; Straub, Shannon; Hughes, Colin E; Jansen, Robert K; Nageswara-Rao, Madhugiri; Staats, Martijn; Trujillo, Joshua T; Hajrah, Nahid H; Alharbi, Njud S; Al-Malki, Abdulrahman L; Sabir, Jamal S M; Bailey, C Donovan

2015-11-23

The Leguminosae has emerged as a model for studying angiosperm plastome evolution because of its striking diversity of structural rearrangements and sequence variation. However, most of what is known about legume plastomes comes from few genera representing a subset of lineages in subfamily Papilionoideae. We investigate plastome evolution in subfamily Mimosoideae based on two newly sequenced plastomes (Inga and Leucaena) and two recently published plastomes (Acacia and Prosopis), and discuss the results in the context of other legume and rosid plastid genomes. Mimosoid plastomes have a typical angiosperm gene content and general organization as well as a generally slow rate of protein coding gene evolution, but they are the largest known among legumes. The increased length results from tandem repeat expansions and an unusual 13 kb IR-SSC boundary shift in Acacia and Inga. Mimosoid plastomes harbor additional interesting features, including loss of clpP intron1 in Inga, accelerated rates of evolution in clpP for Acacia and Inga, and dN/dS ratios consistent with neutral and positive selection for several genes. These new plastomes and results provide important resources for legume comparative genomics, plant breeding, and plastid genetic engineering, while shedding further light on the complexity of plastome evolution in legumes and angiosperms.
Selection and Validation of a Multilocus Variable-Number Tandem-Repeat Analysis Panel for Typing Shigella spp.

PubMed Central

Gorgé, Olivier; Lopez, Stéphanie; Hilaire, Valérie; Lisanti, Olivier; Ramisse, Vincent; Vergnaud, Gilles

2008-01-01

The Shigella genus has historically been separated into four species, based on biochemical assays. The classification within each species relies on serotyping. Recently, genome sequencing and DNA assays, in particular the multilocus sequence typing (MLST) approach, greatly improved the current knowledge of the origin and phylogenetic evolution of Shigella spp. The Shigella and Escherichia genera are now considered to belong to a unique genomospecies. Multilocus variable-number tandem-repeat (VNTR) analysis (MLVA) provides valuable polymorphic markers for genotyping and performing phylogenetic analyses of highly homogeneous bacterial pathogens. Here, we assess the capability of MLVA for Shigella typing. Thirty-two potentially polymorphic VNTRs were selected by analyzing in silico five Shigella genomic sequences and subsequently evaluated. Eventually, a panel of 15 VNTRs was selected (i.e., MLVA15 analysis). MLVA15 analysis of 78 strains or genome sequences of Shigella spp. and 11 strains or genome sequences of Escherichia coli distinguished 83 genotypes. Shigella population cluster analysis gave consistent results compared to MLST. MLVA15 analysis showed capabilities for E. coli typing, providing classification among pathogenic and nonpathogenic E. coli strains included in the study. The resulting data can be queried on our genotyping webpage (http://mlva.u-psud.fr). The MLVA15 assay is rapid, highly discriminatory, and reproducible for Shigella and Escherichia strains, suggesting that it could significantly contribute to epidemiological trace-back analysis of Shigella infections and pathogenic Escherichia outbreaks. Typing was performed on strains obtained mostly from collections. Further studies should include strains of much more diverse origins, including all pathogenic E. coli types. PMID:18216214
Development of a Multiple-Locus Variable number of tandem repeat Analysis (MLVA) for Leptospira interrogans and its application to Leptospira interrogans serovar Australis isolates from Far North Queensland, Australia

PubMed Central

Slack, Andrew T; Dohnt, Michael F; Symonds, Meegan L; Smythe, Lee D

2005-01-01

Background Leptospirosis is a zoonotic disease caused by the genus, Leptospira. Leptospira interrogans is the most common genomospecies implicated in the disease. Epidemiological investigations are needed to distinguish outbreak situations or to trace reservoirs of the organisms. Current methodologies used for typing Leptospira have significant drawbacks. The development of an easy to perform yet high resolution method is needed for this organism. Methods In this study we have searched the available genomic sequence of L. interrogans serovar Copenhageni strain Fiocruz L1-130 for the presence of tandem repeats [1]. These repeats were evaluated against reference strains for diversity. Six loci were selected to create a Multiple Locus Variable Number of Tandem Repeats (VNTR) Analysis (MLVA) to explore the genetic diversity within L. interrogans serovar Australis clinical isolates from Far North Queensland. Results The 39 reference strains used for the development of the method displayed 39 distinct patterns. Diversity Indexes for the loci varied between 0.80 and 0.93 and the number of repeat units at each locus varied between less than one to 52 repeats. When the MLVA was applied to serovar Australis isolates three large clusters were distinguishable, each comprising various hosts including Rattus species, human and canines. Conclusion The MLVA described in this report, was easy to perform, analyse and was reproducible. The loci selected had high diversity allowing discrimination between serovars and also between strains within a serovar. This method provides a starting point on which improvements to the method and comparisons to other techniques can be made. PMID:15987533
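The abstract reports per-locus diversity indexes without naming the formula. A common choice in MLVA work is the Hunter-Gaston (Simpson's) index of diversity, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of strains carrying allele j and N is the total number of strains. The sketch below assumes that index was meant; the function name and the example alleles are hypothetical.

```python
from collections import Counter

def hunter_gaston_index(alleles):
    """Hunter-Gaston discriminatory index for one locus.

    `alleles` holds one allele call (e.g. repeat copy number) per strain.
    Returns 1.0 when every strain differs and 0.0 when all are identical.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(alleles)
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))

# Hypothetical allele calls for four strains at one locus.
d = hunter_gaston_index([1, 1, 2, 3])
```

The index is the probability that two strains drawn at random without replacement carry different alleles, which is why loci with many evenly distributed alleles score close to 1.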
Evolution of short inverted repeat in cupressophytes, transfer of accD to nucleus in Sciadopitys verticillata and phylogenetic position of Sciadopityaceae.

PubMed

Li, Jia; Gao, Lei; Chen, Shanshan; Tao, Ke; Su, Yingjuan; Wang, Ting

2016-02-11

Sciadopitys verticillata is an evergreen conifer and an economically valuable tree used in construction, and is the only member of the family Sciadopityaceae. Acquisition of the S. verticillata chloroplast (cp) genome will be useful for understanding the evolutionary mechanism of conifers and phylogenetic relationships among gymnosperms. In this study, we report the complete chloroplast genome of S. verticillata for the first time. The total genome is 138,284 bp in length, consisting of 118 unique genes. The S. verticillata cp genome has lost one copy of the canonical inverted repeats and shows a distinctive genomic structure compared with other cupressophytes. Fifty-three simple sequence repeat loci and 18 forward tandem repeats were identified in the S. verticillata cp genome. Based on the rearrangement of cupressophyte cp genomes, we propose one mechanism for the formation of inverted repeats: a tandem repeat occurred first, then rearrangement divided the tandem repeat into inverted repeats located at different regions. Phylogenetic estimates inferred from 59-gene sequences and cpDNA organizations both showed that S. verticillata was sister to the clade consisting of Cupressaceae, Taxaceae, and Cephalotaxaceae. Moreover, the accD gene was found to be lost in the S. verticillata cp genome, and a nuclear copy was identified from two transcriptome data sets.
Tandem repeat variation near the HIC1 (hypermethylated in cancer 1) promoter predicts outcome of oxaliplatin-based chemotherapy in patients with metastatic colorectal cancer.

PubMed

Okazaki, Satoshi; Schirripa, Marta; Loupakis, Fotios; Cao, Shu; Zhang, Wu; Yang, Dongyun; Ning, Yan; Berger, Martin D; Miyamoto, Yuji; Suenaga, Mitsukuni; Iqubal, Syma; Barzi, Afsaneh; Cremolini, Chiara; Falcone, Alfredo; Battaglin, Francesca; Salvatore, Lisa; Borelli, Beatrice; Helentjaris, Timothy G; Lenz, Heinz-Josef

2017-11-15

The hypermethylated in cancer 1/sirtuin 1 (HIC1/SIRT1) axis plays an important role in regulating the nucleotide excision repair pathway, which is the main oxaliplatin-induced damage-repair system. On the basis of prior evidence that the variable number of tandem repeat (VNTR) sequence located near the promoter lesion of HIC1 is associated with HIC1 gene expression, the authors tested the hypothesis that this VNTR is associated with clinical outcome in patients with metastatic colorectal cancer who receive oxaliplatin-based chemotherapy. Four independent cohorts were tested. Patients who received oxaliplatin-based chemotherapy served as the training cohort (n = 218), and those who received treatment without oxaliplatin served as the control cohort (n = 215). Two cohorts of patients who received oxaliplatin-based chemotherapy were used for validation studies (n = 176 and n = 73). The VNTR sequence near HIC1 was analyzed by polymerase chain reaction analysis and gel electrophoresis and was tested for associations with the response rate, progression-free survival, and overall survival. In the training cohort, patients who harbored at least 5 tandem repeats (TRs) in both alleles had a significantly shorter PFS compared with those who had fewer than 4 TRs in at least 1 allele (9.5 vs 11.6 months; hazard ratio, 1.93; P = .012), and these findings remained statistically significant after multivariate analysis (hazard ratio, 2.00; 95% confidence interval, 1.13-3.54; P = .018). This preliminary association was confirmed in the validation cohort, and patients who had at least 5 TRs in both alleles had a worse PFS compared with the other cohort (7.9 vs 9.8 months; hazard ratio, 1.85; P = .044). The current findings suggest that the VNTR sequence near HIC1 could be a predictive marker for oxaliplatin-based chemotherapy in patients with metastatic colorectal cancer. Cancer 2017;123:4506-14. © 2017 American Cancer Society.
Whole-genome sequencing reveals a coding non-pathogenic variant tagging a non-coding pathogenic hexanucleotide repeat expansion in C9orf72 as cause of amyotrophic lateral sclerosis.

PubMed

Herdewyn, Sarah; Zhao, Hui; Moisse, Matthieu; Race, Valérie; Matthijs, Gert; Reumers, Joke; Kusters, Benno; Schelhaas, Helenius J; van den Berg, Leonard H; Goris, An; Robberecht, Wim; Lambrechts, Diether; Van Damme, Philip

2012-06-01

Motor neuron degeneration in amyotrophic lateral sclerosis (ALS) has a familial cause in 10% of patients. Despite significant advances in the genetics of the disease, many families remain unexplained. We performed whole-genome sequencing in five family members from a pedigree with autosomal-dominant classical ALS. A family-based elimination approach was used to identify novel coding variants segregating with the disease. This list of variants was effectively shortened by genotyping these variants in 2 additional unaffected family members and 1500 unrelated population-specific controls. A novel rare coding variant in SPAG8 on chromosome 9p13.3 segregated with the disease and was not observed in controls. Mutations in SPAG8 were not encountered in 34 other unexplained ALS pedigrees, including 1 with linkage to chromosome 9p13.2-23.3. The shared haplotype containing the SPAG8 variant in this small pedigree was 22.7 Mb and overlapped with the core 9p21 linkage locus for ALS and frontotemporal dementia. Based on differences in coverage depth of known variable tandem repeat regions between affected and non-affected family members, the shared haplotype was found to contain an expanded hexanucleotide (GGGGCC)(n) repeat in C9orf72 in the affected members. Our results demonstrate that rare coding variants identified by whole-genome sequencing can tag a shared haplotype containing a non-coding pathogenic mutation and that changes in coverage depth can be used to reveal tandem repeat expansions. It also confirms (GGGGCC)n repeat expansions in C9orf72 as a cause of familial ALS.
Short intronic repeat sequences facilitate circular RNA production.

PubMed

Liang, Dongming; Wilusz, Jeremy E

2014-10-15

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼ 30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. © 2014 Liang and Wilusz; Published by Cold Spring Harbor Laboratory Press.
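The requirement that intronic repeats base-pair with one another means that a stretch in one flanking intron must be the reverse complement of a stretch in the other. A minimal Python sketch of that check follows; it is not the authors' method. It looks only for perfect complementarity over a fixed window `k`, works on DNA letters, and uses hypothetical function names and toy sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def inverted_repeat_pairs(upstream, downstream, k=30):
    """Return (i, j) pairs where upstream[i:i+k] can pair perfectly with
    downstream[j:j+k], i.e. the downstream window equals the reverse
    complement of the upstream window. Only the first j per i is reported.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    pairs = []
    for i in range(len(upstream) - k + 1):
        j = downstream.find(reverse_complement(upstream[i:i + k]))
        if j != -1:
            pairs.append((i, j))
    return pairs

# Toy introns sharing a 9-nt inverted repeat (much shorter than the
# 30- to 40-nt repeats discussed in the abstract).
up = "TTT" + "GATTACAGG" + "CC"
down = "AA" + reverse_complement("GATTACAGG") + "GG"
pairs = inverted_repeat_pairs(up, down, k=9)
```

As the abstract notes, complementarity alone does not decide the outcome, since some repeats that can pair do not support circularization, so a check like this can at most flag candidate intron pairs.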

Short intronic repeat sequences facilitate circular RNA production

PubMed Central

Liang, Dongming

2014-01-01

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217
Stress-induced rearrangement of Fusarium retrotransposon sequences.

PubMed

Anaya, N; Roncero, M I

1996-11-27

Rearrangement of the Fusarium oxysporum retrotransposon skippy was induced by growth in the presence of potassium chlorate. Three fungal strains, one sensitive to chlorate (Co60) and two resistant to chlorate and deficient for nitrate reductase (Co65 and Co94), were studied by Southern analysis of their genomic DNA. Polymorphism was detected in their hybridization banding pattern, relative to the wild type grown in the absence of chlorate, using various enzymes with or without restriction sites within the retrotransposon. Results were consistent with the assumption that three different events had occurred in strain Co60: genomic amplification of skippy yielding tandem arrays of the element, generation of new skippy sequences, and deletion of skippy sequences. Amplification of Co60 genomic DNA using the polymerase chain reaction and divergent primers derived from the retrotransposon generated a new band, corresponding to one long terminal repeat plus flanking sequences, that was not present in the wild-type strain. Molecular analysis of nitrate reductase-deficient mutants showed that generation and deletion of skippy sequences, but not genomic amplification in tandem repeats, had occurred in their genomes.
Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

PubMed

Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

2002-01-01

Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy were due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region are similar to those of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.
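Copy-number variation of a known repeat unit can be read directly from a sequence by counting consecutive copies of the motif. The sketch below shows the idea in Python with a hypothetical short motif standing in for the 41-bp unit, whose sequence the abstract does not give. It counts perfect copies only, which is an assumption, since real repeat arrays often contain variant copies.

```python
import re

def max_tandem_copies(seq, motif):
    """Largest number of back-to-back perfect copies of `motif` found in `seq`."""
    runs = re.findall(r"(?:%s)+" % re.escape(motif), seq)
    return max((len(run) // len(motif) for run in runs), default=0)

# Toy example: a run of four copies, a spacer, then a run of two copies.
toy_seq = "TT" + "ACG" * 4 + "T" + "ACG" * 2
copies = max_tandem_copies(toy_seq, "ACG")
```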
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
DNA Fingerprint Analysis of Three Short Tandem Repeat (STR) Loci for Biochemistry and Forensic Science Laboratory Courses

ERIC Educational Resources Information Center

McNamara-Schroeder, Kathleen; Olonan, Cheryl; Chu, Simon; Montoya, Maria C.; Alviri, Mahta; Ginty, Shannon; Love, John J.

2006-01-01

We have devised and implemented a DNA fingerprinting module for an upper division undergraduate laboratory based on the amplification and analysis of three of the 13 short tandem repeat loci that are required by the Federal Bureau of Investigation Combined DNA Index System (FBI CODIS) data base. Students first collect human epithelial (cheek)…
Identification, variation and transcription of pneumococcal repeat sequences

PubMed Central

2011-01-01

Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
TANDEM: matching proteins with tandem mass spectra.

PubMed

Craig, Robertson; Beavis, Ronald C

2004-06-12

Tandem mass spectra obtained from fragmenting peptide ions contain some peptide sequence specific information, but often there is not enough information to sequence the original peptide completely. Several proprietary software applications have been developed to attempt to match the spectra with a list of protein sequences that may contain the sequence of the peptide. The application TANDEM was written to provide the proteomics research community with a set of components that can be used to test new methods and algorithms for performing this type of sequence-to-data matching. The source code and binaries for this software are available at http://www.proteome.ca/opensource.html, for Windows, Linux and Macintosh OSX. The source code is made available under the Artistic License, from the authors.
Visualization of tandem repeat mutagenesis in Bacillus subtilis.

PubMed

Dormeyer, Miriam; Lentes, Sabine; Ballin, Patrick; Wilkens, Markus; Klumpp, Stefan; Kohlheyer, Dietrich; Stannek, Lorena; Grünberger, Alexander; Commichau, Fabian M

2018-03-01

Mutations are crucial for the emergence and evolution of proteins with novel functions, and thus for the diversity of life. Tandem repeats (TRs) are mutational hot spots that are present in the genomes of all organisms. Understanding the molecular mechanism underlying TR mutagenesis at the level of single cells requires the development of mutation reporter systems. Here, we present a mutation reporter system that is suitable to visualize mutagenesis of TRs occurring in single cells of the Gram-positive model bacterium Bacillus subtilis using microfluidic single-cell cultivation. The system allows measuring the elimination of TR units due to growth rate recovery. The cultivation of bacteria carrying the mutation reporter system in microfluidic chambers allowed us for the first time to visualize the emergence of a specific mutation at the level of single cells. The application of the mutation reporter system in combination with microfluidics might be helpful to elucidate the molecular mechanism underlying TR (in)stability in bacteria. Moreover, the mutation reporter system might be useful to assess whether mutations occur in response to nutrient starvation. Copyright © 2018 Elsevier B.V. All rights reserved.
The repeat organizer, a specialized insulator element within the intergenic spacer of the Xenopus rRNA genes.

PubMed Central

Robinett, C C; O'Connor, A; Dunaway, M

1997-01-01

We have identified a novel activity for the region of the intergenic spacer of the Xenopus laevis rRNA genes that contains the 35- and 100-bp repeats. We devised a new assay for this region by constructing DNA plasmids containing a tandem repeat of rRNA reporter genes that were separated by the 35- and 100-bp repeat region and a rRNA gene enhancer. When the 35- and 100-bp repeat region is present in its normal position and orientation at the 3' end of the rRNA reporter genes, the enhancer activates the adjacent downstream promoter but not the upstream rRNA promoter on the same plasmid. Because this element can restrict the range of an enhancer's activity in the context of tandem genes, we have named it the repeat organizer (RO). The ability to restrict enhancer action is a feature of insulator elements, but unlike previously described insulator elements the RO does not block enhancer action in a simple enhancer-blocking assay. Instead, the activity of the RO requires that it be in its normal position and orientation with respect to the other sequence elements of the rRNA genes. The enhancer-binding transcription factor xUBF also binds to the repetitive sequences of the RO in vitro, but these sequences do not activate transcription in vivo. We propose that the RO is a specialized insulator element that organizes the tandem array of rRNA genes into single-gene expression units by promoting activation of a promoter by its proximal enhancers. PMID:9111359
A complete mitochondrial genome sequence of Asian black bear Sichuan subspecies (Ursus thibetanus mupinensis)

PubMed Central

Hou, Wan-ru; Chen, Yu; Wu, Xia; Hu, Jin-chu; Peng, Zheng-song; Yang, Jung; Tang, Zong-xiang; Zhou, Cai-Quan; Li, Yu-ming; Yang, Shi-kui; Du, Yu-jie; Kong, Ling-lu; Ren, Zheng-long; Zhang, Huai-yu; Shuai, Su-rong

2007-01-01

We obtained the complete mitochondrial genome of U. thibetanus mupinensis by DNA sequencing based on the PCR fragments of 18 primers we designed. The results indicate that the mtDNA is 16 868 bp in size, encodes 13 protein genes, 22 tRNA genes, and 2 rRNA genes, with an overall H-strand base composition of 31.2% A, 25.4% C, 15.5% G and 27.9% T. The control region (CR), located between tRNA-Pro and tRNA-Phe, is 1422 bp in size, accounts for 8.43% of the whole genome, has a GC content of 51.9%, and contains one 6-bp tandem repeat and two 10-bp tandem repeats identified using the Tandem Repeats Finder. The U. thibetanus mupinensis mitochondrial genome shares high similarity with those of three other Ursidae: U. americanus (91.46%), U. arctos (89.25%) and U. maritimus (87.66%). PMID:17205108
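The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are illustrative.

```python
genome_bp = 16868          # reported mtDNA size
control_region_bp = 1422   # reported control region size

# Share of the genome occupied by the control region, in percent.
cr_percent = round(100 * control_region_bp / genome_bp, 2)

# Reported overall H-strand base composition, in percent.
base_percent = {"A": 31.2, "C": 25.4, "G": 15.5, "T": 27.9}
total_percent = round(sum(base_percent.values()), 1)
genome_gc_percent = round(base_percent["G"] + base_percent["C"], 1)
```

The control region share comes out at the stated 8.43%, the four base percentages sum to 100%, and the implied whole-genome GC content (40.9%) is lower than the 51.9% reported for the control region.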
The production and characterization of novel heavy-chain antibodies against the tandem repeat region of MUC1 mucin.

PubMed

Rahbarizadeh, Fatemeh; Rasaee, Mohammad J; Forouzandeh, Mehdi; Allameh, Abdolamir; Sarrami, Ramin; Nasiry, Habib; Sadeghizadeh, Majid

2005-01-01

Camelidae are known to produce immunoglobulins (Igs) devoid of light chains and constant heavy-chain domains (CH1). Antigen-specific fragments of these heavy-chain IgGs (VHH) are of great interest in biotechnology applications. This paper describes the first example of successfully raised heavy-chain antibodies in Camelus dromedarius (single-humped camel) and Camelus bactrianus (two-humped camel) against a MUC1-related peptide that is found to be an important epitope expressed in cancerous tissue. Camels were immunized against a synthetic peptide corresponding to the tandem repeat region of MUC1 mucin and a cancerous tissue preparation obtained from patients suffering from breast carcinoma. Three IgG subclasses with different binding properties to protein A and G were purified by affinity chromatography. Both conventional and heavy-chain IgG antibodies were produced in response to the MUC1-related peptide. The elicited antibodies could react specifically with the tandem repeat region of MUC1 mucin in an enzyme-linked immunosorbent assay (ELISA). Anti-peptide antibodies were purified after passing antiserum over two affinity chromatography columns. Using ELISA, immunocytochemistry and Western blotting, the interaction of purified antibodies with different antigens was evaluated. The antibodies bound selectively to the following antigens: MUC1 peptide (tandem repeat region), human milk fat globule membrane (HMFG), deglycosylated human milk fat globule membrane (D-HMFG), homogenized cancerous breast tissue and a native MUC1 purified from ascitic fluid. Ka values of specific polyclonal antipeptide antibodies were estimated in C. dromedarius and C. bactrianus as 7 × 10^10 M^-1 and 1.4 × 10^10 M^-1, respectively.
Immunogenicity of a recombinant fusion protein of tandem repeat epitopes of foot-and-mouth disease virus type Asia 1 for guinea pigs.

PubMed

Zhang, Q; Yang, Y Q; Zhang, Z Y; Li, L; Yan, W Y; Jiang, W J; Xin, A G; Lei, C X; Zheng, Z X

2002-01-01

In this study, the sequences of capsid protein VP1 regions of YNAs1.1 and YNAs1.2 isolates of foot-and-mouth disease virus (FMDV) were analyzed and a peptide containing amino acids (aa) 133-158 of VP1 and aa 20-34 of VP4 of FMDV type Asia 1 was assumed to contain B and T cell epitopes, because it is hypervariable and includes a cell attachment site RGD located in the G-H loop. The DNA fragments encoding aa 133-158 of VP1 and aa 20-34 of VP4 of FMDV type Asia 1 were chemically synthesized and ligated into a tandem repeat of aa (133-158)-(20-34)-(133-158). In order to enhance its immunogenicity, the tandem repeat was inserted downstream of the beta-galactosidase gene in the expression vector pWR590. This insertion yielded a recombinant expression vector pAS1 encoding the fusion protein. The latter reacted with sera from FMDV type Asia 1-infected animals in vitro and elicited high levels of neutralizing antibodies in guinea pigs. The T cell proliferation in immunized animals increased following stimulation with the fusion protein. It is reported for the first time that a recombinant fusion protein vaccine was produced using B and T cell epitopes of FMDV type Asia 1 and that this fusion protein was immunogenic. The fusion protein reported here can serve as a candidate of fusion epitopes for design of a vaccine against FMDV type Asia 1.
Ehrlichia chaffeensis Tandem Repeat Proteins and Ank200 are Type 1 Secretion System Substrates Related to the Repeats-in-Toxin Exoprotein Family

PubMed Central

Wakeel, Abdul; den Dulk-Ras, Amke; Hooykaas, Paul J. J.; McBride, Jere W.

2011-01-01

Ehrlichia chaffeensis has type 1 and 4 secretion systems (T1SS and T4SS), but the substrates have not been identified. Potential substrates include secreted tandem repeat protein (TRP) 47, TRP120, and TRP32, and the ankyrin repeat protein, Ank200, that are involved in molecular host–pathogen interactions including DNA binding and a network of protein–protein interactions with host targets associated with signaling, transcriptional regulation, vesicle trafficking, and apoptosis. In this study we report that E. chaffeensis TRP47, TRP32, TRP120, and Ank200 were not secreted in the Agrobacterium tumefaciens Cre recombinase reporter assay routinely used to identify T4SS substrates. In contrast, all TRPs and the Ank200 proteins were secreted by the Escherichia coli complemented with the hemolysin secretion system (T1SS), and secretion was reduced in a T1SS mutant (ΔTolC), demonstrating that these proteins are T1SS substrates. Moreover, T1SS secretion signals were identified in the C-terminal domains of the TRPs and Ank200, and a detailed bioinformatic analysis of E. chaffeensis TRPs and Ank200 revealed features consistent with those described in the repeats-in-toxins (RTX) family of exoproteins, including glycine- and aspartate-rich tandem repeats, homology with ATP-transporters, a non-cleavable C-terminal T1SS signal, acidic pIs, and functions consistent with other T1SS substrates. Using a heterologous E. coli T1SS, this investigation has identified the first Ehrlichia T1SS substrates supporting the conclusion that the T1SS and corresponding substrates are involved in molecular host–pathogen interactions that contribute to Ehrlichia pathobiology. Further investigation of the relationship between Ehrlichia TRPs, Ank200, and the RTX exoprotein family may lead to a greater understanding of the importance of T1SS substrates and specific functions of T1SS in the pathobiology of obligately intracellular bacteria. PMID:22919588
Exceptionally long 5' UTR short tandem repeats specifically linked to primates.

PubMed

Namdar-Aligoodarzi, P; Mohammadparast, S; Zaker-Kandjani, B; Talebi Kakroodi, S; Jafari Vesiehsari, M; Ohadi, M

2015-09-10

We have previously reported genome-scale short tandem repeats (STRs) in the core promoter interval (i.e. -120 to +1 to the transcription start site) of protein-coding genes that have evolved identically in primates vs. non-primates. Those STRs may function as evolutionary switch codes for primate speciation. In the current study, we used the Ensembl database to analyze the 5' untranslated region (5' UTR) between +1 and +60 of the transcription start site of the entire human protein-coding genes annotated in the GeneCards database, in order to identify "exceptionally long" STRs (≥5-repeats), which may be of selective/adaptive advantage. The importance of this critical interval is its function as core promoter, and its effect on transcription and translation. In order to minimize ascertainment bias, we analyzed the evolutionary status of the human 5' UTR STRs of ≥5-repeats in several species encompassing six major orders and superorders across mammals, including primates, rodents, Scandentia, Laurasiatheria, Afrotheria, and Xenarthra. We introduce primate-specific STRs, and STRs which have expanded from mouse to primates. Identical co-occurrence of the identified STRs of rare average frequency between 0.006 and 0.0001 in primates supports a role for those motifs in processes that diverged primates from other mammals, such as neuronal differentiation (e.g. APOD and FGF4), and craniofacial development (e.g. FILIP1L). A number of the identified STRs of ≥5-repeats may be human-specific (e.g. ZMYM3 and DAZAP1). Future work is warranted to examine the importance of the listed genes in primate/human evolution, development, and disease. Copyright © 2015 Elsevier B.V. All rights reserved.
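The "≥5-repeats" criterion used above can be illustrated with a small regular-expression scan. This is a minimal sketch, not the authors' Ensembl-based pipeline: the function name, the 1 to 6 nt unit range, and the example sequence are assumptions made for illustration.

```python
import re


def find_strs(seq, min_repeats=5, max_unit=6):
    """Return (start, unit, copies) for each tandem run of a short unit.

    A run qualifies when a unit of 1..max_unit nucleotides is repeated
    at least min_repeats times without interruption.
    """
    # Lazy quantifier: the shortest unit that satisfies the run is reported.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical 5' UTR fragment containing five CAG units.
example_hits = find_strs("AT" + "CAG" * 5 + "TT")
```

Restricting the scanned window (for instance to +1 to +60 of the transcription start site) would be done by slicing the sequence before calling the function.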
Sequence Effect on the Formation of DNA Minidumbbells.

PubMed

Liu, Yuan; Lam, Sik Lok

2017-11-16

The DNA minidumbbell (MDB) is a recently identified non-B structure. The reported MDBs contain two TTTA, CCTG, or CTTG type II loops. At present, the knowledge and understanding of the sequence criteria for MDB formation are still limited. In this study, we performed a systematic high-resolution nuclear magnetic resonance (NMR) and native gel study to investigate the effect of sequence variations in tandem repeats on the formation of MDBs. Our NMR results reveal the importance of hydrogen bonds, base-base stacking, and hydrophobic interactions from each of the participating residues. We conclude that in the MDBs formed by tandem repeats, C-G loop-closing base pairs are more stabilizing than T-A loop-closing base pairs, and thymine residues in both the second and third loop positions are more stabilizing than cytosine residues. The results from this study enrich our knowledge on the sequence criteria for the formation of MDBs, paving a path for better exploring their potential roles in biological systems and DNA nanotechnology.
Isolation of human simple repeat loci by hybridization selection.

PubMed

Armour, J A; Neumann, R; Gobert, S; Jeffreys, A J

1994-04-01

We have isolated short tandem repeat arrays from the human genome, using a rapid method involving filter hybridization to enrich for tri- or tetranucleotide tandem repeats. About 30% of clones from the enriched library cross-hybridize with probes containing trimeric or tetrameric tandem arrays, facilitating the rapid isolation of large numbers of clones. In an initial analysis of 54 clones, 46 different tandem arrays were identified. Analysis of these tandem repeat loci by PCR showed that 24 were polymorphic in length; substantially higher levels of polymorphism were displayed by the tetrameric repeat loci isolated than by the trimeric repeats. Primary mapping of these loci by linkage analysis showed that they derive from 17 chromosomes, including the X chromosome. We anticipate the use of this strategy for the efficient isolation of tandem repeats from other sources of genomic DNA, including DNA from flow-sorted chromosomes, and from other species.
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor

PubMed Central

Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.

2000-01-01

In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C. fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C. fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C. fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

PubMed Central

Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

2013-01-01

Next generation sequencing has an advantage on transformational development of species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and the cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176
Repeat-containing protein effectors of plant-associated organisms

PubMed Central

Mesarich, Carl H.; Bowen, Joanna K.; Hamiaux, Cyril; Templeton, Matthew D.

2015-01-01

Many plant-associated organisms, including microbes, nematodes, and insects, deliver effector proteins into the apoplast, vascular tissue, or cell cytoplasm of their prospective hosts. These effectors function to promote colonization, typically by altering host physiology or by modulating host immune responses. The same effectors however, can also trigger host immunity in the presence of cognate host immune receptor proteins, and thus prevent colonization. To circumvent effector-triggered immunity, or to further enhance host colonization, plant-associated organisms often rely on adaptive effector evolution. In recent years, it has become increasingly apparent that several effectors of plant-associated organisms are repeat-containing proteins (RCPs) that carry tandem or non-tandem arrays of an amino acid sequence or structural motif. In this review, we highlight the diverse roles that these repeat domains play in RCP effector function. We also draw attention to the potential role of these repeat domains in adaptive evolution with regards to RCP effector function and the evasion of effector-triggered immunity. The aim of this review is to increase the profile of RCP effectors from plant-associated organisms. PMID:26557126
Analysis of tandem repeat units of the promoter of capsanthin/capsorubin synthase (Ccs) gene in pepper fruit.

PubMed

Tian, Shi-Lin; Li, Zheng; Li, Li; Shah, S N M; Gong, Zhen-Hui

2017-07-01

The capsanthin/capsorubin synthase (Ccs) gene is a key gene that regulates the synthesis of capsanthin and the development of red coloration in pepper fruits. There are three tandem repeat units in the promoter region of Ccs, but the potential effects of the number of repetitive units on the transcriptional regulation of Ccs have been unclear. In the present study, expression vectors carrying different numbers of repeat units of the Ccs promoter were constructed, and the transient expression of the β-glucuronidase (GUS) gene was used to detect differences in expression levels associated with the promoter fragments. These repeat fragments and the plant expression vector PBI121 containing the CaMV 35S promoter were ligated to form recombinant vectors that were transfected into Agrobacterium tumefaciens GV3101. A fluorescence spectrophotometer was used to analyze the expression associated with the various repeat units. It was concluded that the constructs containing at least one repeat were associated with GUS expression, though they did not differ from one another. This repeating unit likely plays a role in transcription and regulation of Ccs expression.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence-specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the base-specifying residue (BSR). However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Multiple-locus, variable number of tandem repeat analysis (MLVA) of the fish-pathogen Francisella noatunensis

PubMed Central

2011-01-01

Background Since Francisella noatunensis was first isolated from cultured Atlantic cod in 2004, it has emerged as a global fish pathogen causing disease in both warm and cold water species. Outbreaks of francisellosis occur in several important cultured fish species, making correct management of this disease a matter of major importance. Currently there are no vaccines or treatments available. A strain typing system for use in studies of F. noatunensis epizootics would be an important tool for disease management. However, the high genetic similarity within the Francisella spp. makes strain typing difficult, but such typing of the related human pathogen Francisella tularensis has been performed successfully by targeting loci with higher genetic variation than the traditional signature sequences. These loci are known as Variable Numbers of Tandem Repeat (VNTR). The aim of this study is to identify possible useful VNTRs in the genome of F. noatunensis. Results Seven polymorphic VNTR loci were identified in the preliminary genome sequence of F. noatunensis ssp. noatunensis GM2212 isolate. These VNTR loci were sequenced in F. noatunensis isolates collected from Atlantic cod (Gadus morhua) from Norway (n = 21), Three-line grunt (Parapristipoma trilineatum) from Japan (n = 1), Tilapia (Oreochromis spp.) from Indonesia (n = 3) and Atlantic salmon (Salmo salar) from Chile (n = 1). The Norwegian isolates presented in this study show nine allelic profiles and clades; the majority of the farmed isolates belong to only two clades, while the allelic profiles from wild cod are unique. Conclusions VNTRs can be used to separate isolates belonging to both subspecies of F. noatunensis. Low allelic diversity in F. noatunensis isolates from outbreaks in cod culture, compared to isolates from wild cod, indicates that transmission of these isolates may be a result of human activity. The sequence based MLVA system presented in this study should provide a good starting point for
Analysis of Salmonella enterica Serovar Typhimurium Variable-Number Tandem-Repeat Data for Public Health Investigation Based on Measured Mutation Rates and Whole-Genome Sequence Comparisons

PubMed Central

Dimovski, Karolina; Cao, Hanwei; Wijburg, Odilia L. C.; Strugnell, Richard A.; Mantena, Radha K.; Whipp, Margaret; Hogg, Geoff

2014-01-01

Variable-number tandem repeats (VNTRs) mutate rapidly and can be useful markers for genotyping. While multilocus VNTR analysis (MLVA) is increasingly used in the detection and investigation of food-borne outbreaks caused by Salmonella enterica serovar Typhimurium (S. Typhimurium) and other bacterial pathogens, MLVA data analysis usually relies on simple clustering approaches that may lead to incorrect interpretations. Here, we estimated the rates of copy number change at each of the five loci commonly used for S. Typhimurium MLVA, during in vitro and in vivo passage. We found that loci STTR5, STTR6, and STTR10 changed during passage but STTR3 and STTR9 did not. Relative rates of change were consistent across in vitro and in vivo growth and could be accurately estimated from diversity measures of natural variation observed during large outbreaks. Using a set of 203 isolates from a series of linked outbreaks and whole-genome sequencing of 12 representative isolates, we assessed the accuracy and utility of several alternative methods for analyzing and interpreting S. Typhimurium MLVA data. We show that eBURST analysis was accurate and informative. For construction of MLVA-based trees, a novel distance metric, based on the geometric model of VNTR evolution coupled with locus-specific weights, performed better than the commonly used simple or categorical distance metrics. The data suggest that, for the purpose of identifying potential transmission clusters for further investigation, isolates whose profiles differ at one of the rapidly changing STTR5, STTR6, and STTR10 loci should be collapsed into the same cluster. PMID:24957617
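The clustering recommendation in the last sentence can be sketched as a single-linkage grouping in which two profiles are linked if they are identical or differ at exactly one of the rapidly changing loci. This is an illustrative reading of the rule, not the authors' code: the locus order, the transitive (single-linkage) chaining, and the example profiles are assumptions.

```python
from itertools import combinations

# Assumed locus order for each profile tuple (hypothetical convention).
LOCI = ("STTR9", "STTR5", "STTR6", "STTR10", "STTR3")
FAST_LOCI = {"STTR5", "STTR6", "STTR10"}  # loci reported to change in passage


def linked(p, q):
    """True if two profiles match, or differ at one fast-changing locus."""
    diff = [locus for locus, a, b in zip(LOCI, p, q) if a != b]
    return not diff or (len(diff) == 1 and diff[0] in FAST_LOCI)


def cluster(profiles):
    """Group isolates by single linkage over the `linked` relation."""
    parent = {name: name for name in profiles}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(profiles, 2):
        if linked(profiles[a], profiles[b]):
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical repeat-copy profiles for four isolates.
example_profiles = {
    "iso1": (2, 10, 12, 9, 211),
    "iso2": (2, 11, 12, 9, 211),  # differs from iso1 at STTR5 only
    "iso3": (3, 10, 12, 9, 211),  # differs from iso1 at slow locus STTR9
    "iso4": (2, 11, 13, 9, 211),  # differs from iso2 at STTR6 only
}
example_clusters = cluster(example_profiles)
```

Here iso1, iso2 and iso4 collapse into one cluster, while iso3 stays separate because its only difference lies at a locus that did not change during passage.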
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed Central

Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes show putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
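A scan for tandem runs of the GGGNATC unit described above can be sketched with a regular expression. This is a minimal illustration, not the authors' analysis: the function name and the example sequence are hypothetical, and only one strand is searched.

```python
import re

# Two or more consecutive GGGNATC units, where N is any nucleotide.
G_RICH_RUN = re.compile(r"(?:GGG[ACGT]ATC){2,}")


def g_rich_runs(seq):
    """Return (start, unit_count) for each tandem run of GGGNATC units."""
    return [
        (m.start(), len(m.group(0)) // 7)
        for m in G_RICH_RUN.finditer(seq.upper())
    ]


# Hypothetical sequence with four units, the count the abstract reports
# as preferred (four G-tracts are needed for G4 formation).
example_runs = g_rich_runs("TTT" + "GGGAATCGGGTATCGGGCATCGGGGATC" + "AAA")
```

Searching the reverse complement as well would be needed to cover repeats on the opposite strand.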
A Predominant Variable-Number Tandem-Repeat Cluster of Mycobacterium tuberculosis Isolates among Asylum Seekers in the Netherlands and Denmark, Deciphered by Whole-Genome Sequencing

PubMed Central

de Neeling, Albert; Rasmussen, Erik Michael; Norman, Anders; Mulder, Arnout; van Hunen, Rianne; de Vries, Gerard; Haddad, Walid; Anthony, Richard; Lillebaek, Troels; van der Hoek, Wim; van Soolingen, Dick

2017-01-01

In many countries, Mycobacterium tuberculosis isolates are routinely subjected to variable-number tandem-repeat (VNTR) typing to investigate M. tuberculosis transmission. Unexpectedly, cross-border clusters were identified among African refugees in the Netherlands and Denmark, although transmission in those countries was unlikely. Whole-genome sequencing (WGS) was applied to analyze transmission in depth and to assess the precision of VNTR typing. WGS was applied to 40 M. tuberculosis isolates from refugees in the Netherlands and Denmark (most of whom were from the Horn of Africa) that shared the exact same VNTR profile. Cluster investigations were undertaken to identify in-country epidemiological links. Combining WGS results for the isolates (all members of the central Asian strain [CAS]/Delhi genotype), from both European countries, an average genetic distance of 80 single-nucleotide polymorphisms (SNPs) (maximum, 153 SNPs) was observed. The few pairs of isolates with confirmed epidemiological links, except for one pair, had a maximum distance of 12 SNPs. WGS divided this refugee cluster into several subclusters of patients from the same country of origin. Although the M. tuberculosis cases, mainly originating from African countries, shared the exact same VNTR profile, most were clearly distinguished by WGS. The average genetic distance in this specific VNTR cluster was 2 times greater than that in other VNTR clusters. Thus, identical VNTR profiles did not represent recent direct M. tuberculosis transmission for this group of patients. It appears that either these strains from Africa are extremely conserved genetically or there is ongoing transmission of this genotype among refugees on their long migration routes from Africa to Europe. PMID:29167288
A Predominant Variable-Number Tandem-Repeat Cluster of Mycobacterium tuberculosis Isolates among Asylum Seekers in the Netherlands and Denmark, Deciphered by Whole-Genome Sequencing.

PubMed

Jajou, Rana; de Neeling, Albert; Rasmussen, Erik Michael; Norman, Anders; Mulder, Arnout; van Hunen, Rianne; de Vries, Gerard; Haddad, Walid; Anthony, Richard; Lillebaek, Troels; van der Hoek, Wim; van Soolingen, Dick

2018-02-01

In many countries, Mycobacterium tuberculosis isolates are routinely subjected to variable-number tandem-repeat (VNTR) typing to investigate M. tuberculosis transmission. Unexpectedly, cross-border clusters were identified among African refugees in the Netherlands and Denmark, although transmission in those countries was unlikely. Whole-genome sequencing (WGS) was applied to analyze transmission in depth and to assess the precision of VNTR typing. WGS was applied to 40 M. tuberculosis isolates from refugees in the Netherlands and Denmark (most of whom were from the Horn of Africa) that shared the exact same VNTR profile. Cluster investigations were undertaken to identify in-country epidemiological links. Combining WGS results for the isolates (all members of the central Asian strain [CAS]/Delhi genotype), from both European countries, an average genetic distance of 80 single-nucleotide polymorphisms (SNPs) (maximum, 153 SNPs) was observed. The few pairs of isolates with confirmed epidemiological links, except for one pair, had a maximum distance of 12 SNPs. WGS divided this refugee cluster into several subclusters of patients from the same country of origin. Although the M. tuberculosis cases, mainly originating from African countries, shared the exact same VNTR profile, most were clearly distinguished by WGS. The average genetic distance in this specific VNTR cluster was 2 times greater than that in other VNTR clusters. Thus, identical VNTR profiles did not represent recent direct M. tuberculosis transmission for this group of patients. It appears that either these strains from Africa are extremely conserved genetically or there is ongoing transmission of this genotype among refugees on their long migration routes from Africa to Europe. Copyright © 2018 Jajou et al.
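The average and maximum genetic distances quoted in the abstract above are summaries of pairwise SNP differences between isolates. A minimal sketch of that calculation follows, assuming the isolates are already available as equal-length aligned sequences. The isolate names, toy sequences and function names are hypothetical, and real WGS workflows derive distances from variant calls rather than raw string comparison.

```python
from itertools import combinations

def snp_distance(a, b):
    """Count aligned positions where two sequences differ, ignoring gaps and N."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )

def pairwise_distances(isolates):
    """Map each unordered pair of isolate names to its SNP distance."""
    return {
        (i, j): snp_distance(isolates[i], isolates[j])
        for i, j in combinations(sorted(isolates), 2)
    }

def summarize(distances):
    """Return (mean, maximum) of the pairwise distances."""
    values = list(distances.values())
    return sum(values) / len(values), max(values)

# Toy aligned sequences (made up for illustration only).
toy = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAT",
    "iso_C": "TCGAACGNAT",
}
dist = pairwise_distances(toy)
print(dist)
print(summarize(dist))  # (2.0, 3)
```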
High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events

PubMed Central

Wolfgruber, Thomas K.; Nakashima, Megan M.; Schneider, Kevin L.; Sharma, Anupma; Xie, Zidian; Albert, Patrice S.; Xu, Ronghui; Bilinski, Paul; Dawe, R. Kelly; Ross-Ibarra, Jeffrey; Birchler, James A.; Presting, Gernot G.

2016-01-01

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres. PMID:27047500
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Two different size classes of 5S rDNA units coexisting in the same tandem array in the razor clam Ensis macha: is this region suitable for phylogeographic studies?

PubMed

Fernández-Tajes, Juan; Méndez, Josefina

2009-12-01

For a study of 5S ribosomal genes (rDNA) in the razor clam Ensis macha, the 5S rDNA region was amplified and sequenced. Two variants, so-called type I or short repeat (approximately 430 bp) and type II or long repeat (approximately 735 bp), appeared to be the main components of the 5S rDNA of this species. Their spacers differed markedly, both in length and nucleotide composition. The organization of the two variants was investigated by amplifying the genomic DNA with primers based on the sequence of the type I and type II spacers. PCR amplification products with primers EMLbF and EMSbR showed that the long and short repeats are associated within the same tandem array, suggesting an intermixed arrangement of both spacers. Nevertheless, amplifications carried out with inverse primers EMSinvF/R and EMLinvF/R revealed that some short and long repeats are contiguous in the same tandem array. This is the first report of the coexistence of two variable spacers in the same tandem array in bivalve mollusks.
Histone and ribosomal RNA repetitive gene clusters of the boll weevil are linked in a tandem array.

PubMed

Roehrdanz, R; Heilmann, L; Senechal, P; Sears, S; Evenson, P

2010-08-01

Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and the clusters are tandemly repeated. Ribosomal DNA contains a cluster of the rRNA sequences 18S, 5.8S and 28S. The rRNA genes are separated by the spacers ITS1, ITS2 and IGS. This cluster is also tandemly repeated. We found that the ribosomal RNA repeat unit of at least two species of Anthonomine weevils, Anthonomus grandis and Anthonomus texanus (Coleoptera: Curculionidae), is interspersed with a block containing the histone gene quintet. The histone genes are situated between the rRNA 18S and 28S genes in what is known as the intergenic spacer region (IGS). The complete reiterated Anthonomus grandis histone-ribosomal sequence is 16,248 bp.
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

PubMed

Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

2016-01-01

Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.
Novel protein domains and repeats in Drosophila melanogaster: insights into structure, function, and evolution.

PubMed

Ponting, C P; Mott, R; Bork, P; Copley, R R

2001-12-01

Sequence database searching methods, such as BLAST, are invaluable for predicting molecular function on the basis of sequence similarities among single regions of proteins. Searches of whole databases, however, are not optimized to detect multiple homologous regions within a single polypeptide. Here we have used the prospero algorithm to perform self-comparisons of all predicted Drosophila melanogaster gene products. Predicted repeats, and their homologs from all species, were analyzed further to detect hitherto unappreciated evolutionary relationships. Results included the identification of novel tandem repeats in the human X-linked retinitis pigmentosa type-2 gene product, repeated segments in cystinosin, associated with a defect in cystine transport, and 'nested' homologous domains in dysferlin, whose gene is mutated in limb girdle muscular dystrophy. Novel signaling domain families were found that may regulate the microtubule-based cytoskeleton and ubiquitin-mediated proteolysis, respectively. Two families of glycosyl hydrolases were shown to contain internal repetitions that hint at their evolution via a piecemeal, modular approach. In addition, three examples of fruit fly genes were detected with tandem exons that appear to have arisen via internal duplication. These findings demonstrate how completely sequenced genomes can be exploited to further understand the relationships between molecular structure, function, and evolution.
DNA fingerprinting of Shiga-toxin producing Escherichia coli O157 based on Multiple-Locus Variable-Number Tandem-Repeats Analysis (MLVA)

PubMed Central

Lindstedt, Bjørn-Arne; Heir, Even; Gjernes, Elisabet; Vardund, Traute; Kapperud, Georg

2003-01-01

Background The ability to react early to possible outbreaks of Escherichia coli O157:H7 and to trace possible sources relies on the availability of highly discriminatory and reliable techniques. The development of methods that are fast and have the potential for complete automation is needed for this important pathogen. Methods In all, 73 isolates of shiga-toxin producing E. coli O157 (STEC) were used in this study. The two available fully sequenced STEC genomes were scanned for tandem repeated stretches of DNA, which were evaluated as polymorphic markers for isolate identification. Results The 73 E. coli isolates displayed 47 distinct patterns and the MLVA assay was capable of high discrimination between the E. coli O157 strains. The assay was fast and all the steps can be automated. Conclusion The findings demonstrate a novel, highly discriminatory molecular typing method for the important pathogen E. coli O157 that is fast, robust and offers many advantages compared to current methods. PMID:14664722
Chromosome rearrangements via template switching between diverged repeated sequences

PubMed Central

Anand, Ranjith P.; Tsaponina, Olga; Greenwell, Patricia W.; Lee, Cheng-Sheng; Du, Wei; Petes, Thomas D.

2014-01-01

Recent high-resolution genome analyses of cancer and other diseases have revealed the occurrence of microhomology-mediated chromosome rearrangements and copy number changes. Although some of these rearrangements appear to involve nonhomologous end-joining, many must have involved mechanisms requiring new DNA synthesis. Models such as microhomology-mediated break-induced replication (MM-BIR) have been invoked to explain these rearrangements. We examined BIR and template switching between highly diverged sequences in Saccharomyces cerevisiae, induced during repair of a site-specific double-strand break (DSB). Our data show that such template switches are robust mechanisms that give rise to complex rearrangements. Template switches between highly divergent sequences appear to be mechanistically distinct from the initial strand invasions that establish BIR. In particular, such jumps are less constrained by sequence divergence and exhibit a different pattern of microhomology junctions. BIR traversing repeated DNA sequences frequently results in complex translocations analogous to those seen in mammalian cells. These results suggest that template switching among repeated genes is a potent driver of genome instability and evolution. PMID:25367035
Simple sequence repeat marker loci discovery using SSR primer.

PubMed

Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David

2004-06-12

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR repeat finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are parsed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
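The screening stage of the pipeline described above (FASTA input, then SSR detection per sequence) can be sketched as follows. This is a regular-expression stand-in, not SPUTNIK's algorithm, and it stops before primer design. The `MIN_COPIES` thresholds, function names and toy sequences are assumptions made for illustration.

```python
import re

def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dict."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {k: "".join(v) for k, v in records.items()}

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_COPIES = {2: 6, 3: 4, 4: 3, 5: 3}

def find_ssrs(seq):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs."""
    hits = []
    for size, min_copies in MIN_COPIES.items():
        pattern = re.compile(rf"([ACGT]{{{size}}})\1{{{min_copies - 1},}}")
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "ATAT".
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif, len(m.group()) // size))
    return sorted(hits)

toy_fasta = """>seq1 toy example
GGATATATATATATCC
>seq2
TTCAGCAGCAGCAGTT
"""
ssr_hits = {name: find_ssrs(seq) for name, seq in read_fasta(toy_fasta).items()}
print(ssr_hits)
```

Each hit carries start and end coordinates, so a later step could take flanking sequence around the locus as input to a primer design tool.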
Molecular characterization of long direct repeat (LDR) sequences expressing a stable mRNA encoding for a 35-amino-acid cell-killing peptide and a cis-encoded small antisense RNA in Escherichia coli.

PubMed

Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada

2002-07-01

Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.

Long-Read Single Molecule Sequencing to Resolve Tandem Gene Copies: The Mst77Y Region on the Drosophila melanogaster Y Chromosome

PubMed Central

Krsticevic, Flavia J.; Schrago, Carlos G.; Carvalho, A. Bernardo

2015-01-01

The autosomal gene Mst77F of Drosophila melanogaster is essential for male fertility. In 2010, Krsticevic et al. (Genetics 184: 295−307) found 18 Y-linked copies of Mst77F (“Mst77Y”), which collectively account for 20% of the functional Mst77F-like mRNA. The Mst77Y genes were severely misassembled in the then-available genome assembly and were identified by cloning and sequencing polymerase chain reaction products. The genomic structure of the Mst77Y region and the possible existence of additional copies remained unknown. The recent publication of two long-read assemblies of D. melanogaster prompted us to reinvestigate this challenging region of the Y chromosome. We found that the Illumina Synthetic Long Reads assembly failed in the Mst77Y region, most likely because of its tandem duplication structure. The PacBio MHAP assembly of the Mst77Y region seems to be very accurate, as revealed by comparisons with the previously found Mst77Y genes, a bacterial artificial chromosome sequence, and Illumina reads of the same strain. We found that the Mst77Y region spans 96 kb and originated from a 3.4-kb transposition from chromosome 3L to the Y chromosome, followed by tandem duplications inside the Y chromosome and invasion of transposable elements, which account for 48% of its length. Twelve of the 18 Mst77Y genes found in 2010 were confirmed in the PacBio assembly, the remaining six being polymerase chain reaction−induced artifacts. There are several identical copies of some Mst77Y genes, coincidentally bringing the total copy number to 18. Besides providing a detailed picture of the Mst77Y region, our results highlight the utility of PacBio technology in assembling difficult genomic regions such as tandemly repeated genes. PMID:25858959
Linkage analysis with multiplexed short tandem repeat polymorphisms using infrared fluorescence and M13 tailed primers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oetting, W.S.; Lee, H.K.; Flanders, D.J.

The use of short tandem repeat polymorphisms (STRPs) as marker loci for linkage analysis is becoming increasingly important due to their large numbers in the human genome and their high degree of polymorphism. Fluorescence-based detection of the STRP pattern with an automated DNA sequencer has improved the efficiency of this technique by eliminating the need for radioactivity and producing a digitized autoradiogram-like image that can be used for computer analysis. In an effort to simplify the procedure and to reduce the cost of fluorescence STRP analysis, we have developed a technique known as multiplexing STRPs with tailed primers (MSTP) using primers that have a 19-bp extension, identical to the sequence of an M13 sequencing primer, on the 5′ end of the forward primer in conjunction with multiplexing several primer pairs in a single polymerase chain reaction (PCR) amplification. The banding pattern is detected with the addition of the M13 primer-dye conjugate as the sole primer conjugated to the fluorescent dye, eliminating the need for direct conjugation of the infrared fluorescent dye to the STRP primers. The use of MSTP for linkage analysis greatly reduces the number of PCR reactions. Up to five primer pairs can be multiplexed together in the same reaction. At present, a set of 148 STRP markers spaced at an average genetic distance of 28 cM throughout the autosomal genome can be analyzed in 37 sets of multiplexed amplification reactions. We have automated the analysis of these patterns for linkage using software that both detects the STRP banding pattern and determines their sizes. This information can then be exported in a user-defined format from a database manager for linkage analysis. 15 refs., 2 figs., 4 tabs.
Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Basal angiosperms, or Magnoliids, form an important clade of commercially important plants that mainly includes spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats or microsatellites are short stretches of DNA with repeat units of 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
Rapid carrier screening using short tandem repeats in the phenylalanine hydroxylase gene.

PubMed

Shawky, R M; el-Aleem, K A; Rifaat, M M; el-Naggar, R L; Marzouk, G M

2002-01-01

Phenylketonuria (PKU) is an autosomal recessive genetic disorder caused by defects in the phenylalanine hydroxylase (PAH) system. Our work aimed to screen the PAH locus for the presence of potentially useful short tandem repeats (STR) as markers for carrier detection in PKU families in Egypt, and to determine the level of PAH heterozygosity within the Egyptian population. The system contains at least eight independent alleles in the Egyptian population, transmitted in a Mendelian fashion. Variations in the number of STR in the 16 families studied gave rise to polymorphisms that proved to be suitable markers for PKU carrier detection and prenatal diagnosis. The most frequent allelic fragment size in PKU patients was 246 bp (35.7%), which together with a fragment of 254 bp accounted for 60.7% of the mutant chromosomes.
Multi-locus variable-number tandem repeat analysis for outbreak studies of Salmonella enterica serotype Enteritidis

PubMed Central

Malorny, Burkhard; Junker, Ernst; Helmuth, Reiner

2008-01-01

Background Salmonella enterica subsp. enterica serotype Enteritidis is known as an important and pathogenic clonal group which continues to cause worldwide sporadic cases and outbreaks in humans. Here a new multiple-locus variable-number tandem repeat analysis (MLVA) method is reported for highly-discriminative subtyping of Salmonella Enteritidis. Emphasis was placed on the most predominant phage types PT4 and PT8. The method comprises multiplex PCR specifically amplifying repeated sequences from nine different loci followed by an automatic fragment size analysis using a multicolor capillary electrophoresis instrument. A total of 240 human, animal, food and environmental isolates of S. Enteritidis including 23 definite phage types were used for development and validation. Furthermore, the MLVA types were compared to the phage types of several isolates from two recent outbreaks to determine the concordance between both methods and to estimate their in vivo stability. The in vitro stability of the two MLVA types specifically for PT4 and PT8 strains was determined by multiple freeze-thaw cycles. Results Seventy-nine different MLVA types were identified in 240 S. Enteritidis strains. The Simpson's diversity index for the MLVA method was 0.919 and Nei diversity values for the nine VNTR loci ranged from 0.07 to 0.65. Twenty-four MLVA types could be assigned to 62 PT4 strains and 21 types to 81 PT8 strains. All outbreak isolates had an indistinguishable outbreak specific MLVA type. The in vitro stability experiments showed no changes of the MLVA type compared to the original isolate. Conclusion This MLVA method is useful to discriminate S. Enteritidis strains even within a single phage type. It is easy to use, fast, and cheap compared to other high-resolution molecular methods and therefore an important tool for surveillance and outbreak studies for S. Enteritidis. PMID:18513386
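The two diversity measures reported in the abstract above can be computed from simple counts. The sketch below assumes the Hunter-Gaston form of Simpson's index and the uncorrected form of Nei's diversity (one minus the sum of squared allele frequencies). The abstract does not state which variants the authors used, and the toy data are made up.

```python
from collections import Counter

def simpson_diversity(type_labels):
    """Hunter-Gaston form: 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    counts = Counter(type_labels).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two isolates")
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

def nei_diversity(alleles):
    """Gene diversity at one locus: 1 - sum of squared allele frequencies."""
    counts = Counter(alleles).values()
    n = sum(counts)
    if n == 0:
        raise ValueError("need at least one allele call")
    return 1 - sum((c / n) ** 2 for c in counts)

# Toy data: six isolates with an MLVA type label each, and the repeat
# number observed at one hypothetical VNTR locus in the same isolates.
toy_types = ["T1", "T1", "T1", "T2", "T2", "T3"]
toy_locus = [5, 5, 5, 6, 6, 7]

print(round(simpson_diversity(toy_types), 3))  # 0.733
print(round(nei_diversity(toy_locus), 3))      # 0.611
```

Simpson's index here is the probability that two isolates drawn without replacement have different types, which is why values near 1 indicate a highly discriminatory method.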
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, H.U.G.; Gray, J.W.

1995-06-27

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, Heinz-Ulrich G.; Gray, Joe W.

1995-01-01

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
GATA simple sequence repeats function as enhancer blocker boundaries.

PubMed

Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K

2013-01-01

Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate their role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in the transgenic as well as the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
Infrared fluorescent automated detection of thirteen short tandem repeat polymorphisms and one gender-determining system of the CODIS core system.

PubMed

Ricci, U; Sani, I; Guarducci, S; Biondi, C; Pelagatti, S; Lazzerini, V; Brusaferri, A; Lapini, M; Andreucci, E; Giunti, L; Giovannucci Uzielli, M L

2000-11-01

We used an infrared (IR) automated fluorescence monolaser sequencer for the analysis of 13 autosomal short tandem repeat (STR) systems (TPOX, D3S1358, FGA, CSF1PO, D5S818, D7S820, D8S1179, TH01, vWA, D13S317, D16S539, D18S51, D21S11) and the X-Y homologous gene amelogenin system. These two systems represent the core of the Combined DNA Index System (CODIS). Four independent multiplex reactions, based on the polymerase chain reaction (PCR) technique and on the direct labeling of the forward primer of every primer pair with a new molecule (IRDye800), were set up, permitting the exact characterization of the alleles by comparison with ladders of specific sequenced alleles. This is the first report of the whole analysis of the STRs of the CODIS core using an IR automated DNA sequencer. The protocol was used to solve paternity/maternity tests and for population studies. The electrophoretic system also proved useful for the correct typing of those loci differing in size by only 2 bp. A sensitivity study demonstrated that the test can detect an average of 10 pg of undegraded human DNA. We also performed a preliminary study analyzing some forensic samples and mixed stains, which suggested the usefulness of this analytical system for human identification as well as for forensic purposes.
Production of novel recombinant single-domain antibodies against tandem repeat region of MUC1 mucin.

PubMed

Rahbarizadeh, F; Rasaee, M J; Forouzandeh Moghadam, M; Allameh, A A; Sadroddiny, E

2004-06-01

Recently, the existence of "heavy-chain" antibody in Camelidae has been described. However, as yet there is no data on the binding of this type of antibody to peptides. In addition, there was not any report of production of single-domain antibodies in two-humped camels (Camelus bactrianus). In the present study, these questions are addressed. We showed the feasibility of immunizing old world camels, cloning the repertoire of the variable domain of their heavy-chain antibodies, panning and selection, leading to the successful identification of minimum-sized antigen binders. Antigen-specific fragments of the heavy-chain IgGs (VHH) are of great interest in biotechnology because they are very stable, highly soluble, and react specifically and with high affinity to the antigens. In this study, we immunized two camels (Camelus dromedarius and Camelus bactrianus) with homogenized cancerous tissues, synthetic peptide, and human milk fat globule membrane (HMFG), and generated two VHH libraries displayed on phage particles. Some single-domain antibody fragments have been isolated that specifically recognize the tandem repeat region of MUC1. The camels' single-domain VHH harbor the original, intact antigen binding site and reacted specifically and with high affinity to the tandem repeat region of MUC1. Indeed soluble, specific antigen binders and good affinities (in the range of 0.2 × 10^9 M^-1 to 0.6 × 10^9 M^-1) were identified from these libraries. This is the first example of the isolation of camel anti-peptide VHH domains.
Multiple-locus variable-number tandem repeat analysis of Salmonella Enteritidis isolates from human and non-human sources using a single multiplex PCR

PubMed Central

Cho, Seongbeom; Boxrud, David J; Bartkus, Joanne M; Whittam, Thomas S; Saeed, Mahdi

2007-01-01

Simplified multiple-locus variable-number tandem repeat analysis (MLVA) was developed using one-shot multiplex PCR for seven variable-number tandem repeats (VNTR) markers with high diversity capacity. MLVA, phage typing, and PFGE methods were applied on 34 diverse Salmonella Enteritidis isolates from human and non-human sources. MLVA detected allelic variations that helped to classify the S. Enteritidis isolates into more evenly distributed subtypes than other methods. MLVA-based S. Enteritidis clonal groups were largely associated with sources of the isolates. Nei's diversity indices for polymorphism ranged from 0.25 to 0.70 for seven VNTR loci markers. Based on Simpson's and Shannon's diversity indices, MLVA had a higher discriminatory power than pulsed field gel electrophoresis (PFGE), phage typing, or multilocus enzyme electrophoresis. Therefore, MLVA may be used along with PFGE to enhance the effectiveness of the molecular epidemiologic investigation of S. Enteritidis infections. PMID:17692097
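The diversity indices named in the abstract above can be illustrated with a short sketch. It assumes Nei's index in its simple form, 1 − Σp_i², and Simpson's index in the Hunter-Gaston form; the abstract does not state which variants the authors used, and the allele data and function names here are hypothetical.

```python
from collections import Counter


def nei_diversity(alleles):
    """Nei's diversity index, taken here as 1 - sum(p_i^2) over allele frequencies."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def simpson_diversity(types):
    """Simpson's index (Hunter-Gaston form): 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n = len(types)
    counts = Counter(types)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


# Hypothetical repeat counts observed at one VNTR locus in eight isolates.
locus = [5, 5, 5, 6, 6, 7, 7, 7]
print(round(nei_diversity(locus), 3))
print(round(simpson_diversity(locus), 3))
```

Both functions take a flat list of observed alleles (or subtypes), so the same code could be applied per locus or per typing method when comparing discriminatory power.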
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

PubMed Central

Anwar, Tamanna; Khan, Asad U

2006-01-01

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. Availability: This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com. PMID:17597863
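The task described in the SSRscanner abstracts, reporting each SSR together with its exact location, can be sketched in a few lines. This is not the authors' Perl program; it is a minimal Python illustration that assumes perfect repeats only, and the function name, the default unit sizes (1 to 3 bp) and the six-copy threshold are choices made here for the example.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=3, min_repeats=6):
    """Return (start, end, unit, copies) per perfect SSR tract; positions are 1-based, inclusive."""
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        # A unit of length k followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif (e.g. "AA").
            if any(unit == unit[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start() + 1, m.end(), unit, (m.end() - m.start()) // k))
    return sorted(hits)


# Hypothetical test sequence: a (CA)7 tract and an (A)8 tract.
example = "GG" + "CA" * 7 + "TT" + "A" * 8 + "GC"
for start, end, unit, copies in find_ssrs(example):
    print(start, end, unit, copies)
```

Counting the returned tuples per unit gives the frequency of each repeat class, and the start and end fields give the location that the abstract says most SSR finders omit.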
Thermodynamic characterization of tandem mismatches found in naturally occurring RNA

PubMed Central

Christiansen, Martha E.; Znosko, Brent M.

2009-01-01

Although all sequence symmetric tandem mismatches and some sequence asymmetric tandem mismatches have been thermodynamically characterized and a model has been proposed to predict the stability of previously unmeasured sequence asymmetric tandem mismatches [Christiansen,M.E. and Znosko,B.M. (2008) Biochemistry, 47, 4329–4336], experimental thermodynamic data for frequently occurring tandem mismatches is lacking. Since experimental data is preferred over a predictive model, the thermodynamic parameters for 25 frequently occurring tandem mismatches were determined. These new experimental values, on average, are 1.0 kcal/mol different from the values predicted for these mismatches using the previous model. The data for the sequence asymmetric tandem mismatches reported here were then combined with the data for 72 sequence asymmetric tandem mismatches that were published previously, and the parameters used to predict the thermodynamics of previously unmeasured sequence asymmetric tandem mismatches were updated. The average absolute difference between the measured values and the values predicted using these updated parameters is 0.5 kcal/mol. This updated model improves the prediction for tandem mismatches that were predicted rather poorly by the previous model. This new experimental data and updated predictive model allow for more accurate calculations of the free energy of RNA duplexes containing tandem mismatches, and, furthermore, should allow for improved prediction of secondary structure from sequence. PMID:19509311
Evolutionary Conservation of a Coding Function for D4Z4, the Tandem DNA Repeat Mutated in Facioscapulohumeral Muscular Dystrophy

PubMed Central

Clapp, Jannine ; Mitchell, Laura M. ; Bolland, Daniel J. ; Fantes, Judy ; Corcoran, Anne E. ; Scotting, Paul J. ; Armour, John A. L. ; Hewitt, Jane E.

2007-01-01

Facioscapulohumeral muscular dystrophy (FSHD) is caused by deletions within the polymorphic DNA tandem array D4Z4. Each D4Z4 repeat unit has an open reading frame (ORF), termed “DUX4,” containing two homeobox sequences. Because there has been no evidence of a transcript from the array, these deletions are thought to cause FSHD by a position effect on other genes. Here, we identify D4Z4 homologues in the genomes of rodents, Afrotheria (superorder of elephants and related species), and other species and show that the DUX4 ORF is conserved. Phylogenetic analysis suggests that primate and Afrotherian D4Z4 arrays are orthologous and originated from a retrotransposed copy of an intron-containing DUX gene, DUXC. Reverse-transcriptase polymerase chain reaction and RNA fluorescence and tissue in situ hybridization data indicate transcription of the mouse array. Together with the conservation of the DUX4 ORF for >100 million years, this strongly supports a coding function for D4Z4 and necessitates re-examination of current models of the FSHD disease mechanism. PMID:17668377
Comprehensive mutation analysis of 17 Y-chromosomal short tandem repeat polymorphisms included in the AmpFlSTR Yfiler PCR amplification kit.

PubMed

Goedbloed, Miriam; Vermeulen, Mark; Fang, Rixun N; Lembring, Maria; Wollstein, Andreas; Ballantyne, Kaye; Lao, Oscar; Brauer, Silke; Krüger, Carmen; Roewer, Lutz; Lessig, Rüdiger; Ploski, Rafal; Dobosz, Tadeusz; Henke, Lotte; Henke, Jürgen; Furtado, Manohar R; Kayser, Manfred

2009-11-01

The Y-chromosomal short tandem repeat (Y-STR) polymorphisms included in the AmpFlSTR Yfiler polymerase chain reaction amplification kit have become widely used for forensic and evolutionary applications where reliable knowledge of mutation properties is necessary for correct data interpretation. Therefore, we investigated the 17 Yfiler Y-STRs in 1,730-1,764 DNA-confirmed father-son pairs per locus and found 84 sequence-confirmed mutations among the 29,792 meiotic transfers covered. Of the 84 mutations, 83 (98.8%) were single-repeat changes and one (1.2%) was a double-repeat change (ratio, 1:0.01), and 43 (51.2%) were repeat gains and 41 (48.8%) repeat losses (ratio, 1:0.95). Medians from Bayesian estimation of locus-specific mutation rates ranged from 0.0003 for DYS448 to 0.0074 for DYS458, with a median rate across all 17 Y-STRs of 0.0025. The mean age (at the time of the son's birth) of fathers with mutations, at 34.40 (+/-11.63) years, was higher than that of fathers without mutations, at 30.32 (+/-10.22) years, a difference that is highly statistically significant (p < 0.001). Poisson-based modeling revealed that the Y-STR mutation rate increased with increasing father's age at a statistically significant level (alpha = 0.0294, 2.5% quantile = 0.0001). From combining our data with those previously published, considering altogether 135,212 meiotic events and 331 mutations, we conclude for the Yfiler Y-STRs that (1) none had a mutation rate of >1%, 12 had mutation rates of >0.1% and four of <0.1%, (2) single-repeat changes were strongly favored over multiple-repeat ones for all loci but 1 and (3) considerable variation existed among loci in the ratio of repeat gains versus losses. Our finding of three Y-STR mutations in one father-son pair (and two pairs with two mutations each) has consequences for determining the threshold of allelic differences to conclude exclusion constellations in future applications of Y-STRs in paternity testing and pedigree analyses.
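The counts quoted in the abstract above can be turned into crude per-transfer proportions with a few lines of arithmetic. This is only a back-of-the-envelope check: the values are raw proportions pooled over loci, not the Bayesian locus-specific medians the authors report, and the helper name is hypothetical.

```python
def rate(events, trials):
    """Raw proportion: observed mutations per meiotic transfer."""
    return events / trials


this_study = rate(84, 29_792)    # mutations / meiotic transfers in this study
pooled = rate(331, 135_212)      # combined with previously published data
gain_share = 43 / 84             # repeat gains among the 84 mutations
loss_share = 41 / 84             # repeat losses among the 84 mutations

print(f"this study: {this_study:.4f} mutations per transfer")
print(f"pooled:     {pooled:.4f} mutations per transfer")
print(f"gains {gain_share:.1%}, losses {loss_share:.1%}")
```

The pooled proportion falls close to the reported median rate of 0.0025, which is what one would expect if no single locus dominates the mutation count.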
The structure of the protein phosphatase 2A PR65/A subunit reveals the conformation of its 15 tandemly repeated HEAT motifs.

PubMed

Groves, M R; Hanlon, N; Turowski, P; Hemmings, B A; Barford, D

1999-01-08

The PR65/A subunit of protein phosphatase 2A serves as a scaffolding molecule to coordinate the assembly of the catalytic subunit and a variable regulatory B subunit, generating functionally diverse heterotrimers. Mutations of the beta isoform of PR65 are associated with lung and colon tumors. The crystal structure of the PR65/Aalpha subunit, at 2.3 Å resolution, reveals the conformation of its 15 tandemly repeated HEAT sequences, degenerate motifs of approximately 39 amino acids present in a variety of proteins, including huntingtin and importin beta. Individual motifs are composed of a pair of antiparallel alpha helices that assemble in a mainly linear, repetitive fashion to form an elongated molecule characterized by a double layer of alpha helices. Left-handed rotations at three interrepeat interfaces generate a novel left-hand superhelical conformation. The protein interaction interface is formed from the intrarepeat turns that are aligned to form a continuous ridge.
Variable Number of Tandem Repeats in Salmonella enterica subsp. enterica for Typing Purposes

PubMed Central

Ramisse, Vincent; Houssu, Perrine; Hernandez, Eric; Denoeud, France; Hilaire, Valérie; Lisanti, Olivier; Ramisse, Françoise; Cavallo, Jean-Didier; Vergnaud, Gilles

2004-01-01

The genomic sequences of Salmonella enterica subsp. enterica strains CT18, Ty2 (serovar Typhi), and LT2 (serovar Typhimurium) were analyzed for potential variable number tandem repeats (VNTRs). A multiple-locus VNTR analysis (MLVA) of 99 strains of S. enterica subsp. enterica based on 10 VNTRs distinguished 52 genotypes and placed them into four groups. All strains tested were independent human isolates from France and did not reflect isolates from outbreak episodes. Of these 10 VNTRs, 7 showed variability within serovar Typhi, whereas 1 showed variability within serovar Typhimurium. Four VNTRs showed high Nei's diversity indices (DIs) of 0.81 to 0.87 within serovar Typhi (n = 27). Additionally, three of these more variable VNTRs showed DIs of 0.18 to 0.58 within serovar Paratyphi A (n = 10). The VNTR polymorphic site within multidrug-resistant (MDR) serovar Typhimurium isolates (n = 39; resistance to ampicillin, chloramphenicol, spectinomycin, sulfonamides, and tetracycline) showed a DI of 0.81. Cluster analysis not only identified three genetically distinct groups consistent with the present serovar classification of salmonellae (serovars Typhi, Paratyphi A, and Typhimurium) but also discriminated 25 subtypes (93%) within serovar Typhi isolates. The analysis discriminated only eight subtypes within serovar Typhimurium isolates resistant to ampicillin, chloramphenicol, spectinomycin, sulfonamides, and tetracycline, possibly reflecting the emergence in the mid-1990s of the DT104 phage type, which often displays such an MDR spectrum. Coupled with the ongoing improvements in automated procedures offered by capillary electrophoresis, use of these markers is proposed in further investigations of the potential of MLVA in outbreaks of salmonellosis, especially outbreaks of typhoid fever. PMID:15583305

Spectrum of Phenylalanine Hydroxylase Gene Mutations in Hamadan and Lorestan Provinces of Iran and Their Associations with Variable Number of Tandem Repeat Alleles.

PubMed

Alibakhshi, Reza; Moradi, Keivan; Biglari, Mostafa; Shafieenia, Samaneh

2018-05-01

Phenylketonuria (PKU) is one of the most common known inherited metabolic diseases. The present study aimed to investigate the status of molecular defects in the phenylalanine hydroxylase (PAH) gene in western Iranian PKU patients (predominantly from Kermanshah, Hamadan, and Lorestan provinces) during 2014-2016. Additionally, the results were compared with similar studies in Iran. Nucleotide sequence analysis of all 13 exons and their flanking intronic regions of the PAH gene was performed in 18 western Iranian PKU patients. Moreover, a variable number of tandem repeat (VNTR) located in the PAH gene was studied. The results revealed a mutational spectrum encompassing 11 distinct mutations distributed along the PAH gene sequence on 34 of the 36 mutant alleles (diagnostic efficiency of 94.4%). Also, four PAH VNTR alleles (with repeats of 3, 7, 8 and 9) were detected. The three most frequent mutations were IVS9+5G>A, IVS7-5T>C, and p.P281L with the frequency of 27.8%, 11%, and 11%, respectively. The results showed that there is not only a consanguineous relation, but also a difference in PAH characters of mutations between Kermanshah and the other two parts of western Iran (Hamadan and Lorestan). Also, it seems that the spectrum of mutations in western Iran is relatively distinct from other parts of the country, suggesting that this region might be a special PAH gene distribution region. Moreover, our findings can be useful in identifying genotype-to-phenotype relationships in patients, and can support future confirmatory diagnostic testing, prognosis, and prediction of disease severity in PKU patients.
Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

NASA Astrophysics Data System (ADS)

Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

2015-12-01

Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms exceed cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor to pinpoint areas where large megathrust earthquakes may nucleate and consequently to improve the seismic hazard assessment.
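The waveform-similarity criterion in the abstract above can be sketched as a simple correlation test. This is a simplified illustration, not the authors' procedure: it computes only a zero-lag Pearson correlation coefficient on already aligned traces, omits the spectral coherency criterion, and uses hypothetical function names and synthetic data.

```python
import math


def correlation_coefficient(x, y):
    """Zero-lag Pearson correlation between two equal-length waveforms."""
    if len(x) != len(y):
        raise ValueError("waveforms must have the same number of samples")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_repeater_pair(x, y, threshold=0.95):
    """True when the pair meets the cc >= 0.95 criterion quoted in the abstract."""
    return correlation_coefficient(x, y) >= threshold


# Synthetic traces: an amplitude-scaled copy correlates perfectly, a phase-shifted one does not.
wave = [math.sin(0.2 * i) for i in range(200)]
scaled = [2.0 * v for v in wave]
shifted = [math.cos(0.2 * i) for i in range(200)]
print(is_repeater_pair(wave, scaled), is_repeater_pair(wave, shifted))
```

In practice the traces would first be filtered and aligned, for example by searching over lags for the maximum correlation, before a threshold like this is applied.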
Multilocus variable-number tandem repeat analysis distinguishes outbreak and sporadic Escherichia coli O157:H7 isolates.

PubMed

Noller, Anna C; McEllistrem, M Catherine; Pacheco, Antonio G F; Boxrud, David J; Harrison, Lee H

2003-12-01

Escherichia coli O157:H7 is a major cause of food-borne illness in the United States. Outbreak detection involves traditional epidemiological methods and routine molecular subtyping by pulsed-field gel electrophoresis (PFGE). PFGE is labor-intensive, and the results are difficult to analyze and not easily transferable between laboratories. Multilocus variable-number tandem repeat (VNTR) analysis (MLVA) is a fast, portable method that analyzes multiple VNTR loci, which are areas of the bacterial genome that evolve quickly. Eighty isolates, including 21 isolates from five epidemiologically well-characterized outbreaks from Pennsylvania and Minnesota, were analyzed by PFGE and MLVA. Strains in PFGE clusters were defined as strains that differed by less than or equal to one band by using XbaI and the confirmatory enzyme SpeI. MLVA was performed by comparing the number of tandem repeats at seven loci. From 6 to 30 alleles were found at the seven loci, resulting in 64 MLVA types among the 80 isolates. MLVA correctly identified the isolates from all five outbreaks if only a single-locus variant was allowed. MLVA differentiated strains with unique PFGE types. Additionally, MLVA discriminated strains within PFGE-defined clusters that were not known to be part of an outbreak. In addition to being a simple and validated method for E. coli O157:H7 outbreak detection, MLVA appears to have a sensitivity equal to that of PFGE and a specificity superior to that of PFGE.
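The single-locus-variant rule mentioned in the abstract above can be expressed as a comparison of repeat-count profiles. This is a minimal sketch under the assumption that each isolate is summarised as a tuple of repeat counts at the seven loci; the profiles and function names are hypothetical.

```python
def loci_differences(profile_a, profile_b):
    """Count the loci at which two MLVA profiles (tuples of repeat counts) differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


def same_mlva_cluster(profile_a, profile_b, max_variants=1):
    """Treat two isolates as related when they differ at no more than max_variants loci."""
    return loci_differences(profile_a, profile_b) <= max_variants


# Hypothetical seven-locus profiles (repeat counts per locus).
outbreak_1 = (8, 12, 5, 9, 7, 10, 6)
outbreak_2 = (8, 12, 5, 9, 7, 11, 6)   # single-locus variant of outbreak_1
sporadic = (6, 15, 5, 4, 7, 13, 9)

print(same_mlva_cluster(outbreak_1, outbreak_2))
print(same_mlva_cluster(outbreak_1, sporadic))
```

Setting `max_variants=0` would require identical profiles, which corresponds to counting distinct MLVA types rather than grouping outbreak isolates.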
Characterization of species-specific repeated DNA sequences from B. nigra.

PubMed

Gupta, V; Lakshmisita, G; Shaila, M S; Jagannathan, V; Lakshmikumaran, M S

1992-07-01

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.
Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae.

PubMed

Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R

2006-12-01

Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.
Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

PubMed Central

Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

2015-01-01

We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compared this with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number variation of the 21-bp tandem repeat varied in F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid have highly conserved coding sequence with about 98% homology and four genes—rpoC2, ycf3, accD, and clpP—have high synonymous (Ks) value. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-07-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus

PubMed Central

Wei, Yunzhou; Chesne, Megan T.; Terns, Rebecca M.; Terns, Michael P.

2015-01-01

CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100–500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems. PMID:25589547
Revisiting the TALE repeat.

PubMed

Deng, Dong; Yan, Chuangye; Wu, Jianping; Pan, Xiaojing; Yan, Nieng

2014-04-01

Transcription activator-like (TAL) effectors specifically bind to double stranded (ds) DNA through a central domain of tandem repeats. Each TAL effector (TALE) repeat comprises 33-35 amino acids and recognizes one specific DNA base through a highly variable residue at a fixed position in the repeat. Structural studies have revealed the molecular basis of DNA recognition by TALE repeats. Examination of the overall structure reveals that the basic building block of TALE protein, namely a helical hairpin, is one-helix shifted from the previously defined TALE motif. Here we wish to suggest a structure-based re-demarcation of the TALE repeat which starts with the residues that bind to the DNA backbone phosphate and concludes with the base-recognition hyper-variable residue. This new numbering system is consistent with the α-solenoid superfamily to which TALE belongs, and reflects the structural integrity of TAL effectors. In addition, it confers integral number of TALE repeats that matches the number of bound DNA bases. We then present fifteen crystal structures of engineered dHax3 variants in complex with target DNA molecules, which elucidate the structural basis for the recognition of bases adenine (A) and guanine (G) by reported or uncharacterized TALE codes. Finally, we analyzed the sequence-structure correlation of the amino acid residues within a TALE repeat. The structural analyses reported here may advance the mechanistic understanding of TALE proteins and facilitate the design of TALEN with improved affinity and specificity.
High-Frame-Rate Doppler Ultrasound Using a Repeated Transmit Sequence

PubMed Central

Podkowa, Anthony S.; Oelze, Michael L.; Ketterling, Jeffrey A.

2018-01-01

The maximum detectable velocity of high-frame-rate color flow Doppler ultrasound is limited by the imaging frame rate when using coherent compounding techniques. Traditionally, high quality ultrasonic images are produced at a high frame rate via coherent compounding of steered plane wave reconstructions. However, this compounding operation results in an effective downsampling of the slow-time signal, thereby artificially reducing the frame rate. To alleviate this effect, a new transmit sequence is introduced where each transmit angle is repeated in succession. This transmit sequence allows for direct comparison between low resolution, pre-compounded frames at a short time interval in ways that are resistant to sidelobe motion. Use of this transmit sequence increases the maximum detectable velocity by a scale factor of the transmit sequence length. The performance of this new transmit sequence was evaluated using a rotating cylindrical phantom and compared with traditional methods using a 15-MHz linear array transducer. Axial velocity estimates were recorded for a range of ±300 mm/s and compared to the known ground truth. Using these new techniques, the root mean square error was reduced from over 400 mm/s to below 50 mm/s in the high-velocity regime compared to traditional techniques. The standard deviation of the velocity estimate in the same velocity range was reduced from 250 mm/s to 30 mm/s. This result demonstrates the viability of the repeated transmit sequence methods in detecting and quantifying high-velocity flow. PMID:29910966
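The scaling claim in the abstract above can be made concrete with the standard pulsed-Doppler Nyquist limit, v_max = c · PRF / (4 · f0). The sketch below is an illustration under stated assumptions, not the authors' estimator: the speed of sound, pulse repetition frequency, and number of transmit angles are assumed values, and only the 15 MHz centre frequency comes from the abstract.

```python
import math


def nyquist_velocity(c: float, prf: float, f0: float) -> float:
    """Maximum unaliased axial velocity (m/s) for pulsed Doppler: c * PRF / (4 * f0)."""
    return c * prf / (4.0 * f0)


C = 1540.0       # assumed speed of sound, m/s
F0 = 15e6        # transducer centre frequency from the abstract, Hz
PRF = 10_000.0   # assumed pulse repetition frequency, Hz
N_ANGLES = 5     # assumed number of steered plane-wave angles

# Conventional compounding: one slow-time sample per N_ANGLES transmits,
# so the effective slow-time sampling rate is PRF / N_ANGLES.
v_compounded = nyquist_velocity(C, PRF / N_ANGLES, F0)

# Repeated transmit sequence: successive firings at the same angle are
# compared directly, so the sampling interval is a single pulse period.
v_repeated = nyquist_velocity(C, PRF, F0)

print(f"compounded: {v_compounded * 1e3:.1f} mm/s")
print(f"repeated:   {v_repeated * 1e3:.1f} mm/s")
print(f"gain:       {v_repeated / v_compounded:.1f}x")
```

Under these assumptions the gain equals the number of angles in the sequence, which matches the abstract's statement that the maximum detectable velocity increases by a factor of the transmit sequence length.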
Fine-tuning gene networks using simple sequence repeats

PubMed Central

Egbert, Robert G.; Klavins, Eric

2012-01-01

The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Limitations of variable number of tandem repeat typing identified through whole genome sequencing of Mycobacterium avium subsp. paratuberculosis on a national and herd level.

PubMed

Ahlstrom, Christina; Barkema, Herman W; Stevenson, Karen; Zadoks, Ruth N; Biek, Roman; Kao, Rowland; Trewby, Hannah; Haupstein, Deb; Kelton, David F; Fecteau, Gilles; Labrecque, Olivia; Keefe, Greg P; McKenna, Shawn L B; De Buck, Jeroen

2015-03-08

Mycobacterium avium subsp. paratuberculosis (MAP), the causative bacterium of Johne's disease in dairy cattle, is widespread in the Canadian dairy industry and has significant economic and animal welfare implications. An understanding of the population dynamics of MAP can be used to identify introduction events, improve control efforts and target transmission pathways, although this requires an adequate understanding of MAP diversity and distribution between herds and across the country. Whole genome sequencing (WGS) offers a detailed assessment of the SNP-level diversity and genetic relationship of isolates, whereas several molecular typing techniques used to investigate the molecular epidemiology of MAP, such as variable number of tandem repeat (VNTR) typing, target relatively unstable repetitive elements in the genome that may be too unpredictable to draw accurate conclusions. The objective of this study was to evaluate the diversity of bovine MAP isolates in Canadian dairy herds using WGS and then determine if VNTR typing can distinguish truly related and unrelated isolates. Phylogenetic analysis based on 3,039 SNPs identified through WGS of 124 MAP isolates identified eight genetically distinct subtypes in dairy herds from seven Canadian provinces, with the dominant type including over 80% of MAP isolates. VNTR typing of 527 MAP isolates identified 12 types, including "bison type" isolates, from seven different herds. At a national level, MAP isolates differed from each other by 1-2 to 239-240 SNPs, regardless of whether they belonged to the same or different VNTR types. A herd-level analysis of MAP isolates demonstrated that VNTR typing may both over-estimate and under-estimate the relatedness of MAP isolates found within a single herd. The presence of multiple MAP subtypes in Canada suggests multiple introductions into the country including what has now become one dominant type, an important finding for Johne's disease control. VNTR typing often failed to
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Because these endophytes are known for the beneficial characteristics they endow upon their grass hosts, their identification has been of great interest agronomically and scientifically. Simple sequence repeat loci and the variation in their repeat elements have been used to rapidly identify endophyte species and strains; however, little is known of how the structure of repeat elements changes between species and strains, or where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
Screening of repetitive motifs inside the genome of the flat oyster (Ostrea edulis): Transposable elements and short tandem repeats.

PubMed

Vera, Manuel; Bello, Xabier; Álvarez-Dios, Jose-Antonio; Pardo, Belen G; Sánchez, Laura; Carlsson, Jens; Carlsson, Jeanette E L; Bartolomé, Carolina; Maside, Xulio; Martinez, Paulino

2015-12-01

The flat oyster (Ostrea edulis) is one of the most appreciated molluscs in Europe, but its production has been greatly reduced by the parasite Bonamia ostreae. Here, new generation genomic resources were used to analyse the repetitive fraction of the oyster genome, with the aim of developing molecular markers to face this main oyster production challenge. The resulting oyster database consists of two sets of 10,318 and 7159 unique contigs (4.8 Mbp and 6.8 Mbp in total length) representing the oyster's genome (WG) and haemocyte transcriptome (HT), respectively. A total of 1083 sequences were identified as TE-derived, which corresponded to 4.0% of WG and 1.1% of HT. They were clustered into 142 homology groups, most of which were assigned to the Penelope order of retrotransposons, and to the Helitron and TIR DNA-transposons. Simple repeats and rRNA pseudogenes also made a significant contribution to the oyster's genome (0.5% and 0.3% of WG and HT, respectively). The most frequent short tandem repeats identified in WG were tetranucleotide motifs, whereas trinucleotide motifs were the most frequent in HT. Forty identified microsatellite loci, 20 from each database, were selected for technical validation. Success was much lower among WG than HT microsatellites (15% vs 55%), which could reflect higher variation in anonymous regions interfering with primer annealing. All microsatellites developed adjusted to Hardy-Weinberg proportions and represent a useful tool to support future breeding programmes and to manage genetic resources of natural flat oyster beds. Copyright © 2015 Elsevier B.V. All rights reserved.
The Peculiar Landscape of Repetitive Sequences in the Olive (Olea europaea L.) Genome

PubMed Central

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-01-01

Analyzing genome structure in different species allows to gain an insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length higher than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome. PMID:24671744
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

PubMed

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species allows to gain an insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length higher than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
Centromere and telomere sequence alterations reflect the rapid genome evolution within the carnivorous plant genus Genlisea.

PubMed

Tran, Trung D; Cao, Hieu X; Jovtchev, Gabriele; Neumann, Pavel; Novák, Petr; Fojtová, Miloslava; Vu, Giang T H; Macas, Jiří; Fajkus, Jiří; Schubert, Ingo; Fuchs, Joerg

2015-12-01

Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
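As a rough illustration of how the telomeric motifs named in the abstract above (TTTAGGG, TTCAGG, TTTCAGG) might be looked for in sequence data, the sketch below counts the longest uninterrupted run of a given motif. The function name and the toy sequence are hypothetical. The sketch only detects pure runs of one motif, so intermingled variants such as those reported for G. hispidula would need a more tolerant approach.

```python
import re


def longest_tandem_run(seq: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` found in `seq`."""
    # (?:motif)+ matches one or more consecutive copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    return max(
        (len(m.group(0)) // len(motif) for m in pattern.finditer(seq)),
        default=0,  # no occurrence at all
    )


# Toy sequence (assumed): 4 x TTTAGGG, then 2 x TTCAGG, then 3 x TTTCAGG.
toy = "ACGT" + "TTTAGGG" * 4 + "CC" + "TTCAGG" * 2 + "TTTCAGG" * 3 + "A"

for motif in ("TTTAGGG", "TTCAGG", "TTTCAGG"):
    print(motif, longest_tandem_run(toy, motif))
```

Reads from the reverse strand would carry the reverse-complement motif, so a practical screen would usually test both orientations.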
Comparative Chloroplast Genomics of Gossypium Species: Insights Into Repeat Sequence Variations and Phylogeny

PubMed Central

Wu, Ying; Liu, Fang; Yang, Dai-Gang; Li, Wei; Zhou, Xiao-Jian; Pei, Xiao-Yu; Liu, Yan-Gai; He, Kun-Lun; Zhang, Wen-Sheng; Ren, Zhong-Ying; Zhou, Ke-Hai; Ma, Xiong-Feng; Li, Zhong-Hu

2018-01-01

Cotton is one of the most economically important fiber crop plants worldwide. The genus Gossypium contains a single allotetraploid group (AD) and eight diploid genome groups (A–G and K). However, the evolution of repeat sequences in the chloroplast genomes and the phylogenetic relationships of Gossypium species are unclear. Thus, we determined the variations in the repeat sequences and the evolutionary relationships of 40 cotton chloroplast genomes, which represented the most diverse in the genus, including five newly sequenced diploid species, i.e., G. nandewarense (C1-n), G. armourianum (D2-1), G. lobatum (D7), G. trilobum (D8), and G. schwendimanii (D11), and an important semi-wild race of upland cotton, G. hirsutum race latifolium (AD1). The genome structure, gene order, and GC content of cotton species were similar to those of other higher plant plastid genomes. In total, 2860 long sequence repeats (>10 bp in length) were identified, where the F-genome species had the largest number of repeats (G. longicalyx F1: 108) and E-genome species had the lowest (G. stocksii E1: 53). Large-scale repeat sequences possibly enrich the genetic information and maintain genome stability in cotton species. We also identified 10 divergence hotspot regions, i.e., rpl33-rps18, psbZ-trnG (GCC), rps4-trnT (UGU), trnL (UAG)-rpl32, trnE (UUC)-trnT (GGU), atpE, ndhI, rps2, ycf1, and ndhF, which could be useful molecular genetic markers for future population genetics and phylogenetic studies. Site-specific selection analysis showed that some of the coding sites of 10 chloroplast genes (atpB, atpE, rps2, rps3, petB, petD, ccsA, cemA, ycf1, and rbcL) were under protein sequence evolution. Phylogenetic analysis based on the whole plastomes suggested that the Gossypium species grouped into six previously identified genetic clades. Interestingly, all 13 D-genome species clustered into a strong monophyletic clade. Unexpectedly, the cotton species with C, G, and K-genomes were admixed and
DeNovoGUI: An Open Source Graphical User Interface for de Novo Sequencing of Tandem Mass Spectra

PubMed Central

2013-01-01

De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. We here present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissible Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com. PMID:24295440
DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra.

PubMed

Muth, Thilo; Weilnböck, Lisa; Rapp, Erdmann; Huber, Christian G; Martens, Lennart; Vaudel, Marc; Barsnes, Harald

2014-02-07

De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. We here present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissible Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com .

Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage.

PubMed

Dvořáková, Zuzana; Vorlíčková, Michaela; Renčiuk, Daniel

2017-11-01

The DNA lesions, resulting from oxidative damage, were shown to destabilize human telomere four-repeat quadruplex and to alter its structure. Long telomere DNA, as a repetitive sequence, offers, however, other mechanisms of dealing with the lesion: extrusion of the damaged repeat into loop or shifting the quadruplex position by one repeat. Using circular dichroism and UV absorption spectroscopy and polyacrylamide electrophoresis, we studied consequences of lesions at different positions of the model five-repeat human telomere DNA sequences on the structure and stability of their quadruplexes in sodium and in potassium. The repeats affected by lesion are preferentially positioned as terminal overhangs of the core quadruplex structurally similar to the four-repeat one. Forced affecting of the inner repeats leads to presence of variety of more parallel folds in potassium. In sodium the designed models form mixture of two dominant antiparallel quadruplexes whose population varies with the position of the affected repeat. The shapes of quadruplex CD spectra, namely the height of dominant peaks, significantly correlate with melting temperatures. Lesion in one guanine tract of a more than four repeats long human telomere DNA sequence may cause re-positioning of its quadruplex arrangement associated with a shift of the structure to less common quadruplex conformations. The type of the quadruplex depends on the loop position and external conditions. The telomere DNA quadruplexes are quite resistant to the effect of point mutations due to the telomere DNA repetitive nature, although their structure and, consequently, function might be altered. Copyright © 2017. Published by Elsevier B.V.
Divergence, differential methylation and interspersion of melon satellite DNA sequences.

PubMed Central

Shmookler Reis, R; Timmis, J N; Ingle, J

1981-01-01

Melon (Cucumis melo) satellite DNA consists of two components, Q and S, each with a buoyant density in CsCl of 1.707 g/ml, but differing by 9 degrees C in "melting" temperature. These physical properties appear to be in contradiction, since both depend on G + C content. In order to resolve this anomaly, base compositions were directly determined for isolated fractions. The low-"melting" component S contains 41.8% G + C, with 6% of C present as 5-methylcytosine, whereas Q DNA contains 54% G + C, with 41% of C methylated. Analyses of restriction site loss agreed well with the direct determinations of methylation and divergence, and indicated some clustering of methylated sites in Q DNA. Analysis of restricted main-band DNA by hybridization with RNA complementary to Q satellite DNA ("Southern transfer") showed satellite Q tandem arrays interspersed in DNA of main-band density. Sequence divergence and extent of methylation did not appear to depend on whether a repeat array was present as satellite or interspersed in main-band DNA. Hybridization in situ indicated considerable heterogeneity in the genomic proportion of the Q-DNA sequences in melon fruit nuclei, implying over- and under-representation consistent with extensive unequal recombination in satellite Q tandem arrays. The cucumber, Cucumis sativus, contains less than 8% as much Q-homologous DNA per genome as the melon, suggesting rapid evolutionary gain or loss of these tandem repeat sequences. PMID:6172117
A further analysis of the relationship between yellow ripe-fruit color and the capsanthin-capsorubin synthase gene in pepper (Capsicum sp.) indicated a new mutant variant in C. annuum and a tandem repeat structure in promoter region.

PubMed

Li, Zheng; Wang, Shu; Gui, Xiao-Ling; Chang, Xiao-Bei; Gong, Zhen-Hui

2013-01-01

Mature pepper (Capsicum sp.) fruits come in a variety of colors, including red, orange, yellow, brown, and white. To better understand the genetic and regulatory relationships between the yellow fruit phenotype and the capsanthin-capsorubin synthase gene (Ccs), we examined 156 Capsicum varieties, most of which were collected from Northwest Chinese landraces. A new ccs variant was identified in the yellow fruit cultivar CK7. Cluster analysis revealed that CK7, which belongs to the C. annuum species, has low genetic similarity to other yellow C. annuum varieties. In the coding sequence of this ccs allele, we detected a premature stop codon derived from a C to G change, as well as a downstream frame-shift caused by a 1-bp nucleotide deletion. In addition, the expression of the gene was detected in mature CK7 fruit. Furthermore, the promoter sequences of Ccs from some pepper varieties were examined, and we detected a 176-bp tandem repeat sequence in the promoter region. In all C. annuum varieties examined in this study, the repeat number was three, compared with four in two C. chinense accessions. The sequence similarity ranged from 84.8% to 97.7% among the four types of repeats, and some putative cis-elements were also found in every repeat. This suggests that the transcriptional regulation of Ccs expression is complex. Based on the analysis of the novel C. annuum mutation reported here, along with the studies of three mutation types in yellow C. annuum and C. chinense accessions, we suggest that the mechanism leading to the production of yellow color fruit may be not as complex as that leading to orange fruit production.
A Further Analysis of the Relationship between Yellow Ripe-Fruit Color and the Capsanthin-Capsorubin Synthase Gene in Pepper (Capsicum sp.) Indicated a New Mutant Variant in C. annuum and a Tandem Repeat Structure in Promoter Region

PubMed Central

Gui, Xiao-Ling; Chang, Xiao-Bei; Gong, Zhen-Hui

2013-01-01

Mature pepper (Capsicum sp.) fruits come in a variety of colors, including red, orange, yellow, brown, and white. To better understand the genetic and regulatory relationships between the yellow fruit phenotype and the capsanthin-capsorubin synthase gene (Ccs), we examined 156 Capsicum varieties, most of which were collected from Northwest Chinese landraces. A new ccs variant was identified in the yellow fruit cultivar CK7. Cluster analysis revealed that CK7, which belongs to the C. annuum species, has low genetic similarity to other yellow C. annuum varieties. In the coding sequence of this ccs allele, we detected a premature stop codon derived from a C to G change, as well as a downstream frame-shift caused by a 1-bp nucleotide deletion. In addition, the expression of the gene was detected in mature CK7 fruit. Furthermore, the promoter sequences of Ccs from some pepper varieties were examined, and we detected a 176-bp tandem repeat sequence in the promoter region. In all C. annuum varieties examined in this study, the repeat number was three, compared with four in two C. chinense accessions. The sequence similarity ranged from 84.8% to 97.7% among the four types of repeats, and some putative cis-elements were also found in every repeat. This suggests that the transcriptional regulation of Ccs expression is complex. Based on the analysis of the novel C. annuum mutation reported here, along with the studies of three mutation types in yellow C. annuum and C. chinense accessions, we suggest that the mechanism leading to the production of yellow color fruit may be not as complex as that leading to orange fruit production. PMID:23637942
Protein arginine methyltransferase 7 has a novel homodimer-like structure formed by tandem repeats.

PubMed

Hasegawa, Morio; Toma-Fukai, Sachiko; Kim, Jun-Dal; Fukamizu, Akiyoshi; Shimizu, Toshiyuki

2014-05-21

Protein arginine methyltransferase 7 (PRMT7) is a member of a family of enzymes that catalyze the transfer of methyl groups from S-adenosyl-l-methionine to nitrogen atoms on arginine residues. Here, we describe the crystal structure of Caenorhabditis elegans PRMT7 in complex with its reaction product S-adenosyl-L-homocysteine. The structural data indicated that PRMT7 harbors two tandem repeated PRMT core domains that form a novel homodimer-like structure. S-adenosyl-L-homocysteine bound to the N-terminal catalytic site only; the C-terminal catalytic site is occupied by a loop that inhibits cofactor binding. Mutagenesis demonstrated that only the N-terminal catalytic site of PRMT7 is responsible for cofactor binding. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
Comparison of seven techniques for typing international epidemic strains of Clostridium difficile: restriction endonuclease analysis, pulsed-field gel electrophoresis, PCR-ribotyping, multilocus sequence typing, multilocus variable-number tandem-repeat analysis, amplified fragment length polymorphism, and surface layer protein A gene sequence typing.

PubMed

Killgore, George; Thompson, Angela; Johnson, Stuart; Brazier, Jon; Kuijper, Ed; Pepin, Jacques; Frost, Eric H; Savelkoul, Paul; Nicholson, Brad; van den Berg, Renate J; Kato, Haru; Sambol, Susan P; Zukowski, Walter; Woods, Christopher; Limbago, Brandi; Gerding, Dale N; McDonald, L Clifford

2008-02-01

Using 42 isolates contributed by laboratories in Canada, The Netherlands, the United Kingdom, and the United States, we compared the results of analyses done with seven Clostridium difficile typing techniques: multilocus variable-number tandem-repeat analysis (MLVA), amplified fragment length polymorphism (AFLP), surface layer protein A gene sequence typing (slpAST), PCR-ribotyping, restriction endonuclease analysis (REA), multilocus sequence typing (MLST), and pulsed-field gel electrophoresis (PFGE). We assessed the discriminating ability and typeability of each technique as well as the agreement among techniques in grouping isolates by allele profile A (AP-A) through AP-F, which are defined by toxinotype, the presence of the binary toxin gene, and deletion in the tcdC gene. We found that all isolates were typeable by all techniques and that discrimination index scores for the techniques tested ranged from 0.964 to 0.631 in the following order: MLVA, REA, PFGE, slpAST, PCR-ribotyping, MLST, and AFLP. All the techniques were able to distinguish the current epidemic strain of C. difficile (BI/027/NAP1) from other strains. All of the techniques showed multiple types for AP-A (toxinotype 0, binary toxin negative, and no tcdC gene deletion). REA, slpAST, MLST, and PCR-ribotyping all included AP-B (toxinotype III, binary toxin positive, and an 18-bp deletion in tcdC) in a single group that excluded other APs. PFGE, AFLP, and MLVA grouped two, one, and two different non-AP-B isolates, respectively, with their AP-B isolates. All techniques appear to be capable of detecting outbreak strains, but only REA and MLVA showed sufficient discrimination to distinguish strains from different outbreaks.
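The abstract above reports discrimination index scores without giving the formula. A common choice in strain typing is Simpson's index of diversity as applied by Hunter and Gaston, D = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where n_j is the number of isolates of type j and N is the total. The sketch below assumes that definition; the function name and the example type labels are hypothetical and are not data from the study.

```python
from collections import Counter


def discrimination_index(type_labels: list[str]) -> float:
    """Simpson's index of diversity: probability that two isolates drawn
    at random without replacement are assigned to different types."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical typing results for the same six isolates under two methods.
fine_method = ["t1", "t2", "t3", "t4", "t5", "t5"]
coarse_method = ["A", "A", "A", "B", "B", "B"]

print(discrimination_index(fine_method))
print(discrimination_index(coarse_method))
```

A method that splits the isolates into more, smaller groups scores closer to 1, which is the sense in which the abstract ranks the seven techniques.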
[Usefulness of the variable numbers of tandem repeats (VNTR) analysis for complex infections of Mycobacterium avium and Mycobacterium intracellulare].

PubMed

Tsunematsu, Noriko; Goto, Mieko; Saiki, Yumiko; Baba, Michiko; Udagawa, Tadashi; Kazumi, Yuko

2008-09-01

Bacilli isolated from a patient suspected of mixed infection with Mycobacterium avium and Mycobacterium intracellulare were analyzed. The genotypes of M. avium in the sedimented fractions of treated sputum and in some colonies isolated from Ogawa medium were compared by variable numbers of tandem repeats (VNTR) analysis. The patient was a 57-year-old woman. Mycobacterial species isolated from some colonies cultured in 2004 and 2006 and from the treated sputum in 2006 were determined by DNA sequencing analysis of the 16S rRNA gene, and the genotypes of the mycobacteria were analyzed by VNTR. [Results] (1) The colony isolated from Ogawa medium in 2004 was monoclonal M. avium. (2) VNTR analyses of specimens from 2006 found multiple acid-fast bacteria in the sputum sediment and among bacteria isolated from Ogawa medium. (3) Analyses of the 16S rRNA gene sequence found M. avium and M. intracellulare in the colonies isolated from the sputum sediment and the Ogawa medium in 2006. (4) The same VNTR patterns were obtained for M. avium in 2004 and 2006 when a single colony was analyzed. (5) M. avium was not detected in the showerhead or culvert of the bathroom in the patient's house. The VNTR analyses suggested that the mixed infection with M. avium and M. intracellulare had arisen during treatment in this case. VNTR analysis would therefore be a useful genotyping method in cases of suspected mixed M. avium complex infection.
Tandem alternative polyadenylation events of genes in non-eosinophilic nasal polyp tissue identified by high-throughput sequencing analysis

PubMed Central

TIAN, PENG; LI, JIE; LIU, XIANG; LI, YUXI; CHEN, MEIHENG; MA, YUN; ZHENG, YI QING; FU, YONGGUI; ZOU, HUA

2014-01-01

Nasal polyps (NP) are highly associated with disorders of immune cells. Alternative polyadenylation (APA) produces mRNA isoforms with different lengths of the 3′-untranslated region (UTR) and regulates gene expression. It has been proven that this APA-mediated regulation of 3′UTR length is an immune-associated phenomenon. The aim of this study was to investigate the genome-wide alternative tandem 3′UTR length switching events in non-eosinophilic nasal polyp tissue. Thirteen patients diagnosed as having non-eosinophilic nasal polyps were included in this study. Nasal polyp tissue and control mucosa were collected during surgery. The 3′ end library of cDNA was constructed. The recovered libraries were sequenced with second-generation sequencing technology, and the sequencing data were analyzed by an in-house bioinformatics pipeline. Tandem 3′UTR length switching between samples was detected by a test of linear trend alternative to independence. We found a significant alteration in the tandem 3′UTR length in 1,920 genes in nasal polyp samples. Functional annotation results showed that several gene ontology (GO) terms were enriched in the list of genes with switched APA sites, including regulation of transcription, macromolecule catabolic localization and mRNA processing. The results suggested that APA-mediated alternative 3′UTR regulation plays an important role in the post-transcriptional regulation of gene expression in non-eosinophilic nasal polyps. PMID:24715051
Transferability of short tandem repeat markers for two wild Canid species inhabiting the Brazilian Cerrado.

PubMed

Rodrigues, F M; Telles, M P C; Resende, L V; Soares, T N; Diniz-Filho, J A F; Jácomo, A T A; Silveira, L

2006-12-13

The maned wolf (Chrysocyon brachyurus) and the crab-eating fox (Cerdocyon thous) are two wild-canid species found in the Brazilian Cerrado. We tested cross-amplification and transferability of 29 short tandem repeat primers originally developed for cattle and domestic dogs and cats on 38 individuals of each of these two species, collected in the Emas National Park, which is the largest national park in the Cerrado region. Six of these primers were successfully transferred (CSSM-038, PEZ-05, PEZ-12, LOCO-13, LOCO-15, and PEZ-20), five of which were found to be polymorphic. Genetic parameter values (number of alleles per locus, observed and expected heterozygosities, and fixation indices) were within the expected range reported for canid populations worldwide.
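The genetic parameters named in the abstract above (number of alleles per locus, observed and expected heterozygosity, fixation index) can be illustrated with a short sketch. It assumes diploid genotypes at a single locus, the simple expected heterozygosity He = 1 - sum p_i^2 without a small-sample correction, and the fixation index F = 1 - Ho/He. The function name and the allele sizes are hypothetical and are not taken from the study.

```python
from collections import Counter


def locus_summary(genotypes):
    """Return (n_alleles, Ho, He, F) for one short tandem repeat locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = sum(allele_counts.values())
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected).
    he = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # Fixation index; undefined when the locus is monomorphic (He == 0).
    f = 1.0 - ho / he if he > 0 else float("nan")
    return len(allele_counts), ho, he, f


# Hypothetical allele sizes (bp) for four individuals (illustration only).
example = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(locus_summary(example))
```

A positive F indicates fewer heterozygotes than expected, and a negative F indicates an excess.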
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

PubMed Central

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-01-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163
Whole-genome sequencing in patients with ciliopathies uncovers a novel recurrent tandem duplication in IFT140.

PubMed

Geoffroy, Véronique; Stoetzel, Corinne; Scheidecker, Sophie; Schaefer, Elise; Perrault, Isabelle; Bär, Séverine; Kröll, Ariane; Delbarre, Marion; Antin, Manuela; Leuvrey, Anne-Sophie; Henry, Charline; Blanché, Hélène; Decker, Eva; Kloth, Katja; Klaus, Günter; Mache, Christoph; Martin-Coignard, Dominique; McGinn, Steven; Boland, Anne; Deleuze, Jean-François; Friant, Sylvie; Saunier, Sophie; Rozet, Jean-Michel; Bergmann, Carsten; Dollfus, Hélène; Muller, Jean

2018-04-24

Ciliopathies represent a wide spectrum of rare diseases with overlapping phenotypes and a high genetic heterogeneity. Among those, IFT140 is implicated in a variety of phenotypes ranging from isolated retinitis pigmentosa to more syndromic cases. Using whole-genome sequencing in patients with uncharacterized ciliopathies, we identified a novel recurrent tandem duplication of exons 27-30 (6.7 kb) in IFT140, c.3454-488_4182+2588dup p.(Tyr1152_Thr1394dup), missed by whole-exome sequencing. Pathogenicity of the mutation was assessed on the patients' skin fibroblasts. Several hundred patients with a ciliopathy phenotype were screened and biallelic mutations were identified in 11 families representing 12 pathogenic variants, of which seven are novel. Among those unrelated families, especially those with Mainzer-Saldino syndrome, eight carried the same tandem duplication (two in the homozygous state and six in the heterozygous state). In conclusion, we demonstrated the implication of structural variations in IFT140-related diseases, expanding its mutation spectrum. We also provide evidence for a unique genomic event mediated by an Alu-Alu recombination occurring on a shared haplotype. We confirm that whole-genome sequencing can be instrumental in the ability to detect structural variants for genomic disorders. © 2018 Wiley Periodicals, Inc.
Variable-number-of-tandem-repeats analysis of genetic diversity in Pasteuria ramosa.

PubMed

Mouton, L; Ebert, D

2008-05-01

Variable-number-of-tandem-repeats (VNTR) markers are increasingly being used in population genetic studies of bacteria. They were recently developed for Pasteuria ramosa, an endobacterium that infects Daphnia species. In the present study, we genotyped P. ramosa in 18 infected hosts from the United Kingdom, Belgium, and two lakes in the United States using seven VNTR markers. Two Daphnia species were collected: D. magna and D. dentifera. Six loci showed length polymorphism, with as many as five alleles identified for a single locus. Similarity coefficient calculations showed that the extent of genetic variation between pairs of isolates within populations differed according to the population, but it was always less than the genetic distances among populations. Analysis of the genetic distances performed using principal component analysis revealed strong clustering by location of origin, but not by host Daphnia species. Our study demonstrated that the VNTR markers available for P. ramosa are informative in revealing genetic differences within and among populations and may therefore become an important tool for providing detailed analysis of population genetics and epidemiology.
PGLa-H tandem-repeat peptides active against multidrug resistant clinical bacterial isolates.

PubMed

Rončević, Tomislav; Gajski, Goran; Ilić, Nada; Goić-Barišić, Ivana; Tonkić, Marija; Zoranić, Larisa; Simunić, Juraj; Benincasa, Monica; Mijaković, Marijana; Tossi, Alessandro; Juretić, Davor

2017-02-01

Antimicrobial peptides (AMPs) are promising candidates for new antibiotic classes but often display an unacceptably high toxicity towards human cells. A naturally produced C-terminal fragment of PGLa, named PGLa-H, has been reported to have a very low haemolytic activity while maintaining a moderate antibacterial activity. A sequential tandem repeat of this fragment, diPGLa-H, was designed, as well as an analogue with a Val to Gly substitution at a key position. These peptides showed markedly improved in vitro bacteriostatic and bactericidal activity against both reference strains and multidrug resistant clinical isolates of Gram-negative and Gram-positive pathogens, with generally low toxicity for human cells as assessed by haemolysis, cell viability, and DNA damage assays. The glycine substitution analogue, kiadin, had a slightly better antibacterial activity and reduced haemolytic activity, which may correlate with an increased flexibility of its helical structure, as deduced using molecular dynamics simulations. These peptides may serve as useful lead compounds for developing anti-infective agents against resistant Gram-negative and Gram-positive species. Copyright © 2016 Elsevier B.V. All rights reserved.
Clustering of Tuberculosis Cases Based on Variable-Number Tandem-Repeat Typing in Relation to the Population Structure of Mycobacterium tuberculosis in the Netherlands

PubMed Central

Sloot, Rosa; Borgdorff, Martien W.; de Beer, Jessica L.; van Ingen, Jakko; Supply, Philip

2013-01-01

The population structure of 3,776 Mycobacterium tuberculosis isolates was determined using variable-number tandem-repeat (VNTR) typing. The degree of clonality was so high that a more relaxed definition of clustering cannot be applied. Among recent immigrants with non-Euro-American isolates, transmission is overestimated if based on identical VNTR patterns. PMID:23658260
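The abstract above refers to clustering based on identical VNTR patterns. A minimal sketch of that strict definition follows: isolates are grouped when their repeat counts match at every locus, and the clustered fraction is the share of isolates that fall in any such group. The function names, isolate identifiers and three-locus patterns are hypothetical. Real VNTR typing schemes use more loci, and the study's own clustering criteria are not reproduced here.

```python
from collections import defaultdict


def cluster_by_identical_pattern(isolates):
    """Group isolates whose VNTR patterns are identical at every locus.

    isolates: dict mapping isolate id -> sequence of repeat counts per locus.
    Returns a list of clusters (lists of ids) containing two or more isolates.
    """
    groups = defaultdict(list)
    for isolate_id, pattern in isolates.items():
        groups[tuple(pattern)].append(isolate_id)
    return [ids for ids in groups.values() if len(ids) > 1]


def clustered_fraction(isolates):
    """Fraction of isolates that share their exact pattern with another isolate."""
    clusters = cluster_by_identical_pattern(isolates)
    return sum(len(ids) for ids in clusters) / len(isolates)


# Hypothetical three-locus patterns (illustration only).
example = {
    "iso1": (2, 4, 3),
    "iso2": (2, 4, 3),
    "iso3": (2, 5, 3),
    "iso4": (1, 4, 3),
}
print(cluster_by_identical_pattern(example))
print(clustered_fraction(example))
```

Relaxing the definition, for example by also grouping patterns that differ at one locus, would enlarge the clusters. The abstract reports that this relaxation cannot be applied when clonality is as high as it was in this population.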
Peptides derivatized with bicyclic quaternary ammonium ionization tags. Sequencing via tandem mass spectrometry.

PubMed

Setner, Bartosz; Rudowska, Magdalena; Klem, Ewelina; Cebrat, Marek; Szewczuk, Zbigniew

2014-10-01

Improving the sensitivity of detection and fragmentation of peptides to provide reliable sequencing of peptides is an important goal of mass spectrometric analysis. Peptides derivatized with bicyclic quaternary ammonium ionization tags, 1-azabicyclo[2.2.2]octane (ABCO) or 1,4-diazabicyclo[2.2.2]octane (DABCO), are characterized by an increased detection sensitivity in electrospray ionization mass spectrometry (ESI-MS) and longer retention times on reverse-phase (RP) chromatography columns. The improvement of the detection limit was observed even for peptides dissolved in 10 mM NaCl. Collision-induced dissociation tandem mass spectrometry of quaternary ammonium salt derivatives of peptides showed dominant a- and b-type ions, allowing facile sequencing of peptides. The bicyclic ionization tags are stable in collision-induced dissociation experiments, and the resulting fragmentation pattern is not significantly influenced by either acidic or basic amino acid residues in the peptide sequence. The results obtained indicate the general usefulness of the bicyclic quaternary ammonium ionization tags for ESI-MS/MS sequencing of peptides. Copyright © 2014 John Wiley & Sons, Ltd.
Single-cell forensic short tandem repeat typing within microfluidic droplets.

PubMed

Geng, Tao; Novak, Richard; Mathies, Richard A

2014-01-07

A short tandem repeat (STR) typing method is developed for forensic identification of individual cells. In our strategy, monodisperse 1.5 nL agarose-in-oil droplets are produced with a high frequency using a microfluidic droplet generator. Statistically dilute single cells, along with primer-functionalized microbeads, are randomly compartmentalized in the droplets. Massively parallel single-cell droplet polymerase chain reaction (PCR) is performed to transfer replicas of desired STR targets from the single-cell genomic DNA onto the coencapsulated microbeads. These DNA-conjugated beads are subsequently harvested and reamplified under statistically dilute conditions for conventional capillary electrophoresis (CE) STR fragment size analysis. The 9-plex STR profiles of single cells from both pure and mixed populations of GM09947 and GM09948 human lymphoid cells show that all alleles are correctly called and allelic drop-in/drop-out is not observed. The cell mixture study exhibits a good linear relationship between the observed and input cell ratios in the range of 1:1 to 10:1. Additionally, the STR profile of GM09947 cells could be deduced even in the presence of a high concentration of cell-free contaminating 9948 genomic DNA. Our method will be valuable for the STR analysis of samples containing mixtures of cells/DNA from multiple contributors and for low-concentration samples.
Simple sequence repeat markers that identify Claviceps species and strains

USDA-ARS's Scientific Manuscript database

Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop Simple Sequence Repeat (SSR) markers for rapid identification of C. purpurea. SSRs were desi...
The Effective Mutation Rate at Y Chromosome Short Tandem Repeats, with Application to Human Population-Divergence Time

PubMed Central

Zhivotovsky, Lev A.; Underhill, Peter A.; Cinnioğlu, Cengiz; Kayser, Manfred; Morar, Bharti; Kivisild, Toomas; Scozzari, Rosaria; Cruciani, Fulvio; Destro-Bisol, Giovanni; Spedini, Gabriella; Chambers, Geoffrey K.; Herrera, Rene J.; Yong, Kiau Kiun; Gresham, David; Tournev, Ivailo; Feldman, Marcus W.; Kalaydjieva, Luba

2004-01-01

We estimate an effective mutation rate at an average Y chromosome short-tandem repeat locus as 6.9×10⁻⁴ per 25 years, with a standard deviation across loci of 5.7×10⁻⁴, using data on microsatellite variation within Y chromosome haplogroups defined by unique-event polymorphisms in populations with documented short-term histories, as well as comparative data on worldwide populations at both the Y chromosome and various autosomal loci. This value is used to estimate the times of the African Bantu expansion, the divergence of Polynesian populations (the Maoris, Cook Islanders, and Samoans), and the origin of Gypsy populations from Bulgaria. PMID:14691732
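The abstract above gives a rate but not the estimator used to turn it into a date. One common way to apply a per-generation rate is sketched below, under the assumption of a stepwise mutation model in which the average squared difference (ASD) in repeat number between sampled haplotypes and a founder haplotype grows linearly with time, so that time in generations is roughly ASD divided by the rate. This is an illustration only and not necessarily the authors' exact procedure. The function names, the founder haplotype and the sample are hypothetical. Only the rate of 6.9×10⁻⁴ per 25-year generation comes from the abstract.

```python
def asd_to_founder(haplotypes, founder):
    """Average squared difference in repeat number from a founder haplotype.

    haplotypes: list of tuples of repeat counts, one tuple per chromosome.
    founder:    tuple of repeat counts assumed for the founding haplotype.
    The squared differences are averaged over chromosomes, then over loci.
    """
    per_locus = []
    for locus, founder_count in enumerate(founder):
        squared = [(h[locus] - founder_count) ** 2 for h in haplotypes]
        per_locus.append(sum(squared) / len(squared))
    return sum(per_locus) / len(per_locus)


def divergence_time_years(haplotypes, founder,
                          rate_per_generation=6.9e-4,
                          years_per_generation=25):
    """Rough age estimate: ASD / rate gives generations, then convert to years."""
    generations = asd_to_founder(haplotypes, founder) / rate_per_generation
    return generations * years_per_generation


# Hypothetical two-locus haplotypes (illustration only).
founder = (10, 12)
sample = [(10, 12), (11, 12), (10, 13), (9, 12)]
print(asd_to_founder(sample, founder))
print(round(divergence_time_years(sample, founder)))
```

Because the standard deviation of the rate across loci (5.7×10⁻⁴) is almost as large as the rate itself, any date obtained this way carries wide uncertainty.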
Clostridium botulinum group I strain genotyping by 15-locus multilocus variable-number tandem-repeat analysis.

PubMed

Fillo, Silvia; Giordani, Francesco; Anniballi, Fabrizio; Gorgé, Olivier; Ramisse, Vincent; Vergnaud, Gilles; Riehm, Julia M; Scholz, Holger C; Splettstoesser, Wolf D; Kieboom, Jasper; Olsen, Jaran-Strand; Fenicia, Lucia; Lista, Florigio

2011-12-01

Clostridium botulinum is a taxonomic designation that encompasses a broad variety of spore-forming, Gram-positive bacteria producing the botulinum neurotoxin (BoNT). C. botulinum is the etiologic agent of botulism, a rare but severe neuroparalytic disease. Fine-resolution genetic characterization of C. botulinum isolates of any BoNT type is relevant for both epidemiological studies and forensic microbiology. A 10-locus multiple-locus variable-number tandem-repeat analysis (MLVA) was previously applied to isolates of C. botulinum type A. The present study includes five additional loci designed to better address proteolytic B and F serotypes. We investigated 79 C. botulinum group I strains isolated from human and food samples in several European countries, including types A (28), B (36), AB (4), and F (11) strains, and 5 nontoxic Clostridium sporogenes. Additional data were deduced from in silico analysis of 10 available fully sequenced genomes. This 15-locus MLVA (MLVA-15) scheme identified 86 distinct genotypes that clustered consistently with the results of amplified fragment length polymorphism (AFLP) and MLVA genotyping in previous reports. An MLVA-7 scheme, a subset of the MLVA-15, performed on a lab-on-a-chip device using a nonfluorescent subset of primers, is also proposed as a first-line assay. The phylogenetic grouping obtained with the MLVA-7 does not differ significantly from that generated by the MLVA-15. To our knowledge, this report is the first to analyze genetic variability among all of the C. botulinum group I serotypes by MLVA. Our data provide new insights into the genetic variability of group I C. botulinum isolates worldwide and demonstrate that this group is genetically highly diverse.

Second generation subtyping: a proposed PulseNet protocol for multiple-locus variable-number tandem repeat analysis of Shiga toxin-producing Escherichia coli O157 (STEC O157).

PubMed

Hyytiä-Trees, Eija; Smole, Sandra C; Fields, Patricia A; Swaminathan, Bala; Ribot, Efrain M

2006-01-01

Most bacterial genomes contain tandem duplications of short DNA sequences, termed "variable-number tandem repeats" (VNTR). A subtyping method targeting these repeats, multiple-locus VNTR analysis (MLVA), has emerged as a powerful tool for characterization of clonal organisms such as Shiga toxin-producing Escherichia coli O157 (STEC O157). We modified and optimized a recently published MLVA scheme targeting 29 polymorphic VNTR regions of STEC O157 to render it suitable for routine use by public health laboratories that participate in PulseNet, the national and international molecular subtyping network for foodborne disease surveillance. Nine VNTR loci were included in the final protocol. They were amplified in three PCR reactions, after which the PCR products were sized using capillary electrophoresis. Two hundred geographically diverse, sporadic and outbreak-related STEC O157 isolates were characterized by MLVA and the results were compared with data obtained by pulsed-field gel electrophoresis (PFGE) using XbaI macrorestriction of genomic DNA. A total of 139 unique XbaI PFGE patterns and 162 MLVA types were identified. A subset of 100 isolates characterized by both XbaI and BlnI macrorestriction had 62 unique PFGE and MLVA types. Although the clustering of isolates by the two subtyping systems was generally in agreement, some discrepancies were observed. Importantly, MLVA was able to discriminate among some epidemiologically unrelated isolates which were indistinguishable by PFGE. However, among strains from three of the eight outbreaks included in the study, two single locus MLVA variants and one double locus variant were detected among epidemiologically implicated isolates that were indistinguishable by PFGE. Conversely, in three other outbreaks, isolates that were indistinguishable by MLVA displayed multiple PFGE types. An additional more extensive multi-laboratory validation of the MLVA protocol is in progress in order to address critical issues such as
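The single-locus and double-locus variants mentioned in the abstract above can be illustrated with a short sketch. It assumes an MLVA type is recorded as a tuple of repeat counts, one per locus in a fixed order, and that two isolates are called single- or double-locus variants when their profiles differ at exactly one or two loci. The function names and the nine-locus profiles are hypothetical and are not PulseNet data.

```python
def locus_differences(profile_a, profile_b):
    """Count the loci at which two MLVA profiles carry different repeat numbers."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def classify_pair(profile_a, profile_b):
    """Label a pair of isolates by how many loci separate their profiles."""
    labels = {
        0: "indistinguishable",
        1: "single-locus variant",
        2: "double-locus variant",
    }
    return labels.get(locus_differences(profile_a, profile_b), "different type")


# Hypothetical nine-locus profiles (illustration only).
reference = (10, 7, 4, 8, 6, 9, 12, 5, 11)
isolate_x = (10, 7, 4, 8, 6, 9, 13, 5, 11)
isolate_y = (10, 8, 4, 8, 6, 9, 13, 5, 11)
print(classify_pair(reference, isolate_x))
print(classify_pair(reference, isolate_y))
```

This comparison treats any change at a locus as one difference, regardless of how many repeat units were gained or lost.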
ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

PubMed Central

Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

2017-01-01

During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
Development of new multilocus variable number of tandem repeat analysis (MLVA) for Listeria innocua and its application in a food processing plant.

PubMed

Takahashi, Hajime; Ohshima, Chihiro; Nakagawa, Miku; Thanatsang, Krittaporn; Phraephaisarn, Chirapiphat; Chaturongkasumrit, Yuphakhun; Keeratipibul, Suwimon; Kuda, Takashi; Kimura, Bon

2014-01-01

Listeria innocua is an important hygiene indicator bacterium in food industries because it behaves similarly to Listeria monocytogenes, which is pathogenic to humans. PFGE is often used to characterize bacterial strains and to track contamination sources. However, because PFGE is an expensive, complicated, and time-consuming protocol that makes data sharing difficult, development of a new typing method is necessary. MLVA is a technique that identifies bacterial strains on the basis that the number of tandem repeats present in the genome varies among strains. MLVA has gained attention due to its high reproducibility and ease of data sharing. In this study, we developed an MLVA protocol to assess L. innocua and evaluated it by tracking the contamination source of L. innocua in an actual food manufacturing factory by typing the bacterial strains isolated from the factory. Three VNTR regions of the L. innocua genome were chosen for use in the MLVA. The number of repeat units in each VNTR region was calculated based on the results of PCR product analysis using capillary electrophoresis (CE). The calculated number of repetitions was compared with the results of the gene sequence analysis to demonstrate the accuracy of the CE repeat number analysis. The developed technique was evaluated using 60 L. innocua strains isolated from a food factory. These 60 strains were classified into 11 patterns using MLVA. Many of the strains were classified into ST-6, revealing that this MLVA strain type can contaminate each manufacturing process in the factory. The MLVA protocol developed in this study for L. innocua allowed rapid and easy analysis through the use of CE. This technique was found to be very useful in hygiene control in factories because it allowed us to track contamination sources and provided information regarding whether the bacteria were present in the factories.
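The abstract above says that the number of repeat units was calculated from PCR product sizes measured by capillary electrophoresis. A minimal sketch of that calculation follows, assuming the usual relation: repeat number = (fragment size - combined length of the flanking regions including primers) / repeat unit length, rounded to the nearest whole number. The function name and all sizes are hypothetical. The flanking and repeat unit lengths of the three L. innocua loci are not given in the abstract.

```python
def repeat_copy_number(fragment_bp, flanking_bp, repeat_unit_bp):
    """Estimate the tandem repeat copy number from a sized PCR product.

    fragment_bp:    PCR product size reported by capillary electrophoresis.
    flanking_bp:    combined length of the non-repeat flanks, primers included.
    repeat_unit_bp: length of one repeat unit.
    """
    if repeat_unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    repeat_region = fragment_bp - flanking_bp
    if repeat_region < 0:
        raise ValueError("fragment is shorter than its flanking regions")
    # Sizing by electrophoresis is approximate, so round to the nearest integer.
    return round(repeat_region / repeat_unit_bp)


# Hypothetical sizes: a 251 bp product, 100 bp of flanks, a 15 bp repeat unit.
print(repeat_copy_number(251, 100, 15))
```

Comparing such calculated values against sequenced amplicons, as the study did, checks whether the sizing offset is small enough for the rounding to give the true copy number.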
Variable number of tandem repeat polymorphisms of DRD4: re-evaluation of selection hypothesis and analysis of association with schizophrenia

PubMed Central

Hattori, Eiji; Nakajima, Mizuho; Yamada, Kazuo; Iwayama, Yoshimi; Toyota, Tomoko; Saitou, Naruya; Yoshikawa, Takeo

2009-01-01

Associations have been reported between the variable number of tandem repeat (VNTR) polymorphisms in exon 3 of the dopamine D4 receptor gene and multiple psychiatric illnesses/traits. We examined the distribution of VNTR alleles of different length in a Japanese cohort and found that, as reported earlier, the allele '7R' was much rarer (0.5%) in Japanese than in Caucasian populations (∼20%). This presents a challenge to an earlier proposed hypothesis that positive selection favoring the allele 7R has contributed to its high frequency. To further address the issue of selection, we carried out sequencing of the VNTR region not only from human but also from chimpanzee samples, and made inference on the ancestral repeat motif and haplotype by use of a phylogenetic analysis program. The most common 4R variant was considered to be the ancestral haplotype as earlier proposed. However, in a gene tree of VNTR constructed on the basis of this inferred ancestral haplotype, the allele 7R had five descendent haplotypes in a relatively long lineage, where genetic drift can have major influence. We also tested this length polymorphism for association with schizophrenia, studying two Japanese sample sets (one with 570 cases and 570 controls, and the other with 124 pedigrees). No evidence of association between the allele 7R and schizophrenia was found in either of the two data sets. Collectively, this study suggests that the VNTR variation does not have an effect large enough to cause either selection or a detectable association with schizophrenia in a study of samples of moderate size. PMID:19092778
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution and characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

PubMed

Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution and characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.
6-mercaptopurine influences TPMT gene transcription in a TPMT gene promoter variable number of tandem repeats-dependent manner.

PubMed

Kotur, Nikola; Stankovic, Biljana; Kassela, Katerina; Georgitsi, Marianthi; Vicha, Anna; Leontari, Iliana; Dokmanovic, Lidija; Janic, Dragana; Krstovski, Nada; Klaassen, Kristel; Radmilovic, Milena; Stojiljkovic, Maja; Nikcevic, Gordana; Simeonidis, Argiris; Sivolapenko, Gregory; Pavlovic, Sonja; Patrinos, George P; Zukic, Branka

2012-02-01

TPMT activity is characterized by a trimodal distribution, namely low, intermediate and high methylator. TPMT gene promoter contains a variable number of GC-rich tandem repeats (VNTRs), namely A, B and C, ranging from three to nine repeats in length in an A(n)B(m)C architecture. We have previously shown that the VNTR architecture in the TPMT gene promoter affects TPMT gene transcription. MATERIALS, METHODS & RESULTS: Here we demonstrate, using reporter assays, that 6-mercaptopurine (6-MP) treatment results in a VNTR architecture-dependent decrease of TPMT gene transcription, mediated by the binding of newly recruited protein complexes to the TPMT gene promoter, upon 6-MP treatment. We also show that acute lymphoblastic leukemia patients undergoing 6-MP treatment display a VNTR architecture-dependent response to 6-MP. These data suggest that the TPMT gene promoter VNTR architecture can be potentially used as a pharmacogenomic marker to predict toxicity due to 6-MP treatment in acute lymphoblastic leukemia patients.
Multilocus variable-number tandem repeat analysis for molecular typing and phylogenetic analysis of Shigella flexneri

PubMed Central

2009-01-01

Background Shigella flexneri is one of the causative agents of shigellosis, a major cause of childhood mortality in developing countries. Multilocus variable-number tandem repeat (VNTR) analysis (MLVA) is a prominent subtyping method to resolve closely related bacterial isolates for investigation of disease outbreaks and provide information for establishing phylogenetic patterns among isolates. The present study aimed to develop an MLVA method for S. flexneri and the VNTR loci identified were tested on 242 S. flexneri isolates to evaluate their variability in various serotypes. The isolates were also analyzed by pulsed-field gel electrophoresis (PFGE) to compare the discriminatory power and to evaluate the usefulness of MLVA as a tool for phylogenetic analysis of S. flexneri. Results Thirty-six VNTR loci were identified by exploring the repeat sequence loci in genomic sequences of Shigella species and by testing the loci on nine isolates of different subserotypes. The VNTR loci in different serotype groups differed greatly in their variability. The discriminatory power of an MLVA assay based on four most variable VNTR loci was higher, though not significantly, than PFGE for the total isolates, a panel of 2a isolates, which were relatively diverse, and a panel of 4a/Y isolates, which were closely-related. Phylogenetic groupings based on PFGE patterns and MLVA profiles were considerably concordant. The genetic relationships among the isolates were correlated with serotypes. The phylogenetic trees constructed using PFGE patterns and MLVA profiles presented two distinct clusters for the isolates of serotype 3 and one distinct cluster for each of the serotype groups, 1a/1b/NT, 2a/2b/X/NT, 4a/Y, and 6. Isolates that had different serotypes but had closer genetic relatedness than those with the same serotype were observed between serotype Y and subserotype 4a, serotype X and subserotype 2b, subserotype 1a and 1b, and subserotype 3a and 3b. Conclusions The 36 VNTR loci
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

PubMed

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-06-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Organization and evolution of highly repeated satellite DNA sequences in plant chromosomes.

PubMed

Sharma, S; Raina, S N

2005-01-01

A major component of the plant nuclear genome is constituted by different classes of repetitive DNA sequences. The structural, functional and evolutionary aspects of the satellite repetitive DNA families, and their organization in the chromosomes, are reviewed. The tandem satellite DNA sequences exhibit characteristic chromosomal locations, usually at subtelomeric and centromeric regions. The repetitive DNA family(ies) may be widely distributed in a taxonomic family or a genus, or may be specific for a species, genome or even a chromosome. They may acquire large-scale variations in their sequence and copy number over an evolutionary time-scale. These features have formed the basis of extensive utilization of repetitive sequences for taxonomic and phylogenetic studies. Hybrid polyploids have especially proven to be excellent models for studying the evolution of repetitive DNA sequences. Recent studies explicitly show that some repetitive DNA families localized at the telomeres and centromeres have acquired important structural and functional significance. The repetitive elements are under different evolutionary constraints as compared to the genes. Satellite DNA families are thought to arise de novo as a consequence of molecular mechanisms such as unequal crossing over, rolling circle amplification, replication slippage and mutation that constitute "molecular drive". Copyright 2005 S. Karger AG, Basel.
A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

PubMed Central

Lewers, Kim S; Saski, Chris A; Cuthbertson, Brandon J; Henry, David C; Staton, Meg E; Main, Dorrie S; Dhanaraj, Anik L; Rowland, Lisa J; Tomkins, Jeff P

2008-01-01

Background The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSR), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry. PMID:18570660
Multiple-locus variable-number tandem-repeats analysis of Listeria monocytogenes using multicolour capillary electrophoresis and comparison with pulsed-field gel electrophoresis typing.

PubMed

Lindstedt, Bjørn-Arne; Tham, Wilhelm; Danielsson-Tham, Marie-Louise; Vardund, Traute; Helmersson, Seved; Kapperud, Georg

2008-02-01

The multiple-locus variable-number tandem-repeats analysis (MLVA) method for genotyping has proven to be a fast and reliable typing tool in several bacterial species. MLVA is in our laboratory the routine typing method for Salmonella enterica subsp. enterica serovar Typhimurium and Escherichia coli O157. The gram-positive bacterium Listeria monocytogenes, while not isolated as frequently as S. Typhimurium and E. coli, causes severe illness with an overall mortality rate of 30%. Thus, it is important that any outbreak of this pathogen is detected early and that a fast trace to the source can be performed. In view of this, we have used the information provided by two fully sequenced L. monocytogenes strains to develop an MLVA assay coupled with high-resolution capillary electrophoresis and compared it to pulsed-field gel electrophoresis (PFGE) in two sets of isolates, one Norwegian (79 isolates) and one Swedish (61 isolates). The MLVA assay could resolve all of the L. monocytogenes serotypes tested, and was slightly more discriminatory than PFGE for the Norwegian isolates (28 MLVA profiles and 24 PFGE profiles) and slightly less discriminatory for the Swedish isolates (42 MLVA profiles and 43 PFGE profiles).
Topological characteristics of helical repeat proteins.

PubMed

Groves, M R; Barford, D

1999-06-01

The recent elucidation of protein structures based upon repeating amino acid motifs, including the armadillo motif, the HEAT motif and tetratricopeptide repeats, reveals that they belong to the class of helical repeat proteins. These proteins share the common property of being assembled from tandem repeats of an alpha-helical structural unit, creating extended superhelical structures that are ideally suited to create a protein recognition interface.
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that the 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
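The motif classes reported above (di- to hexanucleotide repeats) come from scanning sequences for short units repeated in tandem. The abstract does not state the search tool or thresholds, so the following is only a minimal Python sketch of such a scan: the function names and the minimum copy numbers in MIN_REPEATS are assumptions chosen for illustration, and only perfect repeats are reported.

```python
import re
from collections import Counter

# Hypothetical thresholds (not from the abstract): minimum number of
# tandem copies required for each motif length, 2 bp to 6 bp.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_copies in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least min_copies - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT or AAA),
            # so each repeat is reported under its shortest motif only.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


def motif_class_counts(hits):
    """Count SSRs by motif length (2 = dinucleotide, 3 = trinucleotide, ...)."""
    return dict(Counter(len(motif) for _, motif, _ in hits))


toy_est = "GGC" + "AAG" * 6 + "TTC" + "AT" * 7 + "GCGC"
print(find_ssrs(toy_est))  # [(3, 'AAG', 6), (24, 'AT', 7)]
```

Dividing each class count by the total number of hits would give percentage shares of the kind quoted in the abstract; real pipelines also handle compound and imperfect repeats, which this sketch ignores.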
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that the 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

PubMed Central

Ni, Xiangyang; Westpheling, Janet

1997-01-01

The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
Highly Discriminatory Variable-Number Tandem-Repeat Markers for Genotyping of Trichophyton interdigitale Strains

PubMed Central

Drira, Ines; Hadrich, Ines; Neji, Sourour; Mahfouth, Nedia; Trabelsi, Houaida; Sellami, Hayet; Makni, Fattouma

2014-01-01

Trichophyton interdigitale is the second most frequent cause of superficial fungal infections of various parts of the human body. Studying the population structure and genotype differentiation of T. interdigitale strains may lead to significant improvements in clinical practice. The present study aimed to develop and select suitable variable-number tandem-repeat (VNTR) markers for 92 clinical strains of T. interdigitale. On the basis of an analysis of four VNTR markers, four to eight distinct alleles were detected for each marker. The marker with the highest discriminatory power had eight alleles and a D value of 0.802. The combination of all four markers yielded a D value of 0.969 with 29 distinct multilocus genotypes. VNTR typing revealed the genetic diversity of the strains, identifying three populations according to their colonization sites. A correlation between phenotypic characteristics and multilocus genotypes was observed. Seven patients harbored T. interdigitale strains with different genotypes. Typing of clinical T. interdigitale samples by VNTR markers displayed excellent discriminatory power and 100% reproducibility. PMID:24989614
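The D values quoted above are a measure of discriminatory power. The abstract does not say how D was computed; a common choice in typing studies is the Hunter-Gaston index (Simpson's index of diversity), D = 1 - [1 / (N(N - 1))] * sum of n_j(n_j - 1), where N is the number of strains and n_j the number of strains with the j-th type. The Python sketch below assumes that definition; the function name and the toy profiles are illustrative only.

```python
from collections import Counter


def discriminatory_index(types):
    """Hunter-Gaston index: probability that two strains drawn at random
    without replacement have different types."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two strains")
    counts = Counter(types).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Toy data: each strain is a tuple of repeat counts at two VNTR markers.
profiles = [(1, 2), (1, 2), (1, 3), (2, 3)]

# Per-marker D uses the alleles at one locus; the combined D treats the
# whole tuple as one multilocus genotype.
d_marker_0 = discriminatory_index([p[0] for p in profiles])
d_combined = discriminatory_index(profiles)
print(d_marker_0, d_combined)
```

Combining markers can only split existing groups further, which is why a multilocus D is at least as high as the best single-marker D.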
CRISPRcompar: a website to compare clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2008-07-01

Clustered regularly interspaced short palindromic repeat (CRISPR) elements are a particular family of tandem repeats present in prokaryotic genomes, in almost all archaea and in about half of bacteria, and which participate in a mechanism of acquired resistance against phages. They consist of a succession of direct repeats (DR) of 24-47 bp separated by similar-sized unique sequences (spacers). In the large majority of cases, the direct repeats are highly conserved, while the number and nature of the spacers are often quite diverse, even among strains of the same species. Furthermore, the acquisition of new units (DR + spacer) was shown to happen almost exclusively on one side of the locus. Therefore, the CRISPR presents an interesting genetic marker for comparative and evolutionary analysis of closely related bacterial strains. CRISPRcompar is a web service created to assist biologists in the CRISPR typing process. Two tools facilitate the in silico investigation: CRISPRcomparison and CRISPRtionary. This website is freely accessible at http://crispr.u-psud.fr/CRISPRcompar/.
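The locus structure described above (conserved direct repeats alternating with unique spacers) is what makes spacer-based strain comparison possible. The Python sketch below is not the CRISPRcompar implementation; it is a toy illustration that assumes the direct repeat is known and perfectly conserved, and every sequence and function name in it is made up.

```python
def split_crispr_array(locus, direct_repeat):
    """Return the spacers of a locus, assuming an exactly conserved repeat."""
    parts = locus.split(direct_repeat)
    if len(parts) < 3:
        raise ValueError("need at least two direct repeats to delimit a spacer")
    # parts[0] and parts[-1] are the flanking sequences; the pieces in
    # between are the spacers, in locus order.
    return parts[1:-1]


def compare_spacers(spacers_a, spacers_b):
    """Return (shared, only_a, only_b), each in its original locus order."""
    set_a, set_b = set(spacers_a), set(spacers_b)
    shared = [s for s in spacers_a if s in set_b]
    only_a = [s for s in spacers_a if s not in set_b]
    only_b = [s for s in spacers_b if s not in set_a]
    return shared, only_a, only_b


# Made-up 24 bp repeat and 10 bp spacers, for illustration only.
DR = "GTTCCACTGCCGTACAGGCAGCTT"
S1, S2, S3 = "ACGTACGTAA", "TTGACCATGA", "CCGGAATTCC"
strain_a = "ATATAT" + DR + S1 + DR + S2 + DR + "GGGCCC"
# Strain B carries one extra unit (DR + spacer) at one end of the array.
strain_b = "ATATAT" + DR + S3 + DR + S1 + DR + S2 + DR + "GGGCCC"

print(compare_spacers(split_crispr_array(strain_a, DR),
                      split_crispr_array(strain_b, DR)))
```

Because new units are added almost exclusively at one side, strain-specific spacers clustered at one end and a shared block at the other suggest a common ancestor followed by independent acquisitions.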
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
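The gene flow and differentiation figures above can be cross-checked against each other. The abstract does not name the estimator used; a widely used one is Nm = 0.5(1 - GST)/GST, and the reported values appear consistent with it. The Python sketch below assumes that estimator; the function name is illustrative.

```python
def gene_flow_from_gst(gst):
    """Estimate Nm from GST, assuming Nm = 0.5 * (1 - GST) / GST."""
    if not 0.0 < gst <= 1.0:
        raise ValueError("GST must be in (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# With the rounded GST from the abstract this gives 1.5, close to the
# reported Nm of 1.4996 (the small gap is plausibly due to rounding of GST).
print(gene_flow_from_gst(0.2500))  # 1.5
```

Under this estimator, Nm above 1 is conventionally read as enough migration to counteract drift, which matches the abstract's interpretation of high gene flow among populations.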
Differentiation of “Candidatus Liberibacter asiaticus” Isolates by Variable-Number Tandem-Repeat Analysis

PubMed Central

Katoh, Hiroshi; Subandiyah, Siti; Tomimura, Kenta; Okuda, Mitsuru; Su, Hong-Ji; Iwanami, Toru

2011-01-01

Four highly polymorphic simple sequence repeat (SSR) loci were selected and used to differentiate 84 Japanese isolates of “Candidatus Liberibacter asiaticus.” Nei's measure of genetic diversity for these four SSRs ranged from 0.60 to 0.86. The four SSR loci were also highly polymorphic in four isolates from Taiwan and 12 isolates from Indonesia. PMID:21239554

Neutral polymorphisms in putative housekeeping genes and tandem repeats unravels the population genetics and evolutionary history of Plasmodium vivax in India.

PubMed

Prajapati, Surendra K; Joshi, Hema; Carlton, Jane M; Rizvi, M Alam

2013-01-01

The evolutionary history and age of Plasmodium vivax has been inferred as both recent and ancient by several studies, mainly using mitochondrial genome diversity. Here we address the age of P. vivax on the Indian subcontinent using selectively neutral housekeeping genes and tandem repeat loci. Analysis of ten housekeeping genes revealed a substantial number of SNPs (n = 75) from 100 P. vivax isolates collected from five geographical regions of India. Neutrality tests showed a majority of the housekeeping genes were selectively neutral, confirming the suitability of housekeeping genes for inferring the evolutionary history of P. vivax. In addition, a genetic differentiation test using housekeeping gene polymorphism data showed a lack of geographical structuring between the five regions of India. The coalescence analysis of the time to the most recent common ancestor estimate yielded an ancient TMRCA (232,228 to 303,030 years) and long-term population history (79,235 to 104,008) of extant P. vivax on the Indian subcontinent. Analysis of 18 tandem repeat loci polymorphisms showed substantial allelic diversity and heterozygosity per locus, and analysis of potential bottlenecks revealed the signature of a stable P. vivax population, further corroborating our ancient age estimates. For the first time we report a comparable evolutionary history of P. vivax inferred by nuclear genetic markers (putative housekeeping genes) to that inferred from mitochondrial genome diversity.
Chromosome ends: different sequences may provide conserved functions.

PubMed

Louis, Edward J; Vershinin, Alexander V

2005-07-01

The structures of specific chromosome regions, centromeres and telomeres, present a number of puzzles. As functions performed by these regions are ubiquitous and essential, their DNA, proteins and chromatin structure are expected to be conserved. Recent studies of centromeric DNA from human, Drosophila and plant species have demonstrated that a hidden universal centromere-specific sequence is highly unlikely. The DNA of telomeres is more conserved consisting of a tandemly repeated 6-8 bp Arabidopsis-like sequence in a majority of organisms as diverse as protozoan, fungi, mammals and plants. However, there are alternatives to short DNA repeats at the ends of chromosomes and for telomere elongation by telomerase. Here we focus on the similarities and diversity that exist among the structural elements, DNA sequences and proteins, that make up terminal domains (telomeres and subtelomeres), and how organisms use these in different ways to fulfil the functions of end-replication and end-protection. Copyright (c) 2005 Wiley Periodicals, Inc.
Determination of Sources of Escherichia coli on Beef by Multiple-Locus Variable-Number Tandem Repeat Analysis.

PubMed

Yang, Xianqin; Tran, Frances; Youssef, Mohamed K; Gill, Colin O

2015-07-01

The possible origin of Escherichia coli found on cuts and trimmings in the breaking facility of a beef packing plant was examined using multiple-locus variable-number tandem repeat analysis. Coliforms and E. coli were enumerated in samples obtained from 160 carcasses that would enter the breaking facility when work commenced and after each of the three production breaks throughout the day, from the conveyor belt before work and after each break, and from cuts and trimmings when work commenced and after each break. Most samples yielded no E. coli, irrespective of the surface types. E. coli was recovered from 7 (<5%) carcasses, at numbers mostly ≤1.0 log CFU/160,000 cm². The log total numbers of E. coli recovered from the conveyor belt, cuts, and trimmings were mostly between 1 and 2 log CFU/80,000 cm². A total of 554 E. coli isolates were recovered. Multiple-locus variable-number tandem repeat analysis of 327 selected isolates identified 80 distinct genotypes, with 37 (46%) each containing one isolate. However, 28% of the isolates were of genotypes that were recovered from more than one sampling day. Of the 80 genotypes, 65% and 2% were found in one or all four sampling periods throughout the day, respectively. However, they represented 23% and 14% of the isolates, respectively. Of the genotypes identified for each surface type, at least one contained ≥9 isolates. No unique genotypes were associated with carcasses, but 10, 17, and 19 were uniquely associated with cuts, trimmings, and the belt, respectively. Of the isolates recovered from cuts, 49%, 3%, and 19% were of genotypes that were found among isolates recovered from the belt, carcasses, or both the belt and carcasses, respectively. A similar composition was found for isolates recovered from trimmings. These findings show that the E. coli found on cuts and trimmings at this beef packing plant mainly originated from the conveyor belt and that a small number of E. coli strains survived the daily cleaning and sanitation
Short tandem repeat profiling: part of an overall strategy for reducing the frequency of cell misidentification.

PubMed

Nims, Raymond W; Sykes, Greg; Cottrill, Karin; Ikonomi, Pranvera; Elmore, Eugene

2010-12-01

The role of cell authentication in biomedical science has received considerable attention, especially within the past decade. This quality control attribute is now beginning to be given the emphasis it deserves by granting agencies and by scientific journals. Short tandem repeat (STR) profiling, one of a few DNA profiling technologies now available, is being proposed for routine identification (authentication) of human cell lines, stem cells, and tissues. The advantage of this technique over methods such as isoenzyme analysis, karyotyping, human leukocyte antigen typing, etc., is that STR profiling can establish identity to the individual level, provided that the appropriate number and types of loci are evaluated. To best employ this technology, a standardized protocol and a data-driven, quality-controlled, and publicly searchable database will be necessary. This public STR database (currently under development) will enable investigators to rapidly authenticate human-based cultures to the individual from whom the cells were sourced. Use of similar approaches for non-human animal cells will require developing other suitable loci sets. While implementing STR analysis on a more routine basis should significantly reduce the frequency of cell misidentification, additional technologies may be needed as part of an overall authentication paradigm. For instance, isoenzyme analysis, PCR-based DNA amplification, and sequence-based barcoding methods enable rapid confirmation of a cell line's species of origin while screening against cross-contaminations, especially when the cells present are not recognized by the species-specific STR method. Karyotyping may also be needed as a supporting tool during establishment of an STR database. Finally, good cell culture practices must always remain a major component of any effort to reduce the frequency of cell misidentification.
Application of multilocus variable number tandem repeat analysis to monitor Verocytotoxin-producing Escherichia coli O157 phage type 8 in England and Wales: emergence of a profile associated with a national outbreak.

PubMed

Perry, N; Cheasty, T; Dallman, T; Launders, N; Willshaw, G

2013-10-01

Evaluation of multilocus variable number tandem repeat analysis (MLVA) to subtype all isolates of Vero cytotoxin-producing Escherichia coli O157 phage type 8 in England and Wales. Over a 13 month period from December 2010, 483 isolates of VTEC O157 PT8 were tested by MLVA; 39% were received in the first 4 months of 2011, when infections are generally low. One profile, or single locus variants of it, was present in 249 (52%) isolates but was not common previously. These cases represented a national increase in PT8, associated epidemiologically with soil-contaminated vegetables. Most of the 177 other MLVA profiles were unique to a single isolate. Profiles shared by >1 isolate included cases from two small community, food-borne outbreaks and 11 households. Several shared profiles were found among 23 isolates without known links. Apart from one group, isolates linked to travel abroad had very diverse profiles. Multilocus variable number tandem repeat analysis discriminated apparent sporadic isolates of the same PT and assisted in detection of cases in an emerging national outbreak. Multilocus variable number tandem repeat analysis is an epidemiologically valid complement to surveillance and applicable as a rapid, practical test for large numbers of isolates. © 2013 The Society for Applied Microbiology.
Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

USDA-ARS's Scientific Manuscript database

Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Slipped-strand mispairing at noncontiguous repeats in Poecilia reticulata: a model for minisatellite birth.

PubMed Central

Taylor, J S; Breden, F

2000-01-01

The standard slipped-strand mispairing (SSM) model for the formation of variable number tandem repeats (VNTRs) proposes that a few tandem repeats, produced by chance mutations, provide the "raw material" for VNTR expansion. However, this model is unlikely to explain the formation of VNTRs with long motifs (e.g., minisatellites), because the likelihood of a tandem repeat forming by chance decreases rapidly as the length of the repeat motif increases. Phylogenetic reconstruction of the birth of a mitochondrial (mt) DNA minisatellite in guppies suggests that VNTRs with long motifs can form as a consequence of SSM at noncontiguous repeats. VNTRs formed in this manner have motifs longer than the noncontiguous repeat originally formed by chance and are flanked by one unit of the original, noncontiguous repeat. SSM at noncontiguous repeats can therefore explain the birth of VNTRs with long motifs and the "imperfect" or "short direct" repeats frequently observed adjacent to both mtDNA and nuclear VNTRs. PMID:10880490
Highly diverse variable number tandem repeat loci in the E. coli O157:H7 and O55:H7 genomes for high-resolution molecular typing.

PubMed

Keys, C; Kemper, S; Keim, P

2005-01-01

Evaluation of the Escherichia coli genome for variable number tandem repeat (VNTR) loci in order to provide a subtyping tool with greater discrimination and more efficient capacity. Twenty-nine putative VNTR loci were identified from the E. coli genomic sequence. Their variability was validated by characterizing the number of repeats at each locus in a set of 56 E. coli O157:H7/HN and O55:H7 isolates. An optimized multiplex assay system was developed to facilitate high-capacity analysis. Locus diversity values ranged from 0.23 to 0.95 while the number of alleles ranged from two to 29. This multiple-locus VNTR analysis (MLVA) data was used to describe genetic relationships among these isolates and was compared with PFGE (pulsed-field gel electrophoresis) data from a subset of the same strains. Genetic similarity values were highly correlated between the two approaches, though MLVA was capable of discrimination amongst closely related isolates when PFGE similarity values were equal to 1.0. Highly variable VNTR loci exist in the E. coli O157:H7 genome and are excellent estimators of genetic relationships, in particular for closely related isolates. Escherichia coli O157:H7 MLVA offers a complementary analysis to the more traditional PFGE approach. Application of MLVA to an outbreak cluster could generate superior molecular epidemiology and result in a more effective public health response.
Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

PubMed

Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

2012-01-01

The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb in the cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry, accounting for 35% of all SSRs, and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.
Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhattacharya, Monolekha; Das, Amit Kumar

Highlights: The regulatory sequences recognized by TcrX have been identified. The regulatory region comprises inverted repeats separated by a 30-bp region. The mode of binding of TcrX to the regulatory sequence is unique. The in silico TcrX-DNA docked model binds one of the inverted repeats. Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that comprise inverted repeats separated by ~30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, as shown by an EMSA using the D54A TcrX mutant.
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enable comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established
Genetic diversity of Y-short tandem repeats in Chinese native cattle breeds.

PubMed

Xin, Y P; Zan, L S; Liu, Y F; Tian, W Q; Wang, H B; Cheng, G; Li, A N; Yang, W C

2014-11-14

The aim of this study was to use a Y-chromosome gene polymorphism method to investigate regional differences in genetic variation and the population evolutionary history of Chinese native cattle breeds. Six Y-chromosome short tandem repeat (Y-STR) loci (UMN0929, UMN0108, UMN0920, INRA124, UMN2404, and UMN0103) were analyzed using 1016 healthy and heterogenetic males and 90 females of 9 native cattle breeds (Qinchuan, Jinnan, Zaosheng, Luxi, Nanyang, Jiaxian, Dabieshan, Yanbian, and Menggu) in China. Allele frequency and gene diversity were calculated for the various populations. The results indicated that the 6 Y-STR loci are polymorphic and genetically diverse in Chinese cattle populations. The genetic diversity analysis revealed that the Chinese cattle populations have a close genetic relationship. Analysis of the INRA124, UMN2404, and UMN0103 loci revealed the origin of Chinese cattle, because these loci allowed cattle to be assigned to Bos taurus or Bos indicus. Interestingly, zebu introgression declined from South to North and from East to West across China, which implies that cattle populations from different regions of China have had somewhat different evolutionary histories. This conclusion is consistent with other evidence, such as earlier archaeological and historical research and blood protein polymorphism analysis.
Evaluation of advanced multiplex short tandem repeat systems in pairwise kinship analysis.

PubMed

Tamura, Tomonori; Osawa, Motoki; Ochiai, Eriko; Suzuki, Takanori; Nakamura, Takashi

2015-09-01

The AmpFLSTR Identifiler Kit, comprising 15 autosomal short tandem repeat (STR) loci, is commonly employed in forensic practice for calculating match probabilities and parentage testing. The conventional system has insufficient power for kinship analysis such as sibship testing because too few loci are examined. This study evaluated the power of the PowerPlex Fusion System, GlobalFiler Kit, and PowerPlex 21 System, which comprise more than 20 autosomal STR loci, to estimate pairwise blood relatedness (i.e., parent-child, full siblings, second-degree relatives, and first cousins). The genotypes of all 24 STR loci in 10,000 putative pedigrees were constructed by simulation. The likelihood ratio for each locus was calculated from joint probabilities for relatives and non-relatives. The combined likelihood ratio was calculated according to the product rule. The addition of STR loci improved separation between relatives and non-relatives. However, these systems were less effective for inferring first-cousin relationships. In conclusion, these advanced systems will be useful in forensic personal identification, especially in the evaluation of full siblings and second-degree relatives. Moreover, the additional loci may give rise to two major issues: more frequent mutational events and several pairs of linked loci on the same chromosome. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
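The product rule mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the per-locus likelihood ratios have already been computed and that the loci are independent (the abstract itself notes that linked loci complicate this assumption). The function name and the example values are hypothetical.

```python
import math


def combined_likelihood_ratio(locus_lrs):
    """Combine per-locus likelihood ratios with the product rule.

    Logs are summed rather than multiplying directly, which avoids
    overflow or underflow when many loci are combined.
    Assumes the loci are independent (an assumption of this sketch).
    """
    if any(lr < 0 for lr in locus_lrs):
        raise ValueError("likelihood ratios cannot be negative")
    if any(lr == 0 for lr in locus_lrs):
        # A single excluding locus drives the product to zero.
        return 0.0
    return math.exp(sum(math.log(lr) for lr in locus_lrs))


if __name__ == "__main__":
    # Hypothetical per-locus values, for illustration only.
    print(combined_likelihood_ratio([2.0, 0.5, 4.0]))
```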
Orthogonal tandem catalysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lohr, Tracy L.; Marks, Tobin J.

2015-05-20

Tandem catalysis is a growing field that is beginning to yield important scientific and technological advances toward new and more efficient catalytic processes. 'One-pot' tandem reactions, where multiple catalysts and reagents, combined in a single reaction vessel undergo a sequence of precisely staged catalytic steps, are highly attractive from the standpoint of reducing both waste and time. Orthogonal tandem catalysis is a subset of one-pot reactions in which more than one catalyst is used to promote two or more mechanistically distinct reaction steps. This Perspective summarizes and analyses some of the recent developments and successes in orthogonal tandem catalysis, with particular focus on recent strategies to address catalyst incompatibility. We also highlight the concept of thermodynamic leveraging by coupling multiple catalyst cycles to effect challenging transformations not observed in single-step processes, and to encourage application of this technique to energetically unfavourable or demanding reactions.
Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds.

PubMed

Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S

2017-07-01

Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences composed of just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.
Megabase sequencing of human genome by ordered-shotgun-sequencing (OSS) strategy

NASA Astrophysics Data System (ADS)

Chen, Ellson Y.

1997-05-01

So far we have used the OSS strategy to sequence over 2 megabases of DNA in large-insert clones from regions of human X chromosomes with different characteristic levels of GC content. The method starts by randomly fragmenting a BAC, YAC or PAC to 8-12 kb pieces and subcloning those into lambda phage. Insert-ends of these clones are sequenced and overlapped to create a partial map. Complete sequencing is then done on a minimal tiling path of selected subclones, recursively focusing on those at the edges of contigs to facilitate mergers of clones across the entire target. To reduce manual labor, PCR processes have been adapted to prepare sequencing templates throughout the entire operation. The streamlined process can thus lend itself to further automation. The OSS approach is suitable for large-scale genomic sequencing, providing considerable flexibility in the choice of subclones or regions for more or less intensive sequencing. For example, subclones containing contaminating host cell DNA or cloning vector can be recognized and ignored with minimal sequencing effort; regions overlapping a neighboring clone already sequenced need not be redone; and segments containing tandem repeats or long repetitive sequences can be spotted early on and targeted for additional attention.
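The idea of a minimal tiling path can be illustrated with a greedy interval cover. This is a sketch only, not the OSS software: it assumes each subclone has already been placed on the target as a (start, end) coordinate pair from the insert-end map, and the function name and coordinates are hypothetical.

```python
def minimal_tiling_path(clones, target_start, target_end):
    """Pick the fewest clones whose intervals cover [target_start, target_end].

    clones: iterable of (start, end) tuples in target coordinates
            (assumed known from the insert-end map).
    Greedy rule: among clones starting at or before the covered position,
    take the one reaching furthest, then repeat from its end.
    """
    ordered = sorted(clones)
    path = []
    covered = target_start
    i = 0
    while covered < target_end:
        best = None
        # Scan every clone that overlaps or abuts the region covered so far.
        while i < len(ordered) and ordered[i][0] <= covered:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= covered:
            raise ValueError("gap in clone coverage at position %d" % covered)
        path.append(best)
        covered = best[1]
    return path


if __name__ == "__main__":
    # Hypothetical subclone positions in kb.
    subclones = [(0, 10), (0, 6), (5, 14), (9, 20), (13, 25), (18, 25)]
    print(minimal_tiling_path(subclones, 0, 25))
```

In practice a minimum overlap between neighbouring clones would be required so that the finished sequences can be merged; the sketch omits that constraint.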
Multiple-locus variable-number tandem repeat analysis for strain discrimination of non-O157 Shiga toxin-producing Escherichia coli.

PubMed

Timmons, Chris; Trees, Eija; Ribot, Efrain M; Gerner-Smidt, Peter; LaFon, Patti; Im, Sung; Ma, Li Maria

2016-06-01

Non-O157 Shiga toxin-producing Escherichia coli (STEC) are foodborne pathogens of growing concern worldwide that have been associated with several recent multistate and multinational outbreaks of foodborne illness. Rapid and sensitive molecular-based bacterial strain discrimination methods are critical for timely outbreak identification and contaminated food source traceback. One such method, multiple-locus variable-number tandem repeat analysis (MLVA), is being used with increasing frequency in foodborne illness outbreak investigations to augment the current gold standard bacterial subtyping technique, pulsed-field gel electrophoresis (PFGE). The objective of this study was to develop a MLVA assay for intra- and inter-serogroup discrimination of six major non-O157 STEC serogroups (O26, O111, O103, O121, O45, and O145) and perform a preliminary internal validation of the method on a limited number of clinical isolates. The resultant MLVA scheme consists of ten variable number tandem repeat (VNTR) loci amplified in three multiplex PCR reactions. Sixty-five unique MLVA types were obtained among 84 clinical non-O157 STEC strains comprising geographically diverse sporadic and outbreak-related isolates. Compared to PFGE, the developed MLVA scheme allowed similar discrimination among serogroups O26, O111, O103, and O121 but not among O145 and O45. To more fully compare the discriminatory power of this preliminary MLVA method to PFGE and to determine its epidemiological congruence, a thorough internal and external validation needs to be performed on a carefully selected large panel of strains, including multiple isolates from single outbreaks. Copyright © 2016. Published by Elsevier B.V.
Laser Desorption Mass Spectrometry for DNA Sequencing and Analysis

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Golovlev, V. V.; Isola, N. R.; Allman, S. L.

1998-03-01

Rapid DNA sequencing and/or analysis is critically important for biomedical research. In the past, gel electrophoresis has been the primary tool to achieve DNA analysis and sequencing. However, gel electrophoresis is a time-consuming and labor-intensive process. Recently, we have developed and used laser desorption mass spectrometry (LDMS) to achieve sequencing of ss-DNA longer than 100 nucleotides. With LDMS, we succeeded in sequencing DNA in seconds instead of the hours or days required by gel electrophoresis. In addition to sequencing, we also applied LDMS for the detection of DNA probes for hybridization. LDMS was also used to detect short tandem repeats for forensic applications. Clinical applications for disease diagnosis such as cystic fibrosis caused by base deletion and point mutation have also been demonstrated. Experimental details will be presented at the meeting.
A new family of dispersed repeats from Brassica nigra: characterization and localization.

PubMed

Kapila, R; Negi, M S; This, P; Delseny, M; Srivastava, P S; Lakshmikumaran, M

1996-11-01

The 459-bp HindIII (pBN-4) and the 1732-bp EcoRI (pBNE8) fragments from the Brassica nigra genome were cloned and shown to be members of a dispersed repeat family. Of the three major diploid Brassica species, the repeat pBN-4 was found to be highly specific for the B. nigra genome. The family also hybridized to Sinapis arvensis showing that B. nigra had a closer relationship with the S. arvensis genome than with B. oleracea or B. campestris. The clone pBNE8 showed homology to a number of tRNA species indicating that this family of repeats may have originated from a tRNA sequence. The species-specific 459-bp repeat pBN-4 was localized on the B. nigra chromosomes using monosomic addition lines. In addition to the localization of pBN-4, the chromosomal distribution of two other species-specific repeats, pBN34 and pBNBH35 (reported earlier), was studied. The dispersed repeats pBN-4 and pBNBH35 were found to be present on all of the chromosomes, whereas the tandem repeat pBN34 was localized on two chromosomes.
NIST mixed stain study 3: signal intensity balance in commercial short tandem repeat multiplexes.

PubMed

Duewer, David L; Kline, Margaret C; Redman, Janette W; Butler, John M

2004-12-01

Short-tandem repeat (STR) allelic intensities were collected from more than 60 forensic laboratories for a suite of seven samples as part of the National Institute of Standards and Technology-coordinated 2001 Mixed Stain Study 3 (MSS3). These interlaboratory challenge data illuminate the relative importance of intrinsic and user-determined factors affecting the locus-to-locus balance of signal intensities for currently used STR multiplexes. To varying degrees, seven of the eight commercially produced multiplexes used by MSS3 participants displayed very similar patterns of intensity differences among the different loci probed by the multiplexes for all samples, in the hands of multiple analysts, with a variety of supplies and instruments. These systematic differences reflect intrinsic properties of the individual multiplexes, not user-controllable measurement practices. To the extent that quality systems specify minimum and maximum absolute intensities for data acceptability and data interpretation schema require among-locus balance, these intrinsic intensity differences may decrease the utility of multiplex results and surely increase the cost of analysis.

Lymphatic filarial species differentiation using evolutionarily modified tandem repeats: generation of new genetic markers.

PubMed

Sakthidevi, Moorthy; Murugan, Vadivel; Hoti, Sugeerappa Laxmanappa; Kaliraj, Perumal

2010-05-01

Polymerase chain reaction based methods are promising tools for the monitoring and evaluation of the Global Program for the Elimination of Lymphatic Filariasis. The currently available PCR methods do not differentiate the DNA of Wuchereria bancrofti or Brugia malayi by a single PCR and hence are cumbersome. Therefore, we designed a single step PCR strategy for differentiating Bancroftian infection from Brugian infection based on a newly identified gene from the W. bancrofti genome, abundant larval transcript-2 (alt-2), which is abundantly expressed. The difference in PCR product sizes generated from the presence or absence of evolutionarily altered tandem repeats in alt-2 intron-3 differentiated W. bancrofti from B. malayi. The analysis was performed on the genomic DNA of microfilariae from a number of patient blood samples or microfilariae positive slides from different Indian geographical regions. The assay gave consistent results, differentiating the two filarial parasite species accurately. This alt-2 intron-3 based PCR assay can be a potential tool for the diagnosis and differentiation of co-infections by lymphatic filarial parasites. Copyright (c) 2010 Elsevier B.V. All rights reserved.
Plasmid P1 replication: negative control by repeated DNA sequences.

PubMed Central

Chattoraj, D; Cordes, K; Abeles, A

1984-01-01

The incompatibility locus, incA, of the unit-copy plasmid P1 is contained within a fragment that is essentially a set of nine 19-base-pair repeats. One or more copies of the fragment destabilizes the plasmid when present in trans. Here we show that extra copies of incA interfere with plasmid DNA replication and that a deletion of most of incA increases plasmid copy number. Thus, incA is not essential for replication but is required for its control. When cloned in a high-copy-number vector, pieces of the incA fragment that each contain only three repeats destabilize P1 plasmids efficiently. This result makes it unlikely that incA specifies a regulatory product. Our in vivo results suggest that the repeating DNA sequence itself negatively controls replication by titrating a P1-determined protein, RepA, that is essential for replication. Consistent with this hypothesis is the observation that the RepA protein binds to the incA fragment in vitro. PMID:6387706
Toward Male Individualization with Rapidly Mutating Y-Chromosomal Short Tandem Repeats

PubMed Central

Ballantyne, Kaye N; Ralf, Arwin; Aboukhalid, Rachid; Achakzai, Niaz M; Anjos, Maria J; Ayub, Qasim; Balažic, Jože; Ballantyne, Jack; Ballard, David J; Berger, Burkhard; Bobillo, Cecilia; Bouabdellah, Mehdi; Burri, Helen; Capal, Tomas; Caratti, Stefano; Cárdenas, Jorge; Cartault, François; Carvalho, Elizeu F; Carvalho, Monica; Cheng, Baowen; Coble, Michael D; Comas, David; Corach, Daniel; D'Amato, Maria E; Davison, Sean; de Knijff, Peter; De Ungria, Maria Corazon A; Decorte, Ronny; Dobosz, Tadeusz; Dupuy, Berit M; Elmrghni, Samir; Gliwiński, Mateusz; Gomes, Sara C; Grol, Laurens; Haas, Cordula; Hanson, Erin; Henke, Jürgen; Henke, Lotte; Herrera-Rodríguez, Fabiola; Hill, Carolyn R; Holmlund, Gunilla; Honda, Katsuya; Immel, Uta-Dorothee; Inokuchi, Shota; Jobling, Mark A; Kaddura, Mahmoud; Kim, Jong S; Kim, Soon H; Kim, Wook; King, Turi E; Klausriegler, Eva; Kling, Daniel; Kovačević, Lejla; Kovatsi, Leda; Krajewski, Paweł; Kravchenko, Sergey; Larmuseau, Maarten H D; Lee, Eun Young; Lessig, Ruediger; Livshits, Ludmila A; Marjanović, Damir; Minarik, Marek; Mizuno, Natsuko; Moreira, Helena; Morling, Niels; Mukherjee, Meeta; Munier, Patrick; Nagaraju, Javaregowda; Neuhuber, Franz; Nie, Shengjie; Nilasitsataporn, Premlaphat; Nishi, Takeki; Oh, Hye H; Olofsson, Jill; Onofri, Valerio; Palo, Jukka U; Pamjav, Horolma; Parson, Walther; Petlach, Michal; Phillips, Christopher; Ploski, Rafal; Prasad, Samayamantri P R; Primorac, Dragan; Purnomo, Gludhug A; Purps, Josephine; Rangel-Villalobos, Hector; Rębała, Krzysztof; Rerkamnuaychoke, Budsaba; Gonzalez, Danel Rey; Robino, Carlo; Roewer, Lutz; Rosa, Alexandra; Sajantila, Antti; Sala, Andrea; Salvador, Jazelyn M; Sanz, Paula; Schmitt, Cornelia; Sharma, Anil K; Silva, Dayse A; Shin, Kyoung-Jin; Sijen, Titia; Sirker, Miriam; Siváková, Daniela; Škaro, Vedrana; Solano-Matamoros, Carlos; Souto, Luis; Stenzl, Vlastimil; Sudoyo, Herawati; Syndercombe-Court, Denise; Tagliabracci, Adriano; Taylor, Duncan; Tillmar, Andreas; Tsybovsky, Iosif S; 
Tyler-Smith, Chris; van der Gaag, Kristiaan J; Vanek, Daniel; Völgyi, Antónia; Ward, Denise; Willemse, Patricia; Yap, Eric PH; Yong, Rita YY; Pajnič, Irena Zupanič; Kayser, Manfred

2014-01-01

Relevant for various areas of human genetics, Y-chromosomal short tandem repeats (Y-STRs) are commonly used for testing close paternal relationships among individuals and populations, and for male lineage identification. However, even the widely used 17-loci Yfiler set cannot resolve individuals and populations completely. Here, 52 centers generated quality-controlled data of 13 rapidly mutating (RM) Y-STRs in 14,644 related and unrelated males from 111 worldwide populations. Strikingly, >99% of the 12,272 unrelated males were completely individualized. Haplotype diversity was extremely high (global: 0.9999985, regional: 0.99836–0.9999988). Haplotype sharing between populations was almost absent except for six (0.05%) of the 12,156 haplotypes. Haplotype sharing within populations was generally rare (0.8% nonunique haplotypes), significantly lower in urban (0.9%) than rural (2.1%) and highest in endogamous groups (14.3%). Analysis of molecular variance revealed 99.98% of variation within populations, 0.018% among populations within groups, and 0.002% among groups. Of the 2,372 newly and 156 previously typed male relative pairs, 29% were differentiated including 27% of the 2,378 father–son pairs. Relative to Yfiler, haplotype diversity was increased in 86% of the populations tested and overall male relative differentiation was raised by 23.5%. Our study demonstrates the value of RM Y-STRs in identifying and separating unrelated and related males and provides a reference database. PMID:24917567
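The two summary statistics quoted above, haplotype diversity and the share of individualized males, can be computed from a list of haplotypes as in the sketch below. The abstract does not state which estimator was used; Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies), is assumed here, and the function names and toy data are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity (assumed estimator for this sketch).

    haplotypes: list of hashable haplotypes, e.g. tuples of repeat counts.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq_freq)


def fraction_individualized(haplotypes):
    """Share of individuals whose haplotype occurs exactly once in the sample."""
    counts = Counter(haplotypes)
    unique = sum(1 for h in haplotypes if counts[h] == 1)
    return unique / len(haplotypes)


if __name__ == "__main__":
    # Toy sample: two males share a haplotype, two carry unique ones.
    sample = [(12, 30, 17), (12, 30, 17), (13, 29, 17), (12, 31, 18)]
    print(haplotype_diversity(sample))
    print(fraction_individualized(sample))
```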
PTGBase: an integrated database to study tandem duplicated genes in plants.

PubMed

Yu, Jingyin; Ke, Tao; Tehrim, Sadia; Sun, Fengming; Liao, Boshou; Hua, Wei

2015-01-01

Tandem duplication is a wide-spread phenomenon in plant genomes and plays significant roles in evolution and adaptation to changing environments. Tandem duplicated genes related to certain functions will lead to the expansion of gene families and bring increase of gene dosage in the form of gene cluster arrays. Many tandem duplication events have been studied in plant genomes; yet, there is a surprising shortage of efforts to systematically present the integration of large amounts of information about publicly deposited tandem duplicated gene data across the plant kingdom. To address this shortcoming, we developed the first plant tandem duplicated genes database, PTGBase. It delivers the most comprehensive resource available to date, spanning 39 plant genomes, including model species and newly sequenced species alike. Across these genomes, 54 130 tandem duplicated gene clusters (129 652 genes) are presented in the database. Each tandem array, as well as its member genes, is characterized in complete detail. Tandem duplicated genes in PTGBase can be explored through browsing or searching by identifiers or keywords of functional annotation and sequence similarity. Users can download tandem duplicated gene arrays easily to any scale, up to the complete annotation data set for an entire plant genome. PTGBase will be updated regularly with newly sequenced plant species as they become available. © The Author(s) 2015. Published by Oxford University Press.
The sequence and de novo assembly of the giant panda genome

PubMed Central

Li, Ruiqiang; Fan, Wei; Tian, Geng; Zhu, Hongmei; He, Lin; Cai, Jing; Huang, Quanfei; Cai, Qingle; Li, Bo; Bai, Yinqi; Zhang, Zhihe; Zhang, Yaping; Wang, Wen; Li, Jun; Wei, Fuwen; Li, Heng; Jian, Min; Li, Jianwen; Zhang, Zhaolei; Nielsen, Rasmus; Li, Dawei; Gu, Wanjun; Yang, Zhentao; Xuan, Zhaoling; Ryder, Oliver A.; Leung, Frederick Chi-Ching; Zhou, Yan; Cao, Jianjun; Sun, Xiao; Fu, Yonggui; Fang, Xiaodong; Guo, Xiaosen; Wang, Bo; Hou, Rong; Shen, Fujun; Mu, Bo; Ni, Peixiang; Lin, Runmao; Qian, Wubin; Wang, Guodong; Yu, Chang; Nie, Wenhui; Wang, Jinhuan; Wu, Zhigang; Liang, Huiqing; Min, Jiumeng; Wu, Qi; Cheng, Shifeng; Ruan, Jue; Wang, Mingwei; Shi, Zhongbin; Wen, Ming; Liu, Binghang; Ren, Xiaoli; Zheng, Huisong; Dong, Dong; Cook, Kathleen; Shan, Gao; Zhang, Hao; Kosiol, Carolin; Xie, Xueying; Lu, Zuhong; Zheng, Hancheng; Li, Yingrui; Steiner, Cynthia C.; Lam, Tommy Tsan-Yuk; Lin, Siyuan; Zhang, Qinghui; Li, Guoqing; Tian, Jing; Gong, Timing; Liu, Hongde; Zhang, Dejin; Fang, Lin; Ye, Chen; Zhang, Juanbin; Hu, Wenbo; Xu, Anlong; Ren, Yuanyuan; Zhang, Guojie; Bruford, Michael W.; Li, Qibin; Ma, Lijia; Guo, Yiran; An, Na; Hu, Yujie; Zheng, Yang; Shi, Yongyong; Li, Zhiqiang; Liu, Qing; Chen, Yanling; Zhao, Jing; Qu, Ning; Zhao, Shancen; Tian, Feng; Wang, Xiaoling; Wang, Haiyin; Xu, Lizhi; Liu, Xiao; Vinar, Tomas; Wang, Yajun; Lam, Tak-Wah; Yiu, Siu-Ming; Liu, Shiping; Zhang, Hemin; Li, Desheng; Huang, Yan; Wang, Xia; Yang, Guohua; Jiang, Zhi; Wang, Junyi; Qin, Nan; Li, Li; Li, Jingxiang; Bolund, Lars; Kristiansen, Karsten; Wong, Gane Ka-Shu; Olson, Maynard; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

2013-01-01

Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes. PMID:20010809
The Impact of Multilocus Variable-Number Tandem-Repeat Analysis on PulseNet Canada Escherichia coli O157:H7 Laboratory Surveillance and Outbreak Support, 2008-2012.

PubMed

Rumore, Jillian Leigh; Tschetter, Lorelee; Nadon, Celine

2016-05-01

The lack of pattern diversity among pulsed-field gel electrophoresis (PFGE) profiles for Escherichia coli O157:H7 in Canada does not consistently provide optimal discrimination, and therefore, differentiating temporally and/or geographically associated sporadic cases from potential outbreak cases can at times impede investigations. To address this limitation, DNA sequence-based methods such as multilocus variable-number tandem-repeat analysis (MLVA) have been explored. To assess the performance of MLVA as a supplemental method to PFGE from the Canadian perspective, a retrospective analysis of all E. coli O157:H7 isolated in Canada from January 2008 to December 2012 (inclusive) was conducted. A total of 2285 E. coli O157:H7 isolates and 63 clusters of cases (by PFGE) were selected for the study. Based on the qualitative analysis, the addition of MLVA improved the categorization of cases for 60% of clusters and no change was observed for ∼40% of clusters investigated. In such situations, MLVA serves to confirm PFGE results, but may not add further information per se. The findings of this study demonstrate that MLVA data, when used in combination with PFGE-based analyses, provide additional resolution to the detection of clusters lacking PFGE diversity as well as demonstrate good epidemiological concordance. In addition, MLVA is able to identify cluster-associated isolates with variant PFGE pattern combinations that may have been previously missed by PFGE alone. Optimal laboratory surveillance in Canada is achieved with the application of PFGE and MLVA in tandem for routine surveillance, cluster detection, and outbreak response.
ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

PubMed Central

Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

2004-01-01

We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
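The abstract does not describe the hash function used by ACMES, so the following is only a generic illustration of how hashing makes repeat searches fast: every substring of length k is stored in a hash table keyed by its sequence, so all occurrences of a repeat are found in a single pass. The function names are hypothetical and this should not be read as the server's actual algorithm.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map each length-k substring of seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def repeated_kmers(seq, k):
    """Return only the k-mers that occur more than once, with their positions."""
    index = index_kmers(seq.upper(), k)
    return {kmer: pos for kmer, pos in index.items() if len(pos) > 1}


if __name__ == "__main__":
    print(repeated_kmers("ACGTTACGA", 3))
```

A Python dictionary hashes the k-mer string itself; a purpose-built tool would more likely use a compact numeric encoding of the bases to save memory across whole genomes.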
Skewing of the genetic architecture at the ZMYM3 human-specific 5' UTR short tandem repeat in schizophrenia.

PubMed

Alizadeh, F; Bozorgmehr, A; Tavakkoly-Bazzaz, J; Ohadi, M

2018-06-01

Differential expansion of a number of human short tandem repeats (STRs) at the critical core promoter and 5' untranslated region (UTR) support the hypothesis that at least some of these STRs may provide a selective advantage in human evolution. Following a genome-wide screen of all human protein-coding gene 5' UTRs based on the Ensembl database (http://www.ensembl.org), we previously reported that the longest STR in this interval is a (GA)32, which belongs to the X-linked zinc finger MYM-type containing 3 (ZMYM3) gene. In the present study, we analyzed the evolutionary implication of this region across evolution and examined the allele and genotype distribution of the "exceptionally long" STR by direct sequencing of 486 Iranian unrelated male subjects consisting of 196 cases of schizophrenia (SCZ) and 290 controls. We found that the ZMYM3 transcript containing the STR is human-specific (ENST00000373998.5). A significant allele variance difference was observed between the cases and controls (Levene's test for equality of variances F = 4.00, p < 0.03). In addition, six alleles were observed in the SCZ patients that were not detected in the control group ("disease-only" alleles) (mid p exact < 0.0003). Those alleles were at the extreme short and long ends of the allele distribution curve and composed 4% of the genotypes in the SCZ group. In conclusion, we found skewing of the genetic architecture at the ZMYM3 STR in SCZ. Further, we found a bell-shaped distribution of alleles and selection against alleles at the extreme ends of this STR. The ZMYM3 STR sets a prototype, the evolutionary course of which determines the range of alleles in a particular species. Extreme "disease-only" alleles and genotypes may change our perspective of adaptive evolution and complex disorders. The ZMYM3 gene "exceptionally long" STR should be sequenced in SCZ and other human-specific phenotypes/characteristics.
Clostridium botulinum Group I Strain Genotyping by 15-Locus Multilocus Variable-Number Tandem-Repeat Analysis

PubMed Central

Fillo, Silvia; Giordani, Francesco; Anniballi, Fabrizio; Gorgé, Olivier; Ramisse, Vincent; Vergnaud, Gilles; Riehm, Julia M.; Scholz, Holger C.; Splettstoesser, Wolf D.; Kieboom, Jasper; Olsen, Jaran-Strand; Fenicia, Lucia; Lista, Florigio

2011-01-01

Clostridium botulinum is a taxonomic designation that encompasses a broad variety of spore-forming, Gram-positive bacteria producing the botulinum neurotoxin (BoNT). C. botulinum is the etiologic agent of botulism, a rare but severe neuroparalytic disease. Fine-resolution genetic characterization of C. botulinum isolates of any BoNT type is relevant for both epidemiological studies and forensic microbiology. A 10-locus multiple-locus variable-number tandem-repeat analysis (MLVA) was previously applied to isolates of C. botulinum type A. The present study includes five additional loci designed to better address proteolytic B and F serotypes. We investigated 79 C. botulinum group I strains isolated from human and food samples in several European countries, including types A (28), B (36), AB (4), and F (11) strains, and 5 nontoxic Clostridium sporogenes. Additional data were deduced from in silico analysis of 10 available fully sequenced genomes. This 15-locus MLVA (MLVA-15) scheme identified 86 distinct genotypes that clustered consistently with the results of amplified fragment length polymorphism (AFLP) and MLVA genotyping in previous reports. An MLVA-7 scheme, a subset of the MLVA-15, performed on a lab-on-a-chip device using a nonfluorescent subset of primers, is also proposed as a first-line assay. The phylogenetic grouping obtained with the MLVA-7 does not differ significantly from that generated by the MLVA-15. To our knowledge, this report is the first to analyze genetic variability among all of the C. botulinum group I serotypes by MLVA. Our data provide new insights into the genetic variability of group I C. botulinum isolates worldwide and demonstrate that this group is genetically highly diverse. PMID:22012011
Optical mapping and sequencing of the Escherichia coli KO11 genome reveal extensive chromosomal rearrangements, and multiple tandem copies of the Zymomonas mobilis pdc and adhB genes.

PubMed

Turner, Peter C; Yomano, Lorraine P; Jarboe, Laura R; York, Sean W; Baggett, Christy L; Moritz, Brélan E; Zentz, Emily B; Shanmugam, K T; Ingram, Lonnie O

2012-04-01

Escherichia coli KO11 (ATCC 55124) was engineered in 1990 to produce ethanol by chromosomal insertion of the Zymomonas mobilis pdc and adhB genes into E. coli W (ATCC 9637). KO11FL, our current laboratory version of KO11, and its parent E. coli W were sequenced, and contigs assembled into genomic sequences using optical NcoI restriction maps as templates. E. coli W contained plasmids pRK1 (102.5 kb) and pRK2 (5.4 kb), but KO11FL only contained pRK2. KO11FL optical maps made with AflII and with BamHI showed a tandem repeat region, consisting of at least 20 copies of a 10-kb unit. The repeat region was located at the insertion site for the pdc, adhB, and chloramphenicol-resistance genes. Sequence coverage of these genes was about 25-fold higher than average, consistent with amplification of the foreign genes that were inserted as circularized DNA. Selection for higher levels of chloramphenicol resistance originally produced strains with higher pdc and adhB expression, and hence improved fermentation performance, by increasing the gene copy number. Sequence data for an earlier version of KO11, ATCC 55124, indicated that multiple copies of pdc adhB were present. Comparison of the W and KO11FL genomes showed large inversions and deletions in KO11FL, mostly enabled by IS10, which is absent from W but present at 30 sites in KO11FL. The early KO11 strain ATCC 55124 had no rearrangements, contained only one IS10, and lacked most accumulated single nucleotide polymorphisms (SNPs) present in KO11FL. Despite rearrangements and SNPs in KO11FL, fermentation performance was equal to that of ATCC 55124.
Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

PubMed

Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

1993-02-01

An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls such as cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified together with part of the adjacent intron, which contains a mixed simple repeat sequence. Using this primer combination we were able to amplify MHC-DRB exon 2 and the adjacent intron from all 10 investigated species of the family Bovidae and from giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used to distinguish sequences that differ by as little as a single base. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions also serve for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements were explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species; however, higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage- and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces, with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678
Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

PubMed Central

Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

2012-01-01

The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697
Association between the dopamine D4 receptor gene exon III variable number of tandem repeats and political attitudes in female Han Chinese

PubMed Central

Ebstein, Richard P.; Monakhov, Mikhail V.; Lu, Yunfeng; Jiang, Yushi; Lai, Poh San; Chew, Soo Hong

2015-01-01

Twin and family studies suggest that political attitudes are partially determined by an individual's genotype. The dopamine D4 receptor gene (DRD4) exon III repeat region that has been extensively studied in connection with human behaviour, is a plausible candidate to contribute to individual differences in political attitudes. A first United States study provisionally identified this gene with political attitude along a liberal–conservative axis albeit contingent upon number of friends. In a large sample of 1771 Han Chinese university students in Singapore, we observed a significant main effect of association between the DRD4 exon III variable number of tandem repeats and political attitude. Subjects with two copies of the 4-repeat allele (4R/4R) were significantly more conservative. Our results provided evidence for a role of the DRD4 gene variants in contributing to individual differences in political attitude particularly in females and more generally suggested that associations between individual genes, and neurochemical pathways, contributing to traits relevant to the social sciences can be provisionally identified. PMID:26246555
Association between the dopamine D4 receptor gene exon III variable number of tandem repeats and political attitudes in female Han Chinese.

PubMed

Ebstein, Richard P; Monakhov, Mikhail V; Lu, Yunfeng; Jiang, Yushi; Lai, Poh San; Chew, Soo Hong

2015-08-22

Twin and family studies suggest that political attitudes are partially determined by an individual's genotype. The dopamine D4 receptor gene (DRD4) exon III repeat region that has been extensively studied in connection with human behaviour, is a plausible candidate to contribute to individual differences in political attitudes. A first United States study provisionally identified this gene with political attitude along a liberal-conservative axis albeit contingent upon number of friends. In a large sample of 1771 Han Chinese university students in Singapore, we observed a significant main effect of association between the DRD4 exon III variable number of tandem repeats and political attitude. Subjects with two copies of the 4-repeat allele (4R/4R) were significantly more conservative. Our results provided evidence for a role of the DRD4 gene variants in contributing to individual differences in political attitude particularly in females and more generally suggested that associations between individual genes, and neurochemical pathways, contributing to traits relevant to the social sciences can be provisionally identified. © 2015 The Author(s).
The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

PubMed Central

Ohno, S; Epplen, J T

1983-01-01

Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
Disease-associated repeat instability and mismatch repair.

PubMed

Schmidt, Monika H M; Pearson, Christopher E

2016-02-01

Expanded tandem repeat sequences in DNA are associated with at least 40 human genetic neurological, neurodegenerative, and neuromuscular diseases. Repeat expansion can occur during parent-to-offspring transmission, and arise at variable rates in specific tissues throughout the life of an affected individual. Since the ongoing somatic repeat expansions can affect disease age-of-onset, severity, and progression, targeting somatic expansion holds potential as a therapeutic target. Thus, understanding the factors that regulate this mutation is crucial. DNA repair, in particular mismatch repair (MMR), is the major driving force of disease-associated repeat expansions. In contrast to its anti-mutagenic roles, mammalian MMR curiously drives the expansion mutations of disease-associated (CAG)·(CTG) repeats. Recent advances have broadened our knowledge of both the MMR proteins involved in disease repeat expansions, including: MSH2, MSH3, MSH6, MLH1, PMS2, and MLH3, as well as the types of repeats affected by MMR, now including: (CAG)·(CTG), (CGG)·(CCG), and (GAA)·(TTC) repeats. Mutagenic slipped-DNA structures have been detected in patient tissues, and the size of the slip-out and their junction conformation can determine the involvement of MMR. Furthermore, the formation of other unusual DNA and R-loop structures is proposed to play a key role in MMR-mediated instability. A complex correlation is emerging between tissues showing varying amounts of repeat instability and MMR expression levels. Notably, naturally occurring polymorphic variants of DNA repair genes can have dramatic effects upon the levels of repeat instability, which may explain the variation in disease age-of-onset, progression and severity. An increasing grasp of these factors holds prognostic and therapeutic potential. Copyright © 2015 Elsevier B.V. All rights reserved.
[Association of aggressive behaviors of schizophrenia with short tandem repeats loci].

PubMed

Yang, Chun; Ba, Huajie; Tan, Xingqi; Zhao, Hanqing; Zhang, Shuyou; Yu, Haiying

2017-12-10

To assess the association of short tandem repeat (STR) loci with aggressive behaviors in schizophrenia. Blood samples from 123 schizophrenic patients with aggressive behaviors and 489 schizophrenic patients without aggressive behaviors were collected. DNA from all samples was amplified with a PowerPlex 21 system and separated by electrophoresis to determine the genotypes and allelic frequencies of 20 STR loci: D3S1358, D1S1656, D6S1043, D13S317, Penta E, D16S539, D18S51, D2S1338, CSF1PO, Penta D, TH01, vWA, D21S11, D7S820, D5S818, TPOX, D8S1179, D12S391, D19S433, and FGA. All 20 STR loci were in Hardy-Weinberg equilibrium in both groups. A significant difference between the two groups was found in the allelic and genotypic frequencies of the Penta D locus (alleles: P=0.042; genotypes: P=0.014) but not of the remaining 19 loci (P>0.05). Univariate analysis also showed a significant difference between the two groups for allele 10 and genotype 10-12 of Penta D (P=0.0027, P=0.0001), with ORs of 1.81 (95%CI: 1.22-2.67) and 4.33 (95%CI: 1.95-9.59), respectively. Penta D may be associated with aggressive behaviors in schizophrenia. Allele 10 and genotype 10-12 of Penta D may confer a risk for the disease.
Inheritance patterns of ATCCT repeat interruptions in spinocerebellar ataxia type 10 (SCA10) expansions.

PubMed

Landrian, Ivette; McFarland, Karen N; Liu, Jilin; Mulligan, Connie J; Rasmussen, Astrid; Ashizawa, Tetsuo

2017-01-01

Spinocerebellar ataxia type 10 (SCA10), an autosomal dominant cerebellar ataxia disorder, is caused by a non-coding ATTCT microsatellite repeat expansion in the ataxin 10 gene. In a subset of SCA10 families, the 5'-end of the repeat expansion contains a complex sequence of penta- and heptanucleotide interruption motifs which is followed by a pure tract of tandem ATCCT repeats of unknown length at its 3'-end. Intriguingly, expansions that carry these interruption motifs correlate with an epileptic seizure phenotype and are unstable despite the theory that interruptions are expected to stabilize expanded repeats. To examine the apparent contradiction of unstable, interruption-positive SCA10 expansion alleles and to determine whether the instability originates outside of the interrupted region, we sequenced approximately 1 kb of the 5'-end of SCA10 expansions using the ATCCT-PCR product in individuals across multiple generations from four SCA10 families. We found that the greatest instability within this region occurred in paternal transmissions of the allele in stretches of pure ATTCT motifs while the intervening interrupted sequences were stable. Overall, the ATCCT interruption changes by only one to three repeat units and therefore cannot account for the instability across the length of the disease allele. We conclude that the AT-rich interruptions locally stabilize the SCA10 expansion at the 5'-end but do not completely abolish instability across the entire span of the expansion. In addition, analysis of the interruption alleles across these families support a parsimonious single origin of the mutation with a shared distant ancestor.
Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.

PubMed

Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

2017-01-01

Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs, and they show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variance (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. The developed microsatellite markers add value to date palm characterization and can be used by researchers in population genetics, cultivar identification, and genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI) Sequence Read Archive (accession number LIBGSS_039019).

Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

USDA-ARS's Scientific Manuscript database

Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Developmental Validation of Short Tandem Repeat Reagent Kit for Forensic DNA Profiling of Canine Biological Materials

PubMed Central

Dayton, Melody; Koskinen, Mikko T; Tom, Bradley K; Mattila, Anna-Maria; Johnston, Eric; Halverson, Joy; Fantin, Dennis; DeNise, Sue; Budowle, Bruce; Smith, David Glenn; Kanthaswamy, Sree

2009-01-01

Aim To develop a reagent kit that enables multiplex polymerase chain reaction (PCR) amplification of 18 short tandem repeats (STR) and the canine sex-determining Zinc Finger marker. Methods Validation studies to determine the robustness and reliability in forensic DNA typing of this multiplex assay included sensitivity testing, reproducibility studies, intra- and inter-locus color balance studies, annealing temperature and cycle number studies, peak height ratio determination, characterization of artifacts such as stutter percentages and dye blobs, mixture analyses, species-specificity, case type samples analyses and population studies. Results The kit robustly amplified domesticated dog samples and consistently generated full 19-locus profiles from as little as 125 pg of dog DNA. In addition, wolf DNA samples could be analyzed with the kit. Conclusion The kit, which produces robust, reliable, and reproducible results, will be made available for the forensic research community after modifications based on this study’s evaluation to comply with the quality standards expected for forensic casework. PMID:19480022
The association of 22 Y chromosome short tandem repeat loci with initiative-aggressive behavior.

PubMed

Yang, Chun; Ba, Huajie; Zhang, Wei; Zhang, Shuyou; Zhao, Hanqing; Yu, Haiying; Gao, Zhiqin; Wang, Binbin

2018-05-15

Aggressive behavior represents an important public concern and a clinical challenge to behaviorists and psychiatrists. Aggression in humans is known to have an important genetic basis, so to investigate the association of Y chromosome short tandem repeat (Y-STR) loci with initiative-aggressive behavior, we compared allelic and haplotypic distributions of 22 Y-STRs in a group of Chinese males convicted of premeditated extremely violent crimes (n = 271) with a normal control group (n = 492). Allelic distributions of DYS533 and DYS437 loci differed significantly between the two groups (P < 0.05). The case group had higher frequencies of DYS533 allele 14, DYS437 allele 14, and haplotypes 11-14 of DYS533-DYS437 compared with the control group. Additionally, the DYS437 allele 15 frequency was significantly lower in cases than controls. No frequency differences were observed in the other 20 Y-STR loci between these two groups. Our results indicate a genetic role for Y-STR loci in the development of initiative aggression in non-psychiatric subjects. Copyright © 2018 Elsevier B.V. All rights reserved.
Entropic fluctuations in DNA sequences

NASA Astrophysics Data System (ADS)

Thanos, Dimitrios; Li, Wentian; Provata, Astero

2018-03-01

The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that, despite this reduction of information, LSE allows meaningful information to be extracted for the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided, and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
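The block-wise entropy idea in this abstract can be illustrated with a minimal Python sketch. It assumes that the entropy of a block is computed from single-nucleotide frequencies and that blocks do not overlap; the abstract does not state the authors' exact definition, and the function names `shannon_entropy`, `local_shannon_entropy` and `lse_fluctuation` are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance


def shannon_entropy(block: str) -> float:
    """Shannon entropy (in bits) of the symbol frequencies in one block."""
    counts = Counter(block)
    n = len(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def local_shannon_entropy(seq: str, block_size: int) -> list[float]:
    """Entropy of each consecutive, non-overlapping block (assumed layout)."""
    return [
        shannon_entropy(seq[i:i + block_size])
        for i in range(0, len(seq) - block_size + 1, block_size)
    ]


def lse_fluctuation(seq: str, block_size: int) -> float:
    """Population variance of the block entropies along the sequence."""
    return pvariance(local_shannon_entropy(seq, block_size))
```

Under these assumptions, a pure tandem array such as `"ACGT" * 10` analysed with a block size of 4 gives identical entropy in every block, so its fluctuation is zero, which is in line with the abstract's remark that tandem repeat regions show low LSE fluctuations.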
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants

PubMed Central

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, for each species statistical analyses were performed to calculate the relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
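The definition of a perfect SSR given in this abstract (a motif of 1–6 nucleotides repeated in tandem) can be sketched with a regular-expression scan in Python. This is an illustration only, not the mining pipeline used for ChloroSSRdb; the minimum repeat counts in `MIN_REPEATS` and the function name `find_perfect_ssrs` are assumptions chosen for the example.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect SSRs with motifs of 1-6 nt."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each tract is reported only under its shortest motif.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

For example, `find_perfect_ssrs("GG" + "AT" * 7 + "CC" + "A" * 12 + "G")` reports the dinucleotide tract at position 2 and the mononucleotide tract at position 18. Imperfect and compound SSRs, which the database also covers, would need additional logic that is not sketched here.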
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants.

PubMed

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1-6 nucleotides. These repeats are ubiquitous and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, for each species statistical analyses were performed to calculate the relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ © The Author(s) 2014. Published by Oxford University Press.
Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

PubMed Central

Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

2013-01-01

Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed a genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus × domestica) genomes showed mei to have the highest density of SSRs, higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy-one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, on the basis of which the macro-colinearity between the mei genome and the Prunus T×E reference map was identified. The framework map of mei constructed here provides a first step towards subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

PubMed Central

Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

2011-01-01

Computational genomics is one of the important tools for understanding the distribution of simple sequence repeats (SSRs) in closely related genomes, which gives valuable information regarding genetic variation. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species, which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that trinucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented trinucleotide SSRs in genomic and coding regions might be responsible for generating functional variation in the expressed proteins, which in turn may lead to differences in pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins among Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
High Genetic Diversity Revealed by Variable-Number Tandem Repeat Genotyping and Analysis of hsp65 Gene Polymorphism in a Large Collection of “Mycobacterium canettii” Strains Indicates that the M. tuberculosis Complex Is a Recently Emerged Clone of “M. canettii”

PubMed Central

Fabre, Michel; Koeck, Jean-Louis; Le Flèche, Philippe; Simon, Fabrice; Hervé, Vincent; Vergnaud, Gilles; Pourcel, Christine

2004-01-01

We have analyzed, using complementary molecular methods, the diversity of 43 strains of “Mycobacterium canettii” originating from the Republic of Djibouti, on the Horn of Africa, from 1998 to 2003. Genotyping by multiple-locus variable-number tandem repeat analysis shows that all the strains belong to a single but very distant group when compared to strains of the Mycobacterium tuberculosis complex (MTBC). Thirty-one strains cluster into one large group with little variability and five strains form another group, whereas the other seven are more diverged. In total, 14 genotypes are observed. The DR locus analysis reveals additional variability, some strains being devoid of a direct repeat locus and others having unique spacers. The hsp65 gene polymorphism was investigated by restriction enzyme analysis and sequencing of PCR amplicons. Four new single nucleotide polymorphisms were discovered. One strain was characterized by three nucleotide changes in 441 bp, creating new restriction enzyme polymorphisms. As no sequence variability was found for hsp65 in the whole MTBC, and as a single point mutation separates M. tuberculosis from the closest “M. canettii” strains, this diversity within “M. canettii” subspecies strongly suggests that it is the most probable source species of the MTBC rather than just another branch of the MTBC. PMID:15243089
Conservation of human chromosome 13 polymorphic microsatellite (CA)n repeats in chimpanzees

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deka, R.; Shriver, M.D.; Yu, L.M.

Tandemly repeated (dC-dA)n·(dG-dT)n sequences occur abundantly and are found in most eukaryotic genomes. To investigate the level of conservation of these repeat sequences in nonhuman primates, the authors have analyzed seven human chromosome 13 dinucleotide (CA)n repeat loci in chimpanzees by DNA amplification using primers designed for analysis of human loci. Comparable levels of polymorphism at these loci in the two species, revealed by the number of alleles, heterozygosity, and allele sizes, suggest that the (CA)n repeat arrays and their genomic locations are highly conserved. Even though the proportion of shared alleles between the two species varies enormously and the modal alleles are not the same, allelic lengths at each locus in the chimpanzees are detected within the bounds of the allele size range observed in humans. A similar observation has been noted in a limited number of gorillas and orangutans. Using a new measure of genetic distance that takes into account the size of alleles, they have compared the genetic distance between humans and chimpanzees. The genetic distance between these two species was found to be ninefold smaller than expected assuming there is no selection or mutational bias toward retention of (CA)n repeat arrays. These findings suggest a functional significance for these microsatellite loci. 34 refs., 1 fig., 2 tabs.
Peptide Analysis Using Tandem Mass Spectrometry

DTIC Science & Technology

1989-06-01

... to give pyroglutamic acid during storage, eliminating ammonia. It is almost absent in the spectrum of a freshly-prepared sample and is not seen in ... The objective of the project was to determine the complete amino acid sequence of the large polypeptide ... Ubiquitin by use of fast atom bombardment (FAB) ionization and tandem mass spectrometry. The peptide containing 76 amino acid residues was available ...
Multiple-Locus Variable-Number Tandem Repeat Analysis of Dutch Bordetella pertussis Strains Reveals Rapid Genetic Changes with Clonal Expansion during the Late 1990s

PubMed Central

Schouls, Leo M.; van der Heide, Han G. J.; Vauterin, Luc; Vauterin, Paul; Mooi, Frits R.

2004-01-01

Bordetella pertussis, the causative agent of whooping cough, has remained endemic in The Netherlands despite extensive nationwide vaccination since 1953. In the 1990s, several epidemic periods have resulted in many cases of pertussis. We have proposed that strain variation has played a major role in the upsurges of this disease in The Netherlands. Therefore, molecular characterization of strains is important in identifying the causes of pertussis epidemiology. For this reason, we have developed a multiple-locus variable-number tandem repeat analysis (MLVA) typing system for B. pertussis. By combining the MLVA profile with the allelic profile based on multiple-antigen sequence typing, we were able to further differentiate strains. The relationships between the various genotypes were visualized by constructing a minimum spanning tree. MLVA of Dutch strains of B. pertussis revealed that the genotypes of the strains isolated in the prevaccination period were diverse and clearly distinct from the strains isolated in the 1990s. Furthermore, there was a decrease in diversity in the strains from the late 1990s, with a remarkable clonal expansion that coincided with the epidemic periods. Using this genotyping, we have been able to show that B. pertussis is much more dynamic than expected. PMID:15292152
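The minimum spanning tree mentioned in this abstract can be illustrated with a small sketch. It assumes a simple categorical distance (the number of loci at which two MLVA profiles differ) and uses Prim's algorithm; the strain names, the number of loci and the repeat counts below are hypothetical, and the abstract does not state which distance measure or software the authors used.

```python
def profile_distance(a, b):
    """Number of VNTR loci at which two MLVA profiles differ (categorical distance)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same loci")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Build a minimum spanning tree with Prim's algorithm.

    profiles: dict mapping a strain name to a tuple of repeat counts.
    Returns a list of (strain_in_tree, newly_linked_strain, distance) edges.
    """
    names = sorted(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Cheapest edge leaving the current tree; ties are broken by name order.
        dist, u, v = min(
            (profile_distance(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical profiles: repeat counts at five made-up loci.
example_profiles = {
    "strain_A": (7, 6, 0, 7, 6),
    "strain_B": (7, 6, 0, 8, 6),
    "strain_C": (7, 5, 0, 8, 6),
    "strain_D": (9, 5, 1, 8, 4),
}
mst_edges = minimum_spanning_tree(example_profiles)
for u, v, dist in mst_edges:
    print(f"{u} -- {v} (differs at {dist} loci)")
```

Treating each locus as a category, rather than weighting by the size of the repeat-count difference, is one common choice for MLVA data; a stepwise weighting would only require changing `profile_distance`.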
Non-RVD mutations that enhance the dynamics of the TAL repeat array along the superhelical axis improve TALEN genome editing efficacy

PubMed Central

Tochio, Naoya; Umehara, Kohei; Uewaki, Jun-ichi; Flechsig, Holger; Kondo, Masaharu; Dewa, Takehisa; Sakuma, Tetsushi; Yamamoto, Takashi; Saitoh, Takashi; Togashi, Yuichi; Tate, Shin-ichi

2016-01-01

Transcription activator-like effector (TALE) nuclease (TALEN) is widely used as a tool in genome editing. The DNA binding part of TALEN consists of a tandem array of TAL-repeats that form a right-handed superhelix. Each TAL-repeat recognises a specific base by the repeat variable diresidue (RVD) at positions 12 and 13. TALEN comprising the TAL-repeats with periodic mutations to residues at positions 4 and 32 (non-RVD sites) in each repeat (VT-TALE) exhibits increased efficacy in genome editing compared with a counterpart without the mutations (CT-TALE). The molecular basis for the elevated efficacy is unknown. In this report, comparison of the physicochemical properties between CT- and VT-TALEs revealed that VT-TALE has a larger amplitude motion along the superhelical axis (superhelical motion) compared with CT-TALE. The greater superhelical motion in VT-TALE enabled more TAL-repeats to engage in the target sequence recognition compared with CT-TALE. The extended sequence recognition by the TAL-repeats improves site specificity with limiting the spatial distribution of FokI domains to facilitate their dimerization at the desired site. Molecular dynamics simulations revealed that the non-RVD mutations alter inter-repeat hydrogen bonding to amplify the superhelical motion of VT-TALE. The TALEN activity is associated with the inter-repeat hydrogen bonding among the TAL repeats. PMID:27883072
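The statement that each TAL-repeat reads one base through the residues at positions 12 and 13 can be sketched as a lookup. The mapping below is the commonly cited canonical RVD code (NI, HD, NG, NN); the 34-residue repeat strings are illustrative examples and are not taken from this study, and real RVD specificities are less strict than a one-to-one table suggests.

```python
# Commonly cited RVD-to-base preferences (NN is also known to tolerate A).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}


def rvd_of_repeat(repeat_seq):
    """Return the residues at positions 12 and 13 (1-based) of one TAL repeat."""
    if len(repeat_seq) < 13:
        raise ValueError("repeat is too short to contain positions 12 and 13")
    return repeat_seq[11:13]


def predicted_target(rvds):
    """Translate a list of RVDs into the DNA bases they are expected to prefer."""
    return "".join(RVD_TO_BASE.get(rvd, "N") for rvd in rvds)


# Illustrative repeats that differ only at positions 12 and 13.
illustrative_repeats = [
    "LTPEQVVAIASNIGGKQALETVQRLLPVLCQAHG",
    "LTPEQVVAIASHDGGKQALETVQRLLPVLCQAHG",
    "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG",
    "LTPEQVVAIASNNGGKQALETVQRLLPVLCQAHG",
]
rvds = [rvd_of_repeat(r) for r in illustrative_repeats]
print(rvds, predicted_target(rvds))
```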
Multi-laboratory validation study of multilocus variable-number tandem repeat analysis (MLVA) for Salmonella enterica serovar Enteritidis, 2015

PubMed Central

Peters, Tansy; Bertrand, Sophie; Björkman, Jonas T; Brandal, Lin T; Brown, Derek J; Erdõsi, Tímea; Heck, Max; Ibrahem, Salha; Johansson, Karin; Kornschober, Christian; Kotila, Saara M; Le Hello, Simon; Lienemann, Taru; Mattheus, Wesley; Nielsen, Eva Møller; Ragimbeau, Catherine; Rumore, Jillian; Sabol, Ashley; Torpdahl, Mia; Trees, Eija; Tuohy, Alma; de Pinna, Elizabeth

2017-01-01

Multilocus variable-number tandem repeat analysis (MLVA) is a rapid and reproducible typing method that is an important tool for investigation, as well as detection, of national and multinational outbreaks of a range of food-borne pathogens. Salmonella enterica serovar Enteritidis is the most common Salmonella serovar associated with human salmonellosis in the European Union/European Economic Area and North America. Fourteen laboratories from 13 countries in Europe and North America participated in a validation study for MLVA of S. Enteritidis targeting five loci. Following normalisation of fragment sizes using a set of reference strains, a blinded set of 24 strains with known allele sizes was analysed by each participant. The S. Enteritidis 5-loci MLVA protocol was shown to produce internationally comparable results as more than 90% of the participants reported less than 5% discrepant MLVA profiles. All 14 participating laboratories performed well, even those where experience with this typing method was limited. The raw fragment length data were consistent throughout, and the inter-laboratory validation helped to standardise the conversion of raw data to repeat numbers with at least two countries updating their internal procedures. However, differences in assigned MLVA profiles remain between well-established protocols and should be taken into account when exchanging data. PMID:28277220
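The conversion of raw fragment lengths to repeat numbers that this validation study helped to standardise can be sketched as follows. This is a minimal illustration, not the protocol used by the participating laboratories: the flank length, repeat unit length and tolerance are hypothetical parameters, and the fragment size is assumed to have been normalised already.

```python
def repeat_number(fragment_bp, flank_bp, unit_bp, tolerance_bp=1.0):
    """Convert a normalised fragment length into a whole number of repeats.

    fragment_bp:  normalised fragment length in base pairs
    flank_bp:     combined length of the non-repeat flanking sequence (hypothetical)
    unit_bp:      length of one repeat unit (hypothetical)
    tolerance_bp: largest accepted deviation from a whole number of units

    Returns None when the fragment does not fit a whole number of repeats.
    """
    raw = (fragment_bp - flank_bp) / unit_bp
    nearest = round(raw)
    if nearest < 0 or abs(raw - nearest) * unit_bp > tolerance_bp:
        return None
    return nearest


# Hypothetical locus: 100 bp of flanking sequence and a 7 bp repeat unit.
print(repeat_number(135.4, flank_bp=100, unit_bp=7))  # fits five repeats
print(repeat_number(138.5, flank_bp=100, unit_bp=7))  # falls between alleles
```

Returning `None` instead of rounding silently makes ambiguous fragments visible, which is where differences between laboratory protocols are most likely to produce discrepant profiles.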
Multi-laboratory validation study of multilocus variable-number tandem repeat analysis (MLVA) for Salmonella enterica serovar Enteritidis, 2015.

PubMed

Peters, Tansy; Bertrand, Sophie; Björkman, Jonas T; Brandal, Lin T; Brown, Derek J; Erdõsi, Tímea; Heck, Max; Ibrahem, Salha; Johansson, Karin; Kornschober, Christian; Kotila, Saara M; Le Hello, Simon; Lienemann, Taru; Mattheus, Wesley; Nielsen, Eva Møller; Ragimbeau, Catherine; Rumore, Jillian; Sabol, Ashley; Torpdahl, Mia; Trees, Eija; Tuohy, Alma; de Pinna, Elizabeth

2017-03-02

Multilocus variable-number tandem repeat analysis (MLVA) is a rapid and reproducible typing method that is an important tool for investigation, as well as detection, of national and multinational outbreaks of a range of food-borne pathogens. Salmonella enterica serovar Enteritidis is the most common Salmonella serovar associated with human salmonellosis in the European Union/European Economic Area and North America. Fourteen laboratories from 13 countries in Europe and North America participated in a validation study for MLVA of S. Enteritidis targeting five loci. Following normalisation of fragment sizes using a set of reference strains, a blinded set of 24 strains with known allele sizes was analysed by each participant. The S. Enteritidis 5-loci MLVA protocol was shown to produce internationally comparable results as more than 90% of the participants reported less than 5% discrepant MLVA profiles. All 14 participating laboratories performed well, even those where experience with this typing method was limited. The raw fragment length data were consistent throughout, and the inter-laboratory validation helped to standardise the conversion of raw data to repeat numbers with at least two countries updating their internal procedures. However, differences in assigned MLVA profiles remain between well-established protocols and should be taken into account when exchanging data. This article is copyright of The Authors, 2017.
Multi-locus variable number tandem repeat analysis of 7th pandemic Vibrio cholerae

PubMed Central

2012-01-01

Background Seven pandemics of cholera have been recorded since 1817, with the current and ongoing pandemic affecting almost every continent. Cholera remains endemic in developing countries and is still a significant public health issue. In this study we use multilocus variable number of tandem repeats (VNTRs) analysis (MLVA) to discriminate between isolates of the 7th pandemic clone of Vibrio cholerae. Results MLVA of six VNTRs selected from previously published data distinguished 66 V. cholerae isolates collected between 1961–1999 into 60 unique MLVA profiles. Only 4 MLVA profiles consisted of more than 2 isolates. The discriminatory power was 0.995. Phylogenetic analysis showed that, except for the closely related profiles, the relationships derived from MLVA profiles were in conflict with that inferred from Single Nucleotide Polymorphism (SNP) typing. The six SNP groups share consensus VNTR patterns and two SNP groups contained isolates which differed by only one VNTR locus. Conclusions MLVA is highly discriminatory in differentiating 7th pandemic V. cholerae isolates and MLVA data was most useful in resolving the genetic relationships among isolates within groups previously defined by SNPs. Thus MLVA is best used in conjunction with SNP typing in order to best determine the evolutionary relationships among the 7th pandemic V. cholerae isolates and for longer term epidemiological typing. PMID:22624829
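The discriminatory power quoted in this abstract is presumably Simpson's index of diversity as applied to typing methods by Hunter and Gaston; the abstract does not name the formula, so that is an assumption. A minimal sketch of that index, using made-up profile labels rather than the study's data:

```python
from collections import Counter


def discriminatory_index(profile_labels):
    """Simpson's index of diversity for a typing scheme.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number of
    isolates sharing profile j and N is the total number of isolates.
    """
    n = len(profile_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(profile_labels).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Made-up example: four isolates, two of which share a profile.
print(discriminatory_index(["p1", "p1", "p2", "p3"]))
```

The index is the probability that two isolates drawn at random without replacement have different profiles, so it equals 1 when every isolate has a unique profile and 0 when all isolates share one.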
Genetic markers, genotyping methods & next generation sequencing in Mycobacterium tuberculosis

PubMed Central

Desikan, Srinidhi; Narayanan, Sujatha

2015-01-01

Molecular epidemiology (ME) is one of the main areas in tuberculosis research which is widely used to study the transmission epidemics and outbreaks of tubercle bacilli. It exploits the presence of various polymorphisms in the genome of the bacteria that can be widely used as genetic markers. Many DNA typing methods apply these genetic markers to differentiate various strains and to study the evolutionary relationships between them. The three widely used genotyping tools to differentiate Mycobacterium tuberculosis strains are IS6110 restriction fragment length polymorphism (RFLP), spacer oligotyping (Spoligotyping), and mycobacterial interspersed repeat units - variable number of tandem repeats (MIRU-VNTR). A new prospect towards ME was introduced with the development of whole genome sequencing (WGS) and the next generation sequencing (NGS) methods, where the entire genome is sequenced that not only helps in pointing out minute differences between the various sequences but also saves time and the cost. NGS is also found to be useful in identifying single nucleotide polymorphisms (SNPs), comparative genomics and also various aspects about transmission dynamics. These techniques enable the identification of mycobacterial strains and also facilitate the study of their phylogenetic and evolutionary traits. PMID:26205019
Cis-acting regulatory sequences promote high-frequency gene conversion between repeated sequences in mammalian cells.

PubMed

Raynard, Steven J; Baker, Mark D

2004-01-01

In mammalian cells, little is known about the nature of recombination-prone regions of the genome. Previously, we reported that the immunoglobulin heavy chain (IgH) mu locus behaved as a hotspot for mitotic, intrachromosomal gene conversion (GC) between repeated mu constant (Cmu) regions in mouse hybridoma cells. To investigate whether elements within the mu gene regulatory region were required for hotspot activity, gene targeting was used to delete a 9.1 kb segment encompassing the mu gene promoter (Pmu), enhancer (Emu) and switch region (Smu) from the locus. In these cell lines, GC between the Cmu repeats was significantly reduced, indicating that this 'recombination-enhancing sequence' (RES) is necessary for GC hotspot activity at the IgH locus. Importantly, the RES fragment stimulated GC when appended to the same Cmu repeats integrated at ectopic genomic sites. We also show that deletion of Emu and flanking matrix attachment regions (MARs) from the RES abolishes GC hotspot activity at the IgH locus. However, no stimulation of ectopic GC was observed with the Emu/MARs fragment alone. Finally, we provide evidence that no correlation exists between the level of transcription and GC promoted by the RES. We suggest a model whereby Emu/MARS enhances mitotic GC at the endogenous IgH mu locus by effecting chromatin modifications in adjacent DNA.
Ligand binding by repeat proteins: natural and designed

PubMed Central

Grove, Tijana Z; Cortajarena, Aitziber L; Regan, Lynne

2012-01-01

Repeat proteins contain tandem arrays of small structural motifs. As a consequence of this architecture, they adopt non-globular, extended structures that present large, highly specific surfaces for ligand binding. Here we discuss recent advances toward understanding the functional role of this unique modular architecture. We showcase specific examples of natural repeat proteins interacting with diverse ligands and also present examples of designed repeat protein–ligand interactions. PMID:18602006

Genome Wide Characterization of Short Tandem Repeat Markers in Sweet Orange (Citrus sinensis)

PubMed Central

Biswas, Manosh Kumar; Xu, Qiang; Mayer, Christoph; Deng, Xiuxin

2014-01-01

Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community. PMID:25148383
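A genome-wide SSR survey of this kind starts by locating perfect tandem repeats of short motifs. The sketch below uses regular expressions to do that on a toy sequence; it is not the pipeline used in the study, and the minimum copy numbers per motif length are hypothetical thresholds chosen only for illustration.

```python
import re

# Hypothetical minimum copy numbers per motif length (2 to 6 bp).
MIN_COPIES = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    return not any(
        len(motif) % k == 0 and motif == motif[:k] * (len(motif) // k)
        for k in range(1, len(motif))
    )


def find_ssrs(sequence, min_copies=MIN_COPIES):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in a DNA string."""
    seq = sequence.upper()
    hits = []
    for unit_len, copies in sorted(min_copies.items()):
        # A motif of unit_len bases followed by at least copies - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: an (AT)8 dinucleotide repeat and a (GAA)5 trinucleotide repeat.
toy_sequence = "GGC" + "AT" * 8 + "CCG" + "GAA" * 5 + "TT"
ssrs = find_ssrs(toy_sequence)
for start, motif, copies in ssrs:
    print(f"position {start}: ({motif}){copies}")
```

The `is_primitive` filter keeps a run such as (AT)8 from also being reported as (ATAT)4. Dividing the number of hits by the assembly length in Mb would give a density figure comparable in kind to the SSRs/Mbp value reported in the abstract.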
Genome wide characterization of short tandem repeat markers in sweet orange (Citrus sinensis).

PubMed

Biswas, Manosh Kumar; Xu, Qiang; Mayer, Christoph; Deng, Xiuxin

2014-01-01

Sweet orange (Citrus sinensis) is one of the major cultivated and most-consumed citrus species. With the goal of enhancing the genomic resources in citrus, we surveyed, developed and characterized microsatellite markers in the ≈347 Mb sequence assembly of the sweet orange genome. A total of 50,846 SSRs were identified with a frequency of 146.4 SSRs/Mbp. Dinucleotide repeats are the most frequent repeat class and the highest density of SSRs was found in chromosome 4. SSRs are non-randomly distributed in the genome and most of the SSRs (62.02%) are located in the intergenic regions. We found that AT-rich SSRs are more frequent than GC-rich SSRs. A total number of 21,248 SSR primers were successfully developed, which represents 89 SSR markers per Mb of the genome. A subset of 950 developed SSR primer pairs were synthesized and tested by wet lab experiments on a set of 16 citrus accessions. In total we identified 534 (56.21%) polymorphic SSR markers that will be useful in citrus improvement. The number of amplified alleles ranges from 2 to 12 with an average of 4 alleles per marker and an average PIC value of 0.75. The newly developed sweet orange primer sequences, their in silico PCR products, exact position in the genome assembly and putative function are made publicly available. We present the largest number of SSR markers ever developed for a citrus species. Almost two thirds of the markers are transferable to 16 citrus relatives and may be used for constructing a high density linkage map. In addition, they are valuable for marker-assisted selection studies, population structure analyses and comparative genomic studies of C. sinensis with other citrus related species. Altogether, these markers provide a significant contribution to the citrus research community.
The complete mitochondrial genome sequence of the Tibetan red fox (Vulpes vulpes montana).

PubMed

Zhang, Jin; Zhang, Honghai; Zhao, Chao; Chen, Lei; Sha, Weilai; Liu, Guangshuai

2015-01-01

In this study, the complete mitochondrial genome of the Tibetan red fox (Vulpes vulpes montana) was sequenced for the first time using blood samples obtained from a wild female red fox captured from Lhasa in Tibet, China. The Qinghai-Tibet Plateau is the highest plateau in the world, with an average elevation above 3500 m. Sequence analysis showed that the genome contains the 12S rRNA gene, the 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region (CR). The variable tandem repeats in the CR are the main reason for the length variability of the mitochondrial genome among canid animals.
DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

NASA Astrophysics Data System (ADS)

Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

1984-08-01

A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.
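The repeat structure described here, a four-residue unit repeated 23 times, can be written out and checked with a short sketch. The one-letter codes follow the standard amino acid alphabet; the flanking residues used in the example are hypothetical and serve only to show that the run is detected inside a longer sequence.

```python
ONE_LETTER = {"proline": "P", "asparagine": "N", "alanine": "A"}

# Repeat unit in the order given in the abstract.
unit = "".join(
    ONE_LETTER[name] for name in ("proline", "asparagine", "alanine", "asparagine")
)
repeat_region = unit * 23  # 23 tandem copies


def longest_tandem_run(sequence, repeat_unit):
    """Largest number of back-to-back copies of repeat_unit found in sequence."""
    if not repeat_unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        while sequence.startswith(repeat_unit, start + copies * len(repeat_unit)):
            copies += 1
        best = max(best, copies)
    return best


# Hypothetical flanking residues around the repeat region.
toy_protein = "MK" + repeat_region + "GS"
print(unit, len(repeat_region), longest_tandem_run(toy_protein, unit))
```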
Assessing the 5S ribosomal RNA heterogeneity in Arabidopsis thaliana using short RNA next generation sequencing data.

PubMed

Szymanski, Maciej; Karlowski, Wojciech M

2016-01-01

In eukaryotes, ribosomal 5S rRNAs are products of multigene families organized within clusters of tandemly repeated units. Accumulation of genomic data obtained from a variety of organisms demonstrated that the potential 5S rRNA coding sequences show a large number of variants, often incompatible with folding into a correct secondary structure. Here, we present results of an analysis of a large set of short RNA sequences generated by the next generation sequencing techniques, to address the problem of heterogeneity of the 5S rRNA transcripts in Arabidopsis and identification of potentially functional rRNA-derived fragments.
Crystal structure of tandem type III fibronectin domains from Drosophila neuroglian at 2.0 Å.

PubMed

Huber, A H; Wang, Y M; Bieber, A J; Bjorkman, P J

1994-04-01

We report the crystal structure of two adjacent fibronectin type III repeats from the Drosophila neural cell adhesion molecule neuroglian. Each domain consists of two antiparallel beta sheets and is folded topologically identically to single fibronectin type III domains from the extracellular matrix proteins tenascin and fibronectin. Beta bulges and left-handed polyproline II helices disrupt the regular beta sheet structure of both neuroglian domains. The hydrophobic interdomain interface includes a metal-binding site, presumably involved in stabilizing the relative orientation between domains and predicted by sequence comparison to be present in the vertebrate homolog molecule L1. The neuroglian domains are related by a near perfect 2-fold screw axis along the longest molecular dimension. Using this relationship, a model for arrays of tandem fibronectin type III repeats in neuroglian and other molecules is proposed.
Length variation and sequence divergence in mitochondrial control region of Schizothoracine (Teleostei: Cyprinidae) species.

PubMed

Syed, Mudasir Ahmad; Bhat, Farooz Ahmad; Balkhi, Masood-ul Hassan; Bhat, Bilal Ahmad

2016-01-01

Schizothoracine fish, commonly called snow trouts, inhabit the entire network of snow- and spring-fed cool waters of Kashmir, India. Of over 10 species reported earlier, only five have been found: Schizothorax niger, Schizothorax esocinus, Schizothorax plagiostomus, Schizothorax curvifrons and Schizothorax labiatus. Reports on the relationships among these species are contradictory. To understand the evolutionary relationships of these species, we examined the sequence information of the mitochondrial D-loop of 25 individuals representing the five species. Sequence alignment showed the D-loop region to be highly variable, and length variation was observed in the dinucleotide (TA)n microsatellite between and within species. Interestingly, in all these species the (TA)n microsatellite is not associated with longer tandem repeats at the 3' end of the mitochondrial control region and does not show heteroplasmy. Our analysis also indicates the presence of four conserved sequence blocks (CSBs), CSB-D, CSB-I, CSB-II and CSB-III, four termination-associated sequence (TAS) motifs and a 15 bp pyrimidine block within the mitochondrial control region, which are highly conserved within the genus Schizothorax when compared with other species. The phylogenetic analyses carried out by maximum likelihood (ML), neighbor joining (NJ) and Bayesian inference (BI) generated almost identical results. The resultant BI tree showed a close genetic relationship among all five species and supports two distinct groupings of S. esocinus. Besides the species relationships, the presence of length variation in tandem repeats is attributed to differences in the predicted stability of secondary structures. The role of CSBs and TASs, reported so far as main regulatory signals, would explain the conservation of these elements in evolution.
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740
Short tandem repeat DNA typing provides an international reference standard for authentication of human cell lines.

PubMed

Dirks, Wilhelm Gerhard; Faehnrich, Silke; Estella, Isabelle Annick Janine; Drexler, Hans Guenter

2005-01-01

Cell lines have wide applications as model systems in the medical and pharmaceutical industry. Much drug and chemical testing is now first carried out exhaustively on in vitro systems, reducing the need for complicated and invasive animal experiments. The basis for any research, development or production program involving cell lines is the choice of an authentic cell line. Microsatellites in the human genome that harbour short tandem repeat (STR) DNA markers allow individualisation of established cell lines at the DNA level. Fluorescence polymerase chain reaction amplification of eight highly polymorphic microsatellite STR loci plus gender determination was found to be the best tool to screen the uniqueness of DNA profiles in a fingerprint database. Our results demonstrate that cross-contamination and misidentification remain chronic problems in the use of human continuous cell lines. The combination of rapidly generated DNA types based on single-locus STR and their authentication or individualisation by screening the fingerprint database constitutes a highly reliable and robust method for the identification and verification of cell lines.
The profile of repeat-associated histone lysine methylation states in the mouse epigenome

PubMed Central

Martens, Joost H A; O'Sullivan, Roderick J; Braunschweig, Ulrich; Opravil, Susanne; Radolf, Martin; Steinlein, Peter; Jenuwein, Thomas

2005-01-01

Histone lysine methylation has been shown to index silenced chromatin regions at, for example, pericentric heterochromatin or of the inactive X chromosome. Here, we examined the distribution of repressive histone lysine methylation states over the entire family of DNA repeats in the mouse genome. Using chromatin immunoprecipitation in a cluster analysis representing repetitive elements, our data demonstrate the selective enrichment of distinct H3-K9, H3-K27 and H4-K20 methylation marks across tandem repeats (e.g. major and minor satellites), DNA transposons, retrotransposons, long interspersed nucleotide elements and short interspersed nucleotide elements. Tandem repeats, but not the other repetitive elements, give rise to double-stranded (ds) RNAs that are further elevated in embryonic stem (ES) cells lacking the H3-K9-specific Suv39h histone methyltransferases. Importantly, although H3-K9 tri- and H4-K20 trimethylation appear stable at the satellite repeats, many of the other repeat-associated repressive marks vary in chromatin of differentiated ES cells or of embryonic trophoblasts and fibroblasts. Our data define a profile of repressive histone lysine methylation states for the repetitive complement of four distinct mouse epigenomes and suggest tandem repeats and dsRNA as primary triggers for more stable chromatin imprints. PMID:15678104
Antihypertensive activity of transgenic rice seed containing an 18-repeat novokinin peptide localized in the nucleolus of endosperm cells.

PubMed

Wakasa, Yuhya; Zhao, Hui; Hirose, Sakiko; Yamauchi, Daiki; Yamada, Yuko; Yang, Lijun; Ohinata, Kousaku; Yoshikawa, Masaaki; Takaiwa, Fumio

2011-09-01

Novokinin (Arg-Pro-Leu-Lys-Pro-Trp, RPLKPW) is a new potent antihypertensive peptide based on the sequence of ovokinin (2-7) derived from ovalbumin. We previously generated transgenic rice seeds in which eight novokinin peptides were fused to storage protein glutelins (GluA2 and GluC) for expression. Oral administration of these seeds to spontaneously hypertensive rats (SHRs) reduced systolic blood pressures at a dose of 1 g seed/kg of SHR. Here, 10- or 18-tandem repeats of novokinin with an endoplasmic reticulum (ER) retention signal (Lys-Asp-Glu-Leu, KDEL) at the C terminus were directly expressed in rice under the control of the glutelin promoter containing its signal peptide. Only small amounts of the 18-repeat novokinin accumulated, and it was unexpectedly deposited in the nucleolus. This abnormal intracellular localization was explained by an endogenous signal for nuclear localization. A GFP reporter protein fused to this sequence was targeted to nuclei in a transient assay using onion epidermal cells. Transgenic seed expressing the 18-repeat novokinin exhibited significantly higher antihypertensive activity after a single oral dose to SHR even at one-quarter the amount (0.25 g/kg) of the transgenic rice seed expressing the fusion construct, even though its novokinin content was much lower (1/5). Furthermore, in a long-term administration for 5 weeks, an even smaller dose (0.0625 g/kg) of transgenic seeds could confer antihypertensive activity. This high antihypertensive activity may be attributed to differences in digestibility of expressed products by gastrointestinal enzymes and the unique intracellular localization. These results indicate that accumulation of novokinin as a tandemly repeated structure in transgenic rice is more effective than as a fusion-type structure. © 2010 The Authors. Plant Biotechnology Journal © 2010 Society for Experimental Biology and Blackwell Publishing Ltd.
Sequence analysis of the pyruvylated galactan sulfate-derived oligosaccharides by negative-ion electrospray tandem mass spectrometry.

PubMed

Li, Na; Mao, Wenjun; Liu, Xue; Wang, Shuyao; Xia, Zheng; Cao, Sujian; Li, Lin; Zhang, Qi; Liu, Shan

2016-10-04

Five sulfated oligosaccharide fragments, F1-F5, were prepared from a pyruvylated galactan sulfate from the green alga Codium divaricatum, by partial depolymerization using mild acid hydrolysis and purification with gel-permeation chromatography. Negative-ion electrospray tandem mass spectrometry with collision-induced dissociation (ES-CID-MS/MS) is attempted for sequence determination of the sulfated oligosaccharides. The sequence of F1 with homogeneous disaccharide composition was first characterized to be Galp-(4SO4)-(1 → 3)-Galp by detailed nuclear magnetic resonance spectroscopic analyses. The fragmentation pattern of F1 in the product ion spectra was established on the basis of negative-ion ES-CID MS/MS, which was then applied to sequence analysis of other sulfated oligosaccharides. The sequences of F2 and F3 were deduced to be Galp-(4SO4)-(1 → 3)-Galp-(1 → 3)-Galp-(1 → 3)-Galp and 3,4-O-(1-carboxyethylidene)-Galp-(6SO4)-(1 → 3)-Galp, respectively. The sequences of major fragments in F4 and F5 were also deduced. The investigation demonstrated that negative-ion ES-CID-MS/MS was an efficient method for the sequence analysis of the pyruvylated galactan sulfate-derived oligosaccharides which revealed the patterns of substitution and glycosidic linkages. The pyruvylated galactan sulfate-derived oligosaccharides were novel sulfated oligosaccharides different from other algal polysaccharide-derived oligosaccharides. Copyright © 2016 Elsevier Ltd. All rights reserved.
Molecular basis of length polymorphism in the human zeta-globin gene complex.

PubMed Central

Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

1983-01-01

The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.

PubMed

Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao

2017-01-01

The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
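As a rough illustration of how SSRs can be located in EST sequences, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The thresholds (motifs of 2 to 6 bases, at least 4 copies), the function name `find_ssrs`, and the example sequence are assumptions made for illustration; the chapter's own search criteria are not given in the abstract.

```python
import re

# Assumed thresholds: motif of 2-6 bases, repeated at least 4 times in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")

def find_ssrs(sequence):
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

# Made-up sequence containing (AC)4 and (GAT)5.
example_est = "TTG" + "AC" * 4 + "GGC" + "GAT" * 5 + "C"
print(find_ssrs(example_est))  # [(3, 'AC', 4), (14, 'GAT', 5)]
```

A lazy motif group is used so that the shortest repeating unit is reported; imperfect or compound repeats would need a dedicated tool.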
Characterization of genetic sequence variation of 58 STR loci in four major population groups.

PubMed

Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

2016-11-01

Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
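The difference between alleles identified by length and alleles identified by sequence can be sketched as follows. The repeat-region strings, the 4-base motif length, and the function name `count_alleles` are hypothetical; this is not the STRait Razor workflow, only an illustration of why sequencing resolves more alleles than sizing.

```python
def count_alleles(repeat_regions, motif_length=4):
    """Count distinct alleles by nominal length and by full sequence."""
    # Length-based call: number of repeat units, as capillary electrophoresis would size it.
    by_length = {len(seq) // motif_length for seq in repeat_regions}
    # Sequence-based call: every distinct repeat-region sequence is its own allele.
    by_sequence = set(repeat_regions)
    return len(by_length), len(by_sequence)

# Hypothetical observations: three 3-repeat isoalleles and one 4-repeat allele.
observed = ["TCTA" * 3, "TCTA" * 2 + "TCTG", "TCTG" + "TCTA" * 2, "TCTA" * 4]
print(count_alleles(observed))  # (2, 4)
```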
Identification of apple cultivars on the basis of simple sequence repeat markers.

PubMed

Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y

2014-09-12

DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers were not used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers for separating individual plants. The CID was designed so that a polymorphic marker generated from each PCR directly allowed cultivar samples to be separated at each step. Therefore, it could be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributed to studies on germplasm resources and the seedling industry in fruit trees.
Interleukin-1 Receptor Antagonist and Interleukin-4 Genes Variable Number Tandem Repeats Are Associated with Adiposity in Malaysian Subjects

PubMed Central

Kok, Yung-Yean; Ong, Hing-Huat

2017-01-01

Interleukin-1 receptor antagonist (IL1RA) intron 2 86 bp repeat and interleukin-4 (IL4) intron 3 70 bp repeat are variable number tandem repeats (VNTRs) that have been associated with various diseases, but their role in obesity is elusive. The objective of this study was to investigate the association of IL1RA and IL4 VNTRs with obesity and adiposity in 315 Malaysian subjects (128 M/187 F; 23 Malays/251 ethnic Chinese/41 ethnic Indians). The allelic distributions of IL1RA and IL4 were significantly different among ethnicities, and the alleles were associated with total body fat (TBF) classes. Individuals with IL1RA I/II genotype or allele II had greater risk of having higher overall adiposity, relative to those having the I/I genotype or I allele, respectively, even after controlling for ethnicity [Odds Ratio (OR) of I/II genotype = 12.21 (CI = 2.54, 58.79; p = 0.002); II allele = 5.78 (CI = 1.73, 19.29; p = 0.004)]. However, IL4 VNTR B2 allele was only significantly associated with overall adiposity status before adjusting for ethnicity [OR = 1.53 (CI = 1.04, 2.23; p = 0.03)]. Individuals with IL1RA II allele had significantly higher TBF than those with I allele (31.79 ± 2.52 versus 23.51 ± 0.40; p = 0.005). Taken together, IL1RA intron 2 VNTR seems to be a genetic marker for overall adiposity status in Malaysian subjects. PMID:28293435
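For readers unfamiliar with how an odds ratio and its confidence interval are derived, here is a minimal sketch for an unadjusted 2x2 table using the standard log-odds (Woolf) interval. The counts are invented, and the function name `odds_ratio` is an assumption; the ORs in the abstract were adjusted for ethnicity, which requires a regression model not shown here.

```python
import math

def odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with an approximate 95% confidence interval.

    a, b: allele carriers and non-carriers in the high-adiposity group
    c, d: allele carriers and non-carriers in the comparison group
    """
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - z * se)
    high = math.exp(math.log(estimate) + z * se)
    return estimate, low, high

# Invented counts, for illustration only.
print(odds_ratio(10, 20, 5, 40))
```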
Interleukin-1 Receptor Antagonist and Interleukin-4 Genes Variable Number Tandem Repeats Are Associated with Adiposity in Malaysian Subjects.

PubMed

Kok, Yung-Yean; Ong, Hing-Huat; Say, Yee-How

2017-01-01

Interleukin-1 receptor antagonist (IL1RA) intron 2 86 bp repeat and interleukin-4 (IL4) intron 3 70 bp repeat are variable number tandem repeats (VNTRs) that have been associated with various diseases, but their role in obesity is elusive. The objective of this study was to investigate the association of IL1RA and IL4 VNTRs with obesity and adiposity in 315 Malaysian subjects (128 M/187 F; 23 Malays/251 ethnic Chinese/41 ethnic Indians). The allelic distributions of IL1RA and IL4 were significantly different among ethnicities, and the alleles were associated with total body fat (TBF) classes. Individuals with IL1RA I/II genotype or allele II had greater risk of having higher overall adiposity, relative to those having the I/I genotype or I allele, respectively, even after controlling for ethnicity [Odds Ratio (OR) of I/II genotype = 12.21 (CI = 2.54, 58.79; p = 0.002); II allele = 5.78 (CI = 1.73, 19.29; p = 0.004)]. However, IL4 VNTR B2 allele was only significantly associated with overall adiposity status before adjusting for ethnicity [OR = 1.53 (CI = 1.04, 2.23; p = 0.03)]. Individuals with IL1RA II allele had significantly higher TBF than those with I allele (31.79 ± 2.52 versus 23.51 ± 0.40; p = 0.005). Taken together, IL1RA intron 2 VNTR seems to be a genetic marker for overall adiposity status in Malaysian subjects.
Sub-typing of extended-spectrum-β-lactamase-producing isolates from a nosocomial outbreak: application of a 10-loci generic Escherichia coli multi-locus variable number tandem repeat analysis.

PubMed

Karami, Nahid; Helldal, Lisa; Welinder-Olsson, Christina; Ahrén, Christina; Moore, Edward R B

2013-01-01

Extended-spectrum β-lactamase producing Escherichia coli (ESBL-E. coli) were isolated from infants hospitalized in a neonatal, post-surgery ward during a four-month-long nosocomial outbreak and six-month follow-up period. A multi-locus variable number tandem repeat analysis (MLVA), using 10 loci (GECM-10), for 'generic' (i.e., non-STEC) E. coli was applied for sub-species-level (i.e., sub-typing) delineation and characterization of the bacterial isolates. Ten distinct GECM-10 types were detected among 50 isolates, correlating with the types defined by pulsed-field gel electrophoresis (PFGE), which is recognized to be the 'gold-standard' method for clinical epidemiological analyses. Multi-locus sequence typing (MLST), multiplex PCR genotyping of bla CTX-M, bla TEM, bla OXA and bla SHV genes and antibiotic resistance profiling, as well as a PCR assay specific for detecting isolates of the pandemic O25b-ST131 strain, further characterized the outbreak isolates. Two clusters of isolates with distinct GECM-10 types (G06-04 and G07-02), corresponding to two major PFGE types and the MLST-based sequence types (STs) 131 and 1444, respectively, were confirmed to be responsible for the outbreak. The application of GECM-10 sub-typing provided reliable, rapid and cost-effective epidemiological characterizations of the ESBL-producing isolates from a nosocomial outbreak that correlated with and may be used to replace the laborious PFGE protocol for analyzing generic E. coli.
Isolation and mapping of telomeric pentanucleotide (TAACC)n repeats of the Pacific whiteleg shrimp, Penaeus vannamei, using fluorescence in situ hybridization.

PubMed

Alcivar-Warren, Acacia; Meehan-Meola, Dawn; Wang, Yongping; Guo, Ximing; Zhou, Linghua; Xiang, Jianhai; Moss, Shaun; Arce, Steve; Warren, William; Xu, Zhenkang; Bell, Kireina

2006-01-01

To develop genetic and physical maps for shrimp, accurate information on the actual number of chromosomes and a large number of genetic markers is needed. Previous reports have shown two different chromosome numbers for the Pacific whiteleg shrimp, Penaeus vannamei, the most important penaeid shrimp species cultured in the Western hemisphere. Preliminary results obtained by direct sequencing of clones from a Sau3A-digested genomic library of P. vannamei ovary identified a large number of (TAACC/GGTTA)-containing SSRs. The objectives of this study were to (1) examine the frequency of (TAACC)n repeats in 662 P. vannamei genomic clones that were directly sequenced, and perform homology searches of these clones, (2) confirm the number of chromosomes in testis of P. vannamei, and (3) localize the TAACC repeats in P. vannamei chromosome spreads using fluorescence in situ hybridization (FISH). Results for objective 1 showed that 395 out of the 662 clones sequenced contained single or multiple SSRs with three or more repeat motifs, 199 of which contained variable tandem repeats of the pentanucleotide (TAACC/GGTTA)n, with 3 to 14 copies per sequence. The frequency of (TAACC)n repeats in P. vannamei is 4.68 kb for SSRs with five or more repeat motifs. Sequence comparisons using the BLASTN nonredundant and expressed sequence tag (EST) databases indicated that most of the TAACC-containing clones were similar to either the core pentanucleotide repeat in PVPENTREP locus (GenBank accession no. X82619) or portions of 28S rRNA. Transposable elements (transposase for Tn1000 and reverse transcriptase family members), hypothetical or unnamed protein products, and genes of known function such as 18S and 28S rRNAs, heat shock protein 70, and thrombospondin were identified in non-TAACC-containing clones. For objective 2, the meiotic chromosome number of P. vannamei was confirmed as N = 44. For objective 3, four FISH probes (P1 to P4) containing different numbers of TAACC repeats produced

Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

PubMed Central

Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

1985-01-01

The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815
A Repeat Look at Repeating Patterns

ERIC Educational Resources Information Center

Markworth, Kimberly A.

2016-01-01

A "repeating pattern" is a cyclical repetition of an identifiable core. Children in the primary grades usually begin pattern work with fairly simple patterns, such as AB, ABC, or ABB patterns. The unique letters represent unique elements, whereas the sequence of letters represents the core that is repeated. Based on color, shape,…
Horseradish peroxidase-labeled oligonucleotides and fluorescent tyramides for rapid detection of chromosome-specific repeat sequences.

PubMed

van Gijlswijk, R P; Wiegant, J; Vervenne, R; Lasan, R; Tanke, H J; Raap, A K

1996-01-01

We present a sensitive and rapid fluorescence in situ hybridization (FISH) strategy for detecting chromosome-specific repeat sequences. It uses horseradish peroxidase (HRP)-labeled oligonucleotide sequences in combination with fluorescent tyramide-based detection. After in situ hybridization, the HRP conjugated to the oligonucleotide probe is used to deposit fluorescently labeled tyramide molecules at the site of hybridization. The method features full chemical synthesis of probes, strong FISH signals, and short processing periods, as well as multicolor capabilities.
Evidence for human meiotic recombination interference obtained through construction of a short tandem repeat-polymorphism linkage map of chromosome 19

PubMed Central

Weber, James L.; Wang, Zhenyuan; Hansen, Kevin; Stephenson, Matt; Kappel, Clarisse; Salzman, Sherry; Wilkie, Patricia J.; Keats, Bronya; Dracopoli, Nicholas C.; Brandriff, Brigitte F.; Olsen, Anne S.

1993-01-01

An improved linkage map for human chromosome 19 containing 35 short tandem repeat polymorphisms (STRPs) and one VNTR (D19S20) was constructed. The map included 12 new (GATA)n tetranucleotide STRPs. Although total lengths of the male (114 cM) and female (128 cM) maps were similar, at both ends of the chromosome male recombination exceeded female recombination, while in the interior portion of the map female recombination was in excess. Cosmid clones containing the STRP sequences were identified and were positioned along the chromosome by fluorescent in situ hybridization. Four rounds of careful checking and removal of genotyping errors allowed biologically relevant conclusions to be made concerning the numbers and distributions of recombination events on chromosome 19. The average numbers of recombinations per chromosome matched closely the lengths of the genetic maps computed by using the program CRIMAP. Significant numbers of chromosomes with zero, one, two, or three recombinations were detected as products of both female and male meioses. On the basis of the total number of observed pairs of recombination events in which only a single informative marker was situated between the two recombinations, a maximal estimate for the rate of meiotic STRP “gene” conversion without recombination was calculated as 3 × 10^-4/meiosis. For distances up to 30 cM between recombinations, many fewer chromosomes which had undergone exactly two recombinations were observed than were expected on the basis of the assumption of independent recombination locations. This strong new evidence for human meiotic interference will help to improve the accuracy of interpretation of clinical DNA test results involving polymorphisms flanking a genetic abnormality. PMID:8213834
Subtyping of a Large Collection of Historical Listeria monocytogenes Strains from Ontario, Canada, by an Improved Multilocus Variable-Number Tandem-Repeat Analysis (MLVA)

PubMed Central

Saleh-Lakha, S.; Allen, V. G.; Li, J.; Pagotto, F.; Odumeru, J.; Taboada, E.; Lombos, M.; Tabing, K. C.; Blais, B.; Ogunremi, D.; Downing, G.; Lee, S.; Gao, A.; Nadon, C.

2013-01-01

Listeria monocytogenes is responsible for severe and often fatal food-borne infections in humans. A collection of 2,421 L. monocytogenes isolates originating from Ontario's food chain between 1993 and 2010, along with Ontario clinical isolates collected from 2004 to 2010, was characterized using an improved multilocus variable-number tandem-repeat analysis (MLVA). The MLVA method was established based on eight primer pairs targeting seven variable-number tandem-repeat (VNTR) loci in two 4-plex fluorescent PCRs. Diversity indices and amplification rates of the individual VNTR loci ranged from 0.38 to 0.92 and from 0.64 to 0.99, respectively. MLVA types and pulsed-field gel electrophoresis (PFGE) patterns were compared using Comparative Partitions analysis involving 336 clinical and 99 food and environmental isolates. The analysis yielded Simpson's diversity index values of 0.998 and 0.992 for MLVA and PFGE, respectively, and adjusted Wallace coefficients of 0.318 when MLVA was used as a primary subtyping method and 0.088 when PFGE was a primary typing method. Statistical data analysis using BioNumerics allowed for identification of at least 8 predominant and persistent L. monocytogenes MLVA types in Ontario's food chain. The MLVA method correctly clustered epidemiologically related outbreak strains and separated unrelated strains in a subset analysis. An MLVA database was established for the 2,421 L. monocytogenes isolates, which allows for comparison of data among historical and new isolates of different sources. The subtyping method coupled with the MLVA database will help in effective monitoring/prevention approaches to identify environmental contamination by pathogenic strains of L. monocytogenes and investigation of outbreaks. PMID:23956391
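The Simpson's diversity index reported for the two typing methods can be computed from a list of subtype assignments. The sketch below uses the Hunter-Gaston form commonly applied to typing schemes; whether the authors used exactly this variant is an assumption, and the function name and labels are made up.

```python
from collections import Counter

def simpson_diversity(type_labels):
    """Probability that two isolates drawn without replacement have different types."""
    counts = Counter(type_labels).values()
    n = sum(counts)
    # D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))

# Four hypothetical isolates, two of which share an MLVA type.
print(round(simpson_diversity(["A", "A", "B", "C"]), 4))  # 0.8333
```

A value near 1, such as the 0.998 and 0.992 reported here, means that almost every randomly chosen pair of isolates is assigned to different types.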
Length and repeat-sequence variation in 58 STRs and 94 SNPs in two Spanish populations.

PubMed

Casals, Ferran; Anglada, Roger; Bonet, Núria; Rasal, Raquel; van der Gaag, Kristiaan J; Hoogenboom, Jerry; Solé-Morata, Neus; Comas, David; Calafell, Francesc

2017-09-01

We have genotyped the 58 STRs (27 autosomal, 24 Y-STRs and 7 X-STRs) and 94 autosomal SNPs in Illumina ForenSeq™ Primer Mix A in 88 Spanish Roma (Gypsy) samples and 143 Catalans. Since this platform is based on massively parallel sequencing, we have used simple R scripts to uncover the sequence variation in the repeat region. Thus, we have found, across 58 STRs, 541 length-based alleles, which, after considering repeat-sequence variation, became 804 different alleles. All loci in both populations were in Hardy-Weinberg equilibrium. F(ST) between both populations was 0.0178 for autosomal SNPs, 0.0146 for autosomal STRs, 0.0101 for X-STRs and 0.1866 for Y-STRs. Combined a priori statistics were quite large; for instance, pooling all the autosomal loci, the a priori probabilities of discriminating a suspect become 1-(2.3×10^-70) and 1-(5.9×10^-73), for Roma and Catalans respectively, and the chances of excluding a false father in a trio are 1-(2.6×10^-20) and 1-(2.0×10^-21). Copyright © 2017 Elsevier B.V. All rights reserved.
Allele Frequencies for 15 Short Tandem Repeat Loci in Representative Sample of Croatian Population

PubMed Central

Projić, Petar; Škaro, Vedrana; Šamija, Ivana; Pojskić, Naris; Durmić-Pašić, Adaleta; Kovačević, Lejla; Bakal, Narcisa; Primorac, Dragan; Marjanović, Damir

2007-01-01

Aim: To study the distribution of allele frequencies of 15 short tandem repeat (STR) loci in a representative sample of the Croatian population. Methods: A total of 195 unrelated Caucasian individuals born in Croatia, from 14 counties and the City of Zagreb, were sampled for the analysis. All the tested individuals were voluntary donors. Buccal swabs were used as the DNA source. AmpFlSTR® Identifiler® was applied to simultaneously amplify 15 STR loci. Total reaction volume was 12.5 μL. The polymerase chain reaction (PCR) amplification was carried out in a PE Gene Amp PCR System Thermal Cycler. Electrophoresis of the amplification products was performed on an ABI PRISM 3130 Genetic Analyzer. After PCR amplification and separation by electrophoresis, raw data were compiled and analyzed, and numerical allele designations of the profiles were obtained. Deviation from Hardy-Weinberg equilibrium, observed and expected heterozygosity, power of discrimination, and power of exclusion were calculated. Bonferroni’s correction was used before each comparative analysis. Results: We compared Croatian data with those obtained from geographically neighboring European populations. A significant difference (at P<0.01) in allele frequencies was recorded only between the Croatian and Slovenian populations for the vWA locus. There was no significant deviation from Hardy-Weinberg equilibrium for any of the observed loci. Conclusion: Obtained population data concurred with the expected “STR data frame” for this part of Europe. PMID:17696301
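Some of the per-locus statistics named in the methods can be sketched from a list of genotypes. The example below computes observed heterozygosity, expected heterozygosity (1 minus the sum of squared allele frequencies, without small-sample correction), and power of discrimination (1 minus the sum of squared genotype frequencies). The genotypes and the function name `locus_statistics` are hypothetical, and the study's own software and any corrections it applied are not described in the abstract.

```python
from collections import Counter

def locus_statistics(genotypes):
    """Return (observed het., expected het., power of discrimination) for one locus."""
    n = len(genotypes)
    # Treat (10, 11) and (11, 10) as the same genotype.
    normalized = [tuple(sorted(g)) for g in genotypes]
    observed_het = sum(a != b for a, b in normalized) / n
    allele_counts = Counter(allele for g in normalized for allele in g)
    expected_het = 1 - sum((c / (2 * n)) ** 2 for c in allele_counts.values())
    genotype_counts = Counter(normalized)
    power_of_discrimination = 1 - sum((c / n) ** 2 for c in genotype_counts.values())
    return observed_het, expected_het, power_of_discrimination

# Four made-up individuals typed at one STR locus (allele = repeat number).
sample = [(10, 11), (10, 10), (11, 12), (11, 10)]
print(locus_statistics(sample))  # (0.75, 0.59375, 0.625)
```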
Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter.

PubMed

Amirhaeri, S; Wohlrab, F; Wells, R D

1995-02-17

The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Repeats of base oligomers as the primordial coding sequences of the primeval earth and their vestiges in modern genes.

PubMed

Ohno, S

1984-01-01

Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either alpha-helical or beta-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the down-stream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
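The third property can be checked directly: when the repeating unit's length is not a multiple of 3, all three reading frames of the repeat encode the same periodic peptide, differing only in phase. The sketch below translates a repeat in each frame using the standard genetic code; the unit "GCAT" and the helper names are chosen for illustration and do not come from the paper.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame=0):
    """Translate complete codons of dna starting at the given frame offset."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

def frame_peptides(unit, copies=9):
    """Translate a tandem repeat of `unit` in all three forward reading frames."""
    dna = unit * copies
    return [translate(dna, frame) for frame in range(3)]

# 4-base unit: every frame yields the same 4-residue period, shifted in phase.
print(frame_peptides("GCAT"))
# 3-base unit: each frame yields a different homopolymer instead.
print(frame_peptides("GCA"))
```

Because each frame of the "GCAT" repeat reads the period ACMH at a different phase, a one-base insertion or deletion only shifts the phase of the downstream peptide rather than changing its composition.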
[Family-based association study of a variable number of tandem repeat polymorphism of DAT1 gene with Tourette syndrome in a Chinese Han population].

PubMed

Zheng, Lanlan; Han, Zhen-liang; Zhang, Xin-hua; Wang, Xue-qin; Jiang, Wei-hua; Yi, Ming-ji; Liu, Shi-guo

2013-10-01

To assess the association of a 40 bp variable number of tandem repeat (VNTR) polymorphism within the 3' untranslated region of the dopamine transporter gene (DAT1) with Tourette syndrome (TS) in a Chinese Han population. A total of 160 TS patients and their parents were recruited. The VNTR polymorphism was detected with polymerase chain reaction-VNTR analysis, and its association with TS and its subtypes was assessed through a family-based association study comprising transmission disequilibrium test (TDT) and haplotype relative risk (HRR) analysis. The repeat numbers at the DAT1 40 bp locus were 11, 10, 9, 7.5 and 7 among the patients and their parents, with the most common type being a 10-repeat allele. No significant association was detected between the polymorphism and TS (TDT: χ² = 0.472, df = 1, P = 0.583; HRR: χ² = 0.313, P = 0.576, OR = 0.855, 95% CI: 0.493-1.481). Our data suggest that the VNTR polymorphism of the DAT1 gene is not associated with susceptibility to TS in the Chinese Han population. However, our results need to be validated in larger sets of patients collected from other populations.
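The transmission disequilibrium test used here reduces to a McNemar-type chi-square on transmissions from heterozygous parents. A minimal sketch, assuming a biallelic comparison (for example, the 10-repeat allele versus all others) and invented counts; the function name `tdt` is hypothetical, and the study's actual transmission counts are not given in the abstract.

```python
import math

def tdt(transmitted, untransmitted):
    """TDT chi-square (df = 1) and p-value for one allele.

    transmitted / untransmitted: number of heterozygous parents who did or did
    not pass the allele of interest to the affected child.
    """
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Invented counts, for illustration only.
print(tdt(30, 20))
```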
Molecular typing of Salmonella enterica serovar typhi isolates from various countries in Asia by a multiplex PCR assay on variable-number tandem repeats.

PubMed

Liu, Yichun; Lee, May-Ann; Ooi, Eng-Eong; Mavis, Yeo; Tan, Ai-Ling; Quek, Hung-Hiang

2003-09-01

A multiplex PCR method incorporating primers flanking three variable-number tandem repeat (VNTR) loci (arbitrarily labeled TR1, TR2, and TR3) in the CT18 strain of Salmonella enterica serovar Typhi has been developed for molecular typing of S. enterica serovar Typhi clinical isolates from several Asian countries, including Singapore, Indonesia, India, Bangladesh, Malaysia, and Nepal. We have demonstrated that the multiplex PCR could be performed on crude cell lysates and that the VNTR banding profiles produced could be easily analyzed by visual inspection after conventional agarose gel electrophoresis. The assay was highly discriminative in identifying 49 distinct VNTR profiles among 59 individual isolates. A high level of VNTR profile heterogeneity was observed in isolates from within the same country and among countries. These VNTR profiles remained stable after the strains were passaged extensively under routine laboratory culture conditions. In contrast to the S. enterica serovar Typhi isolates, an absence of TR3 amplicons and a lack of length polymorphisms in TR1 and TR2 amplicons were observed for other S. enterica serovars, such as Salmonella enterica serovar Typhimurium, Salmonella enterica serovar Enteritidis, and Salmonella enterica serovar Paratyphi A, B, and C. DNA sequencing of the amplified VNTR regions substantiated these results, suggesting the high stability of the multiplex PCR assay. The multiplex-PCR-based VNTR profiling developed in this study provides a simple, rapid, reproducible, and high-resolution molecular tool for the epidemiological analysis of S. enterica serovar Typhi strains.
Substructure of a Tunisian Berber population as inferred from 15 autosomal short tandem repeat loci.

PubMed

Khodjet-El-Khil, Houssein; Fadhlaoui-Zid, Karima; Gusmão, Leonor; Alves, Cíntia; Benammar-Elgaaied, Amel; Amorim, Antonio

2008-08-01

Currently, language and cultural practices are the only criteria to distinguish between Berber autochthonous Tunisian populations. To evaluate these populations' possible genetic structure and differentiation, we have analyzed 15 autosomal short tandem repeat loci (CSF1PO, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, D21S11, FGA, TH01, TPOX, VWA, D2S1338, and D19S433) in three southern Tunisian Berber groups: Sened, Matmata, and Chenini-Douiret. The exact test of population differentiation based on allele frequencies at the 15 loci shows significant P values at 7 loci between Chenini-Douiret and both Sened and Matmata, whereas just 5 loci show significant P values between Sened and Matmata. Comparative analyses between the three Berber groups based on genetic distances show that P values for F(ST) distances are significant between the three Berber groups. Population analysis performed using Structure shows a clear differentiation between these Berber groups, with strong genetic isolation of Chenini-Douiret. These results confirm at the autosomal level the high degree of heterogeneity of Tunisian Berber populations that had been previously reported for uniparental markers.
DIFFERENTIATION OF SCHISTOSOMA HAEMATOBIUM FROM RELATED SCHISTOSOMES BY PCR AMPLIFYING AN INTER-REPEAT SEQUENCE

PubMed Central

ABBASI, IBRAHIM; KING, CHARLES H.; STURROCK, ROBERT F.; KARIUKI, CURTIS; MUCHIRI, ERIC; HAMBURGER, JOSEPH

2008-01-01

Schistosoma haematobium infects nearly 150 million people, primarily in Africa, and is transmitted by select species of local bulinid snails. These snails can host other related trematode species as well, so that effective detection and monitoring of snails infected with S. haematobium requires a successful differentiation between S. haematobium and any closely related schistosome species. To enable differential detection of S. haematobium DNA by simple polymerase chain reaction (PCR), we designed and tested primer pairs from numerous newly identified Schistosoma DNA repeat sequences. However, all pairs tested were found unsuitable for this purpose. Differentiation of S. haematobium from S. bovis, S. mattheei, S. curassoni, and S. intercalatum (but not from S. margrebowiei) was ultimately accomplished by PCR using one primer from a newly identified repeat, Sh110, and a second primer from a known schistosomal splice-leader sequence. For evaluation of residual S. haematobium transmission after control interventions, this differentiation tool will enable accurate monitoring of infected snails in areas where S. haematobium is sympatric with the most prevalent other schistosome species. PMID:17488921
Molecular typing of Argentinian Mycobacterium avium subsp. paratuberculosis isolates by multiple-locus variable number-tandem repeat analysis

PubMed Central

Gioffré, Andrea; Correa Muñoz, Magnolia; Alvarado Pinedo, María F.; Vaca, Roberto; Morsella, Claudia; Fiorentino, María Andrea; Paolicchi, Fernando; Ruybal, Paula; Zumárraga, Martín; Travería, Gabriel E.; Romano, María Isabel

2015-01-01

Multiple-locus variable number-tandem repeat analysis (MLVA) of Mycobacterium avium subspecies paratuberculosis (MAP) isolates may contribute to the knowledge of strain diversity in Argentina. Although the diversity of MAP has been previously investigated in Argentina using IS900-RFLP, a small number of isolates were employed, and a low discriminative power was reached. The aim of the present study was to test the genetic diversity among MAP isolates using an MLVA approach based on 8 repetitive loci. We studied 97 isolates from cattle, goat and sheep and could describe 7 different patterns: INMV1, INMV2, INMV11, INMV13, INMV16, INMV33 and one incomplete pattern. INMV1 and INMV2 were the most frequent patterns, grouping 76.3% of the isolates. We were also able to demonstrate the coexistence of genotypes in herds and co-infection at the organism level. This study shows that all the patterns described are common to those described in Europe, suggesting an epidemiological link between the continents. PMID:26273274
Application of Molecular Typing Results in Source Attribution Models: The Case of Multiple Locus Variable Number Tandem Repeat Analysis (MLVA) of Salmonella Isolates Obtained from Integrated Surveillance in Denmark.

PubMed

de Knegt, Leonardo V; Pires, Sara M; Löfström, Charlotta; Sørensen, Gitte; Pedersen, Karl; Torpdahl, Mia; Nielsen, Eva M; Hald, Tine

2016-03-01

Salmonella is an important cause of bacterial foodborne infections in Denmark. To identify the main animal-food sources of human salmonellosis, risk managers have relied on a routine application of a microbial subtyping-based source attribution model since 1995. In 2013, multiple locus variable number tandem repeat analysis (MLVA) substituted phage typing as the subtyping method for surveillance of S. Enteritidis and S. Typhimurium isolated from animals, food, and humans in Denmark. The purpose of this study was to develop a modeling approach applying a combination of serovars, MLVA types, and antibiotic resistance profiles for the Salmonella source attribution, and assess the utility of the results for the food safety decisionmakers. Full and simplified MLVA schemes from surveillance data were tested, and model fit and consistency of results were assessed using statistical measures. We conclude that loci schemes STTR5/STTR10/STTR3 for S. Typhimurium and SE9/SE5/SE2/SE1/SE3 for S. Enteritidis can be used in microbial subtyping-based source attribution models. Based on the results, we discuss that an adjustment of the discriminatory level of the subtyping method applied often will be required to fit the purpose of the study and the available data. The issues discussed are also considered highly relevant when applying, e.g., extended multi-locus sequence typing or next-generation sequencing techniques. © 2015 Society for Risk Analysis.
Core genome conservation of Staphylococcus haemolyticus limits sequence based population structure analysis.

PubMed

Cavanagh, Jorunn Pauline; Klingenberg, Claus; Hanssen, Anne-Merethe; Fredheim, Elizabeth Aarag; Francois, Patrice; Schrenzel, Jacques; Flægstad, Trond; Sollid, Johanna Ericson

2012-06-01

The notoriously multi-resistant Staphylococcus haemolyticus is an emerging pathogen causing serious infections in immunocompromised patients. Defining the population structure is important to detect outbreaks and spread of antimicrobial resistant clones. Currently, the standard typing technique is pulsed-field gel electrophoresis (PFGE). In this study we describe novel molecular typing schemes for S. haemolyticus using multi locus sequence typing (MLST) and multi locus variable number of tandem repeats (VNTR) analysis. Seven housekeeping genes (MLST) and five VNTR loci (MLVF) were selected for the novel typing schemes. A panel of 45 human and veterinary S. haemolyticus isolates was investigated. The collection had diverse PFGE patterns (38 PFGE types) and was sampled over a 20-year period from eight countries. MLST resolved 17 sequence types (Simpson's index of diversity [SID]=0.877) and MLVF resolved 14 repeat types (SID=0.831). We found low sequence diversity. Phylogenetic analysis clustered the isolates in three (MLST) and one (MLVF) clonal complexes, respectively. Taken together, neither the MLST nor the MLVF scheme was suitable to resolve the population structure of this S. haemolyticus collection. Future MLVF and MLST schemes will benefit from addition of more variable core genome sequences identified by comparing different fully sequenced S. haemolyticus genomes. Copyright © 2012 Elsevier B.V. All rights reserved.
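The abstract reports Simpson's index of diversity for each typing scheme but does not say which formulation was used. The following is a minimal sketch that assumes the Hunter-Gaston form often used in microbial typing, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)). The function name and the example type labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpsons_index_of_diversity(type_labels):
    """Probability that two isolates drawn without replacement differ in type.

    Assumes the Hunter-Gaston form: D = 1 - sum(n_j*(n_j-1)) / (N*(N-1)),
    where n_j is the number of isolates of type j and N the total.
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical type assignments for six isolates (illustration only).
example_labels = ["ST1", "ST1", "ST1", "ST2", "ST2", "ST3"]
example_sid = simpsons_index_of_diversity(example_labels)
```

Under this form a value of 0 means every isolate has the same type and a value of 1 means every isolate has a different type, so a higher value indicates a more discriminatory scheme for the collection at hand.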
Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

PubMed

Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

2013-04-01

Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multiple alleles of simple sequence repeat (SSR) markers are particularly informative and superior in genetic linkage map and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified in 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Trinucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of these EST-SSR markers to related species was also found.
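The abstract summarizes marker informativeness with PIC (polymorphism information content) values but does not give the formula. As an illustration, the sketch below assumes the widely used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at one marker. How allele frequencies were estimated in an autotetraploid is not stated in the abstract, so the function name and the input frequencies here are hypothetical.

```python
from itertools import combinations


def polymorphism_information_content(allele_freqs):
    """PIC for one marker from its allele frequencies (must sum to 1).

    Assumed formula: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for pairs of alleles that cannot be told apart
    # in some parental combinations.
    cross_term = sum(2 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical markers: two and four equally frequent alleles.
pic_two_alleles = polymorphism_information_content([0.5, 0.5])
pic_four_alleles = polymorphism_information_content([0.25] * 4)
```

With equally frequent alleles the value rises as the number of alleles grows, which is why multi-allelic SSR markers tend to score higher than bi-allelic ones.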
Single Amino Acid Repeats in the Proteome World: Structural, Functional, and Evolutionary Insights

PubMed Central

Kumar, Amitha Sampath; Sowpati, Divya Tej; Mishra, Rakesh K.

2016-01-01

Microsatellites or simple sequence repeats (SSR) are abundant, highly diverse stretches of short DNA repeats present in all genomes. Tandem mono/tri/hexanucleotide repeats in the coding regions contribute to single amino acid repeats (SAARs) in the proteome. While SSRs in the coding region always result in amino acid repeats, a majority of SAARs arise due to a combination of various codons representing the same amino acid and not as a consequence of SSR events. Certain amino acids are abundant in repeat regions indicating a positive selection pressure behind the accumulation of SAARs. By analysing 22 proteomes including the human proteome, we explored the functional and structural relationship of amino acid repeats in an evolutionary context. Only ~15% of repeats are present in any known functional domain, while ~74% of repeats are present in the disordered regions, suggesting that SAARs add to the functionality of proteins by providing flexibility and stability and by acting as linker elements between domains. Comparison of SAAR containing proteins across species reveals that while shorter repeats are conserved among orthologs, proteins with longer repeats, >15 amino acids, are unique to the respective organism. Lysine repeats are well conserved among orthologs with respect to their length and number of occurrences in a protein. Other amino acids such as glutamic acid, proline, serine and alanine repeats are generally conserved among the orthologs with varying repeat lengths. These findings suggest that SAARs have accumulated in the proteome under positive selection pressure and that they provide flexibility for optimal folding of functional/structural domains of proteins. The insights gained from our observations can help in effective designing and engineering of proteins with novel features. PMID:27893794
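To make the notion of a single amino acid repeat concrete, the sketch below scans a protein sequence for uninterrupted runs of one residue. The minimum run length of 5 is an assumption chosen for illustration, since the abstract does not state the threshold the authors used. The function name and the example sequence are hypothetical.

```python
import re


def find_single_amino_acid_repeats(protein, min_len=5):
    """Return (residue, start_index, run_length) for each run >= min_len.

    min_len is an illustrative threshold, not the one used in the study.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # ([A-Z]) captures one residue; \1{k,} requires k further copies of it.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein.upper())]


# Hypothetical sequence with a lysine run and a glutamine run.
example_runs = find_single_amino_acid_repeats("MKKKKKKAQQQQQPLS")
```

A scan of this kind only sees the protein level, so it cannot by itself tell whether a run is encoded by a pure codon repeat (an SSR) or by a mix of synonymous codons, which is the distinction the abstract draws.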
Targeting of Repeated Sequences Unique to a Gene Results in Significant Increases in Antisense Oligonucleotide Potency

PubMed Central

Vickers, Timothy A.; Freier, Susan M.; Bui, Huynh-Hoa; Watt, Andrew; Crooke, Stanley T.

2014-01-01

A new strategy for identifying potent RNase H-dependent antisense oligonucleotides (ASOs) is presented. Our analysis of the human transcriptome revealed that a significant proportion of genes contain unique repeated sequences of 16 or more nucleotides in length. Activities of ASOs targeting these repeated sites in several representative genes were compared to those of ASOs targeting unique single sites in the same transcript. Antisense activity at repeated sites was also evaluated in a highly controlled minigene system. Targeting both native and minigene repeat sites resulted in significant increases in potency as compared to targeting of non-repeated sites. The increased potency at these sites is a result of increased frequency of ASO/RNA interactions which, in turn, increases the probability of a productive interaction between the ASO/RNA heteroduplex and human RNase H1 in the cell. These results suggest a new, highly efficient strategy for rapid identification of highly potent ASOs. PMID:25334092
Transcription arrest by a G quadruplex forming-trinucleotide repeat sequence from the human c-myb gene.

PubMed

Broxson, Christopher; Beckett, Joshua; Tornaletti, Silvia

2011-05-17

Noncanonical DNA structures correspond to genomic regions particularly susceptible to genetic instability. The transcription process facilitates formation of these structures and plays a major role in generating the instability associated with these genomic sites. However, little is known about how noncanonical structures are processed when encountered by an elongating RNA polymerase. Here we have studied the behavior of T7 RNA polymerase (T7 RNAP) when encountering a G quadruplex-forming (GGA)4 repeat located in the human c-myb proto-oncogene. To make direct correlations between formation of the structure and effects on transcription, we have taken advantage of the ability of the T7 polymerase to transcribe single-stranded substrates and of G4 DNA to form in single-stranded G-rich sequences in the presence of potassium ions. Under physiological KCl concentrations, we found that T7 RNAP transcription was arrested at two sites that mapped to the c-myb (GGA)4 repeat sequence. The extent of arrest did not change with time, indicating that the c-myb repeat represented an absolute block and not a transient pause to T7 RNAP. Consistent with G4 DNA formation, arrest was not observed in the absence of KCl or in the presence of LiCl. Furthermore, mutations in the c-myb (GGA)4 repeat, expected to prevent transition to G4, also eliminated the transcription block. We show T7 RNAP arrest at the c-myb repeat in double-stranded DNA under conditions mimicking the cellular concentration of biomolecules and potassium ions, suggesting that the G4 structure formed in the c-myb repeat may represent a transcription roadblock in vivo. Our results support a mechanism of transcription-coupled DNA repair initiated by arrest of transcription at G4 structures.

The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.

PubMed

Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin

2013-01-01

Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
History of CRISPR-Cas from Encounter with a Mysterious Repeated Sequence to Genome Editing Technology.

PubMed

Ishino, Yoshizumi; Krupovic, Mart; Forterre, Patrick

2018-04-01

Clustered regularly interspaced short palindromic repeat (CRISPR)-Cas systems are well-known acquired immunity systems that are widespread in archaea and bacteria. The RNA-guided nucleases from CRISPR-Cas systems are currently regarded as the most reliable tools for genome editing and engineering. The first hint of their existence came in 1987, when an unusual repetitive DNA sequence, which subsequently was defined as a CRISPR, was discovered in the Escherichia coli genome during an analysis of genes involved in phosphate metabolism. Similar sequence patterns were then reported in a range of other bacteria as well as in halophilic archaea, suggesting an important role for such evolutionarily conserved clusters of repeated sequences. A critical step toward functional characterization of the CRISPR-Cas systems was the recognition of a link between CRISPRs and the associated Cas proteins, which were initially hypothesized to be involved in DNA repair in hyperthermophilic archaea. Comparative genomics, structural biology, and advanced biochemistry could then work hand in hand, not only culminating in the explosion of genome editing tools based on CRISPR-Cas9 and other class II CRISPR-Cas systems but also providing insights into the origin and evolution of this system from mobile genetic elements denoted casposons. To celebrate the 30th anniversary of the discovery of CRISPR, this minireview briefly discusses the fascinating history of CRISPR-Cas systems, from the original observation of an enigmatic sequence in E. coli to genome editing in humans. Copyright © 2018 American Society for Microbiology.
Complete sequence and gene organization of the mitochondrial genome of Asio flammeus (Strigiformes, Strigidae).

PubMed

Zhang, Yanan; Song, Tao; Pan, Tao; Sun, Xiaonan; Sun, Zhonglou; Qian, Lifu; Zhang, Baowei

2016-07-01

The complete sequence of the mitochondrial genome was determined for Asio flammeus, a species with a wide geographic distribution. The length of the complete mitochondrial genome was 18,966 bp, containing 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes (PCGs), and 1 non-coding region (D-loop). All the genes were distributed on the H-strand, except for the ND6 subunit gene and eight tRNA genes, which were encoded on the L-strand. The D-loop of A. flammeus contained many tandem repeats of varying lengths and repeat numbers. The molecular phylogeny placed A. flammeus as the sister group to A. capensis and supported the monophyly of Asio.
Identifying uniformly mutated segments within repeats.

PubMed

Sahinalp, S Cenk; Eichler, Evan; Goldberg, Paul; Berenbrink, Petra; Friedetzky, Tom; Ergun, Funda

2004-12-01

Given a long string of characters from a constant size alphabet we present an algorithm to determine whether its characters have been generated by a single i.i.d. random source. More specifically, consider all possible n-coin models for generating a binary string S, where each bit of S is generated via an independent toss of one of the n coins in the model. The choice of which coin to toss is decided by a random walk on the set of coins where the probability of a coin change is much lower than the probability of using the same coin repeatedly. We present a procedure to evaluate the likelihood of an n-coin model for given S, subject to a uniform prior distribution over the parameters of the model (that represent mutation rates and probabilities of copying events). In the absence of detailed prior knowledge of these parameters, the algorithm can be used to determine whether the a posteriori probability for n=1 is higher than for any other n>1. Our algorithm runs in time O(l^4 log l), where l is the length of S, through a dynamic programming approach which exploits the assumed convexity of the a posteriori probability for n. Our test can be used in the analysis of long alignments between pairs of genomic sequences in a number of ways. For example, functional regions in genome sequences exhibit much lower mutation rates than non-functional regions. Because our test provides means for determining variations in the mutation rate, it may be used to distinguish functional regions from non-functional ones. Another application is in determining whether two highly similar, thus evolutionarily related, genome segments are the result of a single copy event or of a complex series of copy events. This is particularly an issue in evolutionary studies of genome regions rich with repeat segments (especially tandemly repeated segments).
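The generative n-coin model described in this abstract can be sketched as a short simulation. This is only an illustration of the model being tested, not the authors' likelihood algorithm. The uniform choice of the next coin at a switch, the function name, and all parameter values are assumptions introduced here.

```python
import random


def generate_n_coin_string(length, biases, switch_prob, seed=None):
    """Simulate a binary string from an n-coin model.

    biases[i] is the probability that coin i yields 1 (for example, a
    mismatch in an alignment). Before each toss the model switches to a
    different coin with probability switch_prob; the new coin is drawn
    uniformly from the others, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    coin = rng.randrange(len(biases))
    bits = []
    for _ in range(length):
        if len(biases) > 1 and rng.random() < switch_prob:
            coin = rng.choice([c for c in range(len(biases)) if c != coin])
        bits.append(1 if rng.random() < biases[coin] else 0)
    return bits


# Hypothetical two-coin model: a slowly and a rapidly mutating regime.
simulated = generate_n_coin_string(200, [0.02, 0.30], switch_prob=0.01, seed=1)
```

With a small switch probability the output consists of long stretches that share one mutation rate, and the test described in the abstract asks whether a single coin (n=1) explains the string better than any model with more coins.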
Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

PubMed

Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

2014-04-08

The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.
Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

PubMed

Oggioni, M R; Claverys, J P

1999-10-01

A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.
Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

PubMed

Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

2012-08-01

Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs), of which 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
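The motif groups written in this abstract as AT/TA or AAT/ATA/TAA are sets of cyclic rotations of one repeat unit, since the same tandem array can be read starting at any position within the unit. A minimal sketch of that grouping follows. It assumes, as the notation in the abstract suggests, that only rotations are merged and reverse complements are not. The function name is hypothetical.

```python
def canonical_motif(motif):
    """Return the alphabetically smallest cyclic rotation of a repeat unit.

    Only rotations are merged; reverse complements are kept separate,
    which is an assumption based on the groupings shown in the abstract.
    """
    motif = motif.upper()
    rotations = [motif[i:] + motif[:i] for i in range(len(motif))]
    return min(rotations)


# Group observed repeat units by their canonical form.
observed = ["TA", "AT", "GA", "ATA", "TAA", "AGA", "AAG", "GAA"]
groups = {}
for unit in observed:
    groups.setdefault(canonical_motif(unit), []).append(unit)
```

Counting SSRs per canonical motif in this way avoids splitting one repeat class across several labels when motif frequencies are tallied.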
Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

PubMed

Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

2002-12-01

The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the NTS revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of a microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.
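Nucleotide diversity (pi), the statistic compared between the two spacer classes, is commonly computed as the mean proportion of differing sites over all pairs of aligned sequences. The sketch below assumes that simple unweighted definition and treats a gap character like any other symbol. The abstract does not describe how the authors handled gaps or weighting, and the function name and example alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Mean pairwise proportion of differing sites among aligned sequences.

    Assumes equal-length (aligned) sequences and equal weight per sequence.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same nonzero length")
    pair_diffs = [
        sum(a != b for a, b in zip(s1, s2)) / length
        for s1, s2 in combinations(aligned_seqs, 2)
    ]
    return sum(pair_diffs) / len(pair_diffs)


# Hypothetical alignment of three 10-bp clones (illustration only).
example_pi = nucleotide_diversity(["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"])
```

A low value means the clones of a class are nearly identical to one another, which is the sense in which the class 1 spacers are described as highly similar.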
Hierarchical modeling of genome-wide Short Tandem Repeat (STR) markers infers native American prehistory.

PubMed

Lewis, Cecil M

2010-02-01

This study examines a genome-wide dataset of 678 Short Tandem Repeat loci characterized in 444 individuals representing 29 Native American populations as well as the Tundra Nentsi and Yakut populations from Siberia. Using these data, the study tests four current hypotheses regarding the hierarchical distribution of neutral genetic variation in native South American populations: (1) the western region of South America harbors more variation than the eastern region of South America, (2) Central American and western South American populations cluster exclusively, (3) populations speaking the Chibchan-Paezan and Equatorial-Tucanoan language stock emerge as a group within an otherwise South American clade, (4) Chibchan-Paezan populations in Central America emerge together at the tips of the Chibchan-Paezan cluster. This study finds that hierarchical models with the best fit place Central American populations, and populations speaking the Chibchan-Paezan language stock, at a basal position or separated from the South American group, which is more consistent with a serial founder effect into South America than that previously described. Western (Andean) South America is found to harbor levels of variation similar to those of eastern (Equatorial-Tucanoan and Ge-Pano-Carib) South America, which is inconsistent with an initial west coast migration into South America. Moreover, in all relevant models, the estimates of genetic diversity within geographic regions suggest a major bottleneck or founder effect occurring within the North American subcontinent, before the peopling of Central and South America. © 2009 Wiley-Liss, Inc.
Transcription of tandemly repetitive DNA: functional roles.

PubMed

Biscotti, Maria Assunta; Canapa, Adriana; Forconi, Mariko; Olmo, Ettore; Barucca, Marco

2015-09-01

A considerable fraction of the eukaryotic genome is made up of satellite DNA constituted of tandemly repeated sequences. These elements are mainly located at centromeres, pericentromeres, and telomeres and are major components of constitutive heterochromatin. Although originally satellite DNA was thought silent and inert, an increasing number of studies are providing evidence on its transcriptional activity supporting, on the contrary, an unexpected dynamicity. This review summarizes the multiple structural roles of satellite noncoding RNAs at chromosome level. Indeed, satellite noncoding RNAs play a role in the establishment of a heterochromatic state at centromere and telomere. These highly condensed structures are indispensable to preserve chromosome integrity and genome stability, preventing recombination events, and ensuring the correct chromosome pairing and segregation. Moreover, these RNA molecules seem to be involved also in maintaining centromere identity and in elongation, capping, and replication of telomere. Finally, the abnormal variation of centromeric and pericentromeric DNA transcription across major eukaryotic lineages in stress condition and disease has evidenced the critical role that these transcripts may play and the potentially dire consequences for the organism.
PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo.

PubMed

Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar

2002-02-01

The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 × 10^5 to 4 × 10^5, and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.
The SIDER2 elements, interspersed repeated sequences that populate the Leishmania genomes, constitute subfamilies showing chromosomal proximity relationship.

PubMed

Requena, Jose M; Folgueira, Cristina; López, Manuel C; Thomas, M Carmen

2008-06-02

Protozoan parasites of the genus Leishmania are causative agents of a diverse spectrum of human diseases collectively known as leishmaniasis. These eukaryotic pathogens that diverged early from the main eukaryotic lineage possess a number of unusual genomic, molecular and biochemical features. The completion of the genome projects for three Leishmania species has generated invaluable information enabling a direct analysis of genome structure and organization. By using DNA macroarrays, made with Leishmania infantum genomic clones and hybridized with total DNA from the parasite, we identified a clone containing a repeated sequence. An analysis of the recently completed genome sequence of L. infantum, using this repeated sequence as bait, led to the identification of a new class of repeated elements that are interspersed along the different L. infantum chromosomes. These elements turned out to be homologues of SIDER2 sequences, which were recently identified in the Leishmania major genome; thus, we adopted this nomenclature for the Leishmania elements described herein. Since SIDER2 elements are very heterogeneous in sequence, their precise identification is rather laborious. We have characterized 54 LiSIDER2 elements in chromosome 32 and 27 in chromosome 20. The mean size of these elements is 550 bp and their sequence is G+C rich (mean value of 66.5%). On the basis of sequence similarity, these elements can be grouped in subfamilies that show a remarkable relationship of proximity, i.e. SIDER2s of a given subfamily lie close together in a chromosomal region without intercalating elements. For comparative purposes, we have identified the SIDER2 elements existing in L. major and Leishmania braziliensis chromosomes 32. While SIDER2 elements are highly conserved both in number and location between L. infantum and L. major, no such conservation exists when comparing with SIDER2s in L. braziliensis chromosome 32. SIDER2 elements constitute a relevant piece in the Leishmania
A Large Population Genetic Study of 15 Autosomal Short Tandem Repeat Loci for Establishment of Korean DNA Profile Database

PubMed Central

Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha

2011-01-01

Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10⁻¹⁷. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications. PMID:21597912
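The discrimination power and combined matching probability quoted in this abstract can be illustrated with the textbook forensic definitions. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the per-locus matching probability is taken as the sum of squared genotype frequencies, the discrimination power as one minus that value, and loci are assumed independent so that per-locus values multiply. All function names and the toy frequencies are hypothetical.

```python
from math import prod


def matching_probability(genotype_freqs):
    """Per-locus matching probability: sum of squared genotype frequencies."""
    return sum(f * f for f in genotype_freqs)


def power_of_discrimination(genotype_freqs):
    """Probability that two random individuals differ at this locus."""
    return 1.0 - matching_probability(genotype_freqs)


def combined_matching_probability(per_locus_pm):
    """Product of per-locus matching probabilities (assumes independent loci)."""
    return prod(per_locus_pm)


if __name__ == "__main__":
    # Toy genotype frequencies for two hypothetical loci (not data from the study).
    locus_a = [0.25, 0.25, 0.25, 0.25]
    locus_b = [0.5, 0.3, 0.2]
    pms = [matching_probability(locus_a), matching_probability(locus_b)]
    print(power_of_discrimination(locus_a))
    print(combined_matching_probability(pms))
```

Under these definitions, a locus with a discrimination power of 0.9699 has a matching probability of 1 − 0.9699 = 0.0301, and multiplying such small values across 15 loci is what drives the combined figure to the order of 10⁻¹⁷.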
A large-scale dataset of single and mixed-source short tandem repeat profiles to inform human identification strategies: PROVEDIt.

PubMed

Alfonse, Lauren E; Garrett, Amanda D; Lun, Desmond S; Duffy, Ken R; Grgicak, Catherine M

2018-01-01

DNA-based human identity testing is conducted by comparison of PCR-amplified polymorphic Short Tandem Repeat (STR) motifs from a known source with the STR profiles obtained from uncertain sources. Samples such as those found at crime scenes often result in signal that is a composite of incomplete STR profiles from an unknown number of unknown contributors, making interpretation an arduous task. To facilitate advancement in STR interpretation challenges we provide over 25,000 multiplex STR profiles produced from one to five known individuals at target levels ranging from one to 160 copies of DNA. The data, generated under 144 laboratory conditions, are classified by total copy number and contributor proportions. For the 70% of samples that were synthetically compromised, we report the level of DNA damage using quantitative and end-point PCR. In addition, we characterize the complexity of the signal by exploring the number of detected alleles in each profile. Copyright © 2017 Elsevier B.V. All rights reserved.
A large population genetic study of 15 autosomal short tandem repeat loci for establishment of Korean DNA profile database.

PubMed

Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha

2011-07-01

Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10⁻¹⁷. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications.
Characterization of Escherichia coli O157:H7 in New Zealand using multiple-locus variable-number tandem-repeat analysis.

PubMed

Dyet, K H; Robertson, I; Turbitt, E; Carter, P E

2011-03-01

Recently, multiple-locus variable-number tandem-repeat analysis (MLVA) has been proposed as an alternative to pulsed-field gel electrophoresis (PFGE) for characterization of Escherichia coli O157:H7. In this study we characterized 118 E. coli O157:H7 isolates from cases of gastrointestinal disease in New Zealand using XbaI PFGE profiles and an MLVA scheme that assessed variability in eight polymorphic loci. The 118 isolates characterized included all 80 E. coli O157:H7 referred to New Zealand's Enteric Reference Laboratory in 2006 and 29 phage-type 2 isolates from 2005. When applied to these isolates the discriminatory power of PFGE and MLVA was not significantly different. However, MLVA data may be more epidemiologically relevant, as isolates from family clusters of disease had identical MLVA profiles even when the XbaI PFGE profiles differed slightly. Furthermore, most isolates with indistinguishable XbaI PFGE profiles that did not appear to be epidemiologically related had distinct MLVA profiles.
Analysis of Two Cosmid Clones from Chromosome 4 of Drosophila melanogaster Reveals Two New Genes Amid an Unusual Arrangement of Repeated Sequences

PubMed Central

Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross

1999-01-01

Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978
The solution structure of the pentatricopeptide repeat protein PPR10 upon binding atpH RNA

PubMed Central

Gully, Benjamin S.; Cowieson, Nathan; Stanley, Will A.; Shearston, Kate; Small, Ian D.; Barkan, Alice; Bond, Charles S.

2015-01-01

The pentatricopeptide repeat (PPR) protein family is a large family of RNA-binding proteins that is characterized by tandem arrays of a degenerate 35-amino-acid motif which form an α-solenoid structure. PPR proteins influence the editing, splicing, translation and stability of specific RNAs in mitochondria and chloroplasts. Zea mays PPR10 is amongst the best studied PPR proteins, where sequence-specific binding to two RNA transcripts, atpH and psaJ, has been demonstrated to follow a recognition code where the identity of two amino acids per repeat determines the base-specificity. A recently solved ZmPPR10:psaJ complex crystal structure suggested a homodimeric complex with considerably fewer sequence-specific protein–RNA contacts than inferred previously. Here we describe the solution structure of the ZmPPR10:atpH complex using size-exclusion chromatography-coupled synchrotron small-angle X-ray scattering (SEC-SY-SAXS). Our results support prior evidence that PPR10 binds RNA as a monomer, and that it does so in a manner that is commensurate with a canonical and predictable RNA-binding mode across much of the RNA–protein interface. PMID:25609698
Multicolor-based discrimination of 21 short tandem repeats and amelogenin using four fluorescent universal primers.

PubMed

Asari, Masaru; Okuda, Katsuhiro; Hoshina, Chisato; Omura, Tomohiro; Tasaki, Yoshikazu; Shiono, Hiroshi; Matsubara, Kazuo; Shimizu, Keiko

2016-02-01

The aim of this study was to develop a cost-effective genotyping method using high-quality DNA for human identification. A total of 21 short tandem repeats (STRs) and amelogenin were selected, and fluorescent fragments at 22 loci were simultaneously amplified in a single-tube reaction using locus-specific primers with 24-base universal tails and four fluorescent universal primers. Several nucleotide substitutions in universal tails and fluorescent universal primers enabled the detection of specific fluorescent fragments from the 22 loci. Multiplex polymerase chain reaction (PCR) produced intense FAM-, VIC-, NED-, and PET-labeled fragments ranging from 90 to 400 bp, and these fragments were discriminated using standard capillary electrophoretic analysis. The selected 22 loci were also analyzed using two commercial kits (the AmpFLSTR Identifiler Kit and the PowerPlex ESX 17 System), and results for two loci (D19S433 and D16S539) were discordant between these kits due to mutations at the primer binding sites. All genotypes from the 100 samples were determined using 2.5 ng of DNA by our method, and the expected alleles were completely recovered. Multiplex 22-locus genotyping using four fluorescent universal primers effectively reduces the costs to less than 20% of genotyping using commercial kits, and our method would be useful to detect silent alleles from commercial kit analysis. Copyright © 2015 Elsevier Inc. All rights reserved.
Complexity: an internet resource for analysis of DNA sequence complexity

PubMed Central

Orlov, Y. L.; Potapov, V. N.

2004-01-01

The search for DNA regions with low complexity is one of the pivotal tasks of modern structural analysis of complete genomes. The low complexity may be preconditioned by strong inequality in nucleotide content (biased composition), by tandem or dispersed repeats or by palindrome-hairpin structures, as well as by a combination of all these factors. Several numerical measures of textual complexity, including combinatorial and linguistic ones, together with complexity estimation using a modified Lempel–Ziv algorithm, have been implemented in a software tool called ‘Complexity’ (http://wwwmgs.bionet.nsc.ru/mgs/programs/low_complexity/). The software enables a user to search for low-complexity regions in long sequences, e.g. complete bacterial genomes or eukaryotic chromosomes. In addition, it estimates the complexity of groups of aligned sequences. PMID:15215465
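As a rough illustration of Lempel–Ziv-style complexity estimation in a sliding window, the sketch below counts phrases in a plain LZ78-style parse. This is an assumption made for illustration only: the abstract refers to a modified Lempel–Ziv algorithm whose details are not given here, and the function names, window size and step are hypothetical. Lower phrase counts indicate more repetitive (lower-complexity) windows.

```python
def lz78_phrase_count(seq):
    """Count phrases in an LZ78-style parse of seq.

    Each phrase is the shortest prefix of the remaining text that has not
    already been emitted as a phrase. Repetitive text yields fewer phrases.
    """
    phrases = set()
    current = ""
    count = 0
    for ch in seq:
        current += ch
        if current not in phrases:
            phrases.add(current)
            count += 1
            current = ""
    if current:
        count += 1  # trailing fragment that matched an earlier phrase
    return count


def windowed_complexity(seq, window=20, step=10):
    """Return (start, phrase_count) for each window along seq."""
    last_start = max(len(seq) - window + 1, 1)
    return [
        (start, lz78_phrase_count(seq[start:start + window]))
        for start in range(0, last_start, step)
    ]


if __name__ == "__main__":
    # A homopolymer run followed by a mixed-composition stretch (toy sequence).
    toy = "A" * 20 + "ACGTTGCAGTCATGCAATCG"
    for start, score in windowed_complexity(toy, window=20, step=20):
        print(start, score)
```

Raw phrase counts depend on window length, so in practice they would be compared only between windows of equal size or normalized first.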

Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

Treesearch

M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

2009-01-01

The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...
Evaluation of a highly discriminating multiplex multi-locus variable-number of tandem-repeats (MLVA) analysis for Vibrio cholerae.

PubMed

Olsen, Jaran S; Aarskaug, Tone; Skogan, Gunnar; Fykse, Else Marie; Ellingsen, Anette Bauer; Blatny, Janet M

2009-09-01

Vibrio cholerae is the etiological agent of cholera and may be used in bioterror actions because it is easily disseminated and because of public fear of acquiring cholera. A simple and highly discriminating method for connecting clinical and environmental isolates of V. cholerae is needed in microbial forensics. Twelve different loci containing variable numbers of tandem repeats (VNTRs) were evaluated, of which six were polymorphic. Two multiplex reactions containing PCR primers targeting these six VNTRs resulted in successful DNA amplification of 142 environmental and clinical V. cholerae isolates. The genetic distribution within the V. cholerae strain collection was used to evaluate the discriminating power (Simpson's Diversity Index = 0.99) of this new MLVA analysis, showing that the assay has the potential to differentiate between various strains and also to identify isolates collected from a common V. cholerae outbreak. This work has established a rapid and highly discriminating MLVA assay useful for trace-back analyses and/or forensic studies of V. cholerae infections.
Crystal structures of ryanodine receptor SPRY1 and tandem-repeat domains reveal a critical FKBP12 binding determinant

NASA Astrophysics Data System (ADS)

Yuchi, Zhiguang; Yuen, Siobhan M. Wong King; Lau, Kelvin; Underhill, Ainsley Q.; Cornea, Razvan L.; Fessenden, James D.; van Petegem, Filip

2015-08-01

Ryanodine receptors (RyRs) form calcium release channels located in the membranes of the sarcoplasmic and endoplasmic reticulum. RyRs play a major role in excitation-contraction coupling and other Ca2+-dependent signalling events, and consist of several globular domains that together form a large assembly. Here we describe the crystal structures of the SPRY1 and tandem-repeat domains at 1.2-1.5 Å resolution, which reveal several structural elements not detected in recent cryo-EM reconstructions of RyRs. The cryo-EM studies disagree on the position of SPRY domains, which had been proposed based on homology modelling. Computational docking of the crystal structures, combined with FRET studies, show that the SPRY1 domain is located next to FK506-binding protein (FKBP). Molecular dynamics flexible fitting and mutagenesis experiments suggest a hydrophobic cluster within SPRY1 that is crucial for FKBP binding. A RyR1 disease mutation, N760D, appears to directly impact FKBP binding through interfering with SPRY1 folding.
Genotyping and Molecular Identification of Date Palm Cultivars Using Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Ayesh, Basim M

2017-01-01

Molecular markers are reliable tools for discriminating genotypes and estimating the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal highly polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA of reliable quality and quantity from the target cultivars. The extracted DNA then serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by counting the mono- and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and perform cluster analysis to determine the molecular relatedness of cultivars. In the protocol described, 3 out of 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, 28 of which were polymorphic.
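The pairwise similarity and distance matrices mentioned in this abstract can be sketched from band presence/absence profiles. The abstract does not state which similarity coefficient was used, so the Jaccard coefficient below is an assumption chosen for illustration; the cultivar names and band profiles are hypothetical toy data.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 1.0


def distance_matrix(profiles):
    """Pairwise genetic distance (1 - Jaccard similarity) between samples.

    profiles maps a sample name to its list of 0/1 band scores.
    """
    names = list(profiles)
    return {
        (p, q): 1.0 - jaccard_similarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


if __name__ == "__main__":
    # Hypothetical band scores (1 = band present) for three cultivars.
    bands = {
        "cultivar_A": [1, 1, 0, 1, 0],
        "cultivar_B": [1, 0, 0, 1, 0],
        "cultivar_C": [0, 1, 1, 0, 1],
    }
    dist = distance_matrix(bands)
    print(dist[("cultivar_A", "cultivar_B")])
    print(dist[("cultivar_A", "cultivar_C")])
```

A distance matrix of this kind is the usual input to clustering methods such as UPGMA or neighbor joining when building the tree.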
Analysis of the 9p21.3 sequence associated with coronary artery disease reveals a tendency for duplication in a CAD patient

PubMed Central

Kouprina, Natalay; Noskov, Vladimir N.; Waterfall, Joshua J.; Walker, Robert L.; Meltzer, Paul S.; Topol, Eric J.; Larionov, Vladimir

2018-01-01

Tandem segmental duplications (SDs) greater than 10 kb are widespread in complex genomes. They provide material for gene divergence and evolutionary adaptation, while formation of specific de novo SDs is a hallmark of cancer and some human diseases. Most SDs map to distinct genomic regions termed ‘duplication blocks’. SD organization within these blocks is often poorly characterized as they are mosaics of ancestral duplicons juxtaposed with younger duplicons arising from more recent duplication events. Structural and functional analysis of SDs is further hampered as long repetitive DNA structures are underrepresented in existing BAC and YAC libraries. We applied Transformation-Associated Recombination (TAR) cloning, a versatile technique for large DNA manipulation, to selectively isolate the coronary artery disease (CAD) interval sequence within the 9p21.3 chromosome locus from a patient with coronary artery disease and normal individuals. Four tandem head-to-tail duplicons, each ∼50 kb long, were recovered in the patient but not in normal individuals. Sequence analysis revealed that the repeats differed from one another by 10-15 SNPs and from the human genome reference sequence (version hg19) by 82 SNPs. SNP polymorphism within the junctions between repeats allowed two junction types to be distinguished, Type 1 and Type 2, which were found at a 2:1 ratio. The junction sequences contained an Alu element, a sequence previously shown to play a role in duplication. Knowledge of structural variation in the CAD interval from more patients could help link this locus to cardiovascular disease susceptibility, and may be relevant to other cases of regional amplification, including cancer. PMID:29632643
Initial sequence and comparative analysis of the cat genome

PubMed Central

Pontius, Joan U.; Mullikin, James C.; Smith, Douglas R.; Lindblad-Toh, Kerstin; Gnerre, Sante; Clamp, Michele; Chang, Jean; Stephens, Robert; Neelam, Beena; Volfovsky, Natalia; Schäffer, Alejandro A.; Agarwala, Richa; Narfström, Kristina; Murphy, William J.; Giger, Urs; Roca, Alfred L.; Antunes, Agostinho; Menotti-Raymond, Marilyn; Yuhki, Naoya; Pecon-Slattery, Jill; Johnson, Warren E.; Bourque, Guillaume; Tesler, Glenn; O’Brien, Stephen J.

2007-01-01

The genome sequence (1.9-fold coverage) of an inbred Abyssinian domestic cat was assembled, mapped, and annotated with a comparative approach that involved cross-reference to annotated genome assemblies of six mammals (human, chimpanzee, mouse, rat, dog, and cow). The results resolved chromosomal positions for 663,480 contigs, 20,285 putative feline gene orthologs, and 133,499 conserved sequence blocks (CSBs). Additional annotated features include repetitive elements, endogenous retroviral sequences, nuclear mitochondrial (numt) sequences, micro-RNAs, and evolutionary breakpoints that suggest historic balancing of translocation and inversion incidences in distinct mammalian lineages. Large numbers of single nucleotide polymorphisms (SNPs), deletion insertion polymorphisms (DIPs), and short tandem repeats (STRs), suitable for linkage or association studies were characterized in the context of long stretches of chromosome homozygosity. In spite of the light coverage capturing ∼65% of euchromatin sequence from the cat genome, these comparative insights shed new light on the tempo and mode of gene/genome evolution in mammals, promise several research applications for the cat, and also illustrate that a comparative approach using more deeply covered mammals provides an informative, preliminary annotation of a light (1.9-fold) coverage mammal genome sequence. PMID:17975172
Allele frequency distribution for the variable number of tandem repeat locus D10S28 in Tamil Nadu (south India) population.

PubMed

Pandian, S K; Kumar, S; Krishnan, M; Dharmalingam, K; Damodaran, C

1995-09-01

Allele frequencies were determined in unrelated individuals of the Tamil-speaking population from the Madras City (Tamil Nadu, South India) area for the polymorphic DNA locus D10S28 using the probe TBQ7. Membranes hybridized with the probe YNH24 were subjected to deprobing and were subsequently hybridized with random-priming-labeled, purified inserts of TBQ7. The sizes of the fragments were grouped to 100 bp as well as to arbitrary fixed bins (Federal Bureau of Investigation / Royal Canadian Mounted Police). There were 14 bins in the latter, the most common being bin 11 (1789-1924 bp) with a frequency of 9.8%. We observed a heterozygosity of 92%, comparable to Caucasian populations. The data presented here can be used as the basis for utilizing this variable number of tandem repeats (VNTR) DNA marker for paternity determinations and forensic investigations.
Noninvasive Prenatal Paternity Testing (NIPAT) through Maternal Plasma DNA Sequencing: A Pilot Study.

PubMed

Jiang, Haojun; Xie, Yifan; Li, Xuchao; Ge, Huijuan; Deng, Yongqiang; Mu, Haofang; Feng, Xiaoli; Yin, Lu; Du, Zhou; Chen, Fang; He, Nongyue

2016-01-01

Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels in order to verify the performance in clinical cases. Combining targeted deep sequencing of selective SNP and informative bioinformatics pipeline, we calculated the combined paternity index (CPI) of 17 cases to determine paternity. Sequencing-based NIPAT results fully agreed with invasive prenatal paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

PubMed

Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

2015-09-18

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

DOE PAGES

Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

2015-07-22

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Multivalent binding of formin-binding protein 21 (FBP21)-tandem-WW domains fosters protein recognition in the pre-spliceosome.

PubMed

Klippel, Stefan; Wieczorek, Marek; Schümann, Michael; Krause, Eberhard; Marg, Berenice; Seidel, Thorsten; Meyer, Tim; Knapp, Ernst-Walter; Freund, Christian

2011-11-04

The high abundance of repetitive but nonidentical proline-rich sequences in spliceosomal proteins raises the question of how these known interaction motifs recruit their interacting protein domains. Whereas complex formation of these adaptors with individual motifs has been studied in great detail, little is known about the binding mode of domains arranged in tandem repeats and long proline-rich sequences including multiple motifs. Here we studied the interaction of the two adjacent WW domains of spliceosomal protein FBP21 with several ligands of different lengths and composition to elucidate the hallmarks of multivalent binding for this class of recognition domains. First, we show that many of the proteins that define the cellular proteome interacting with FBP21-WW1-WW2 contain multiple proline-rich motifs. Among these is the newly identified binding partner SF3B4. Fluorescence resonance energy transfer (FRET) analysis reveals that the tandem-WW domains of FBP21 interact with splicing factor 3B4 (SF3B4) in nuclear speckles, where splicing takes place. Isothermal titration calorimetry and NMR show that the tandem arrangement of WW domains and the multivalency of the proline-rich ligands both contribute to affinity enhancement. However, ligand exchange remains fast compared with the NMR time scale. Surprisingly, an N-terminal spin label attached to a bivalent ligand induces NMR line broadening of signals corresponding to both WW domains of the FBP21-WW1-WW2 protein. This suggests that distinct orientations of the ligand contribute to a delocalized and semispecific binding mode that should facilitate search processes within the spliceosome.
Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tuskan, Gerald A; Gunter, Lee E; DiFazio, Stephen P

The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones selected from 2 linkage groups based on genome sequence assembly (LG-I and LG-VI) were localized on 2 chromosomes, as expected. BACs from LG-I hybridized to the longest chromosome in the complement. All BAC positions were found to be concordant with sequence assembly positions. BAC-FISH will be useful for delineating each of the Populus trichocarpa chromosomes and improving the sequence assembly of this model angiosperm tree species.
Physical organisation of simple sequence repeats (SSRs) in Triticeae: structural, functional and evolutionary implications.

PubMed

Cuadrado, A; Cardoso, M; Jouve, N

2008-01-01

A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs) or microsatellites. This type of sequence has sparked great interest as a means of studying genetic variation, linkage mapping, gene tagging and evolution. Although SSRs at different positions in a gene help determine the regulation of expression and the function of the protein produced, little attention has been paid to the chromosomal organisation and distribution of these sequences, even in model species. This review discusses the main achievements in the characterisation of long-range SSR organisation in the chromosomes of Triticum aestivum L., Secale cereale L., and Hordeum vulgare L. (all members of Triticeae). We have detected SSRs using an improved FISH technique based on the random primer labelling of synthetic oligonucleotides (15-24 bases) in multi-colour experiments. Detailed information on the presence and distribution of AC, AG and all the possible classes of trinucleotide repeats has been acquired. These data have revealed the motif-dependent and non-random chromosome distributions of SSRs in the different genomes, and allowed the correlation of particular SSRs with chromosome areas characterised by specific features (e.g., heterochromatin, euchromatin and centromeres) in all three species. The present review provides a detailed comparative study of the distribution of these SSRs in each of the seven chromosomes of the genomes A, B and D of wheat, H of barley and R of rye. The importance of SSRs in plant breeding and their possible role in chromosome structure, function and evolution is discussed. 2008 S. Karger AG, Basel
Complete mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae).

PubMed

Kim, Min Jee; Im, Hyun Hwak; Lee, Kwang Youll; Han, Yeon Soo; Kim, Iksoo

2014-06-01

The complete nucleotide sequence of the mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae), was determined. The 20,319-bp circular genome is the longest among completely sequenced Coleoptera. As is typical in animals, the P. brevitarsis genome consists of two ribosomal RNAs, 22 transfer RNAs, 13 protein-coding genes and one A + T-rich region. Although the size of the coding genes was typical, the non-coding A + T-rich region was 5654 bp, which is the longest in insects. The extraordinary length of this region is due to 28 117-bp tandem repeats and seven 82-bp tandem repeats (28 × 117 bp + 7 × 82 bp = 3850 bp). These repeat sequences were encompassed by three non-repeat sequences constituting 1804 bp.
Multiple-locus variable-number tandem-repeat analysis of the swine dysentery pathogen, Brachyspira hyodysenteriae.

PubMed

Hidalgo, Alvaro; Carvajal, Ana; La, Tom; Naharro, Germán; Rubio, Pedro; Phillips, Nyree D; Hampson, David J

2010-08-01

The spirochete Brachyspira hyodysenteriae is the causative agent of swine dysentery, a severe colonic infection of pigs that has a considerable economic impact in many swine-producing countries. In spite of its importance, knowledge about the global epidemiology and population structure of B. hyodysenteriae is limited. Progress in this area has been hampered by the lack of a low-cost, portable, and discriminatory method for strain typing. The aim of the current study was to develop and test a multiple-locus variable-number tandem-repeat analysis (MLVA) method that could be used in basic veterinary diagnostic microbiology laboratories equipped with PCR technology or in more advanced laboratories with access to capillary electrophoresis. Based on eight loci, and when performed on isolates from different farms in different countries, as well as type and reference strains, the MLVA technique developed was highly discriminatory (Hunter and Gaston discriminatory index, 0.938 [95% confidence interval, 0.9175 to 0.9584]) while retaining a high phylogenetic value. Using the technique, the species was shown to be diverse (44 MLVA types from 172 isolates and strains), although isolates were stable in herds over time. The population structure appeared to be clonal. The finding of B. hyodysenteriae MLVA type 3 in piggeries in three European countries, as well as other, related, strains in different countries, suggests that spreading of the pathogen via carrier pigs is likely. MLVA overcame drawbacks associated with previous typing techniques for B. hyodysenteriae and was a powerful method for epidemiologic and population structure studies on this important pathogenic spirochete.
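The Hunter and Gaston discriminatory index quoted in this abstract is the probability that two isolates drawn at random, without replacement, belong to different types. A minimal sketch of that calculation follows; the function name and the toy type assignments are hypothetical, and the confidence interval reported in the abstract is not reproduced here.

```python
from collections import Counter


def hunter_gaston_index(type_labels):
    """Discriminatory index D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)).

    type_labels holds one typing result (e.g. an MLVA type) per isolate;
    n_j is the number of isolates of type j and N is the total number.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    same_type_pairs = sum(n * (n - 1) for n in counts)
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


if __name__ == "__main__":
    # Toy data: six isolates assigned to four hypothetical MLVA types.
    isolates = ["t1", "t1", "t2", "t3", "t3", "t4"]
    print(round(hunter_gaston_index(isolates), 4))
```

A value of 1 means every isolate has a unique type, and 0 means all isolates share one type.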
Tandem betatron

DOEpatents

Keinigs, Rhonald K.

1992-01-01

Two betatrons are provided in tandem for alternately accelerating an electron beam to avoid the single flux swing limitation of conventional betatrons and to accelerate the electron beam to high energies. The electron beam is accelerated in a first betatron during a period of increasing magnetic flux. The electron beam is extracted from the first betatron as a peak magnetic flux is reached and then injected into a second betatron at a time of minimum magnetic flux in the second betatron. The cycle may be repeated until the desired electron beam energy is obtained. In one embodiment, the second betatron is axially offset from the first betatron to provide for electron beam injection directly at the axial location of the beam orbit in the second betatron.
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome

PubMed Central

Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O.; Alawad, Abdullah O.; Al-Sadi, Abdullah M.; Hu, Songnian; Yu, Jun

2016-01-01

Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in the tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653 bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower than that of the nuclear genome and the neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provide the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants. PMID:27736909
Complete Sequence and Analysis of Coconut Palm (Cocos nucifera) Mitochondrial Genome.

PubMed

Aljohi, Hasan Awad; Liu, Wanfei; Lin, Qiang; Zhao, Yuhui; Zeng, Jingyao; Alamer, Ali; Alanazi, Ibrahim O; Alawad, Abdullah O; Al-Sadi, Abdullah M; Hu, Songnian; Yu, Jun

2016-01-01

Coconut (Cocos nucifera L.), a member of the palm family (Arecaceae), is one of the most economically important crops in the tropics, serving as an important source of food, drink, fuel, medicine, and construction material. Here we report an assembly of the coconut (C. nucifera, Oman local Tall cultivar) mitochondrial (mt) genome based on next-generation sequencing data. This genome, 678,653 bp in length and 45.5% in GC content, encodes 72 proteins, 9 pseudogenes, 23 tRNAs, and 3 ribosomal RNAs. Within the assembly, we find that the chloroplast (cp) derived regions account for 5.07% of the total assembly length, including 13 proteins, 2 pseudogenes, and 11 tRNAs. The mt genome has a relatively large fraction of repeat content (17.26%), including both forward (tandem) and inverted (palindromic) repeats. Sequence variation analysis shows that the Ti/Tv ratio of the mt genome is lower than that of the nuclear genome and the neutral expectation. By combining public RNA-Seq data for coconut, we identify 734 RNA editing sites supported by at least two datasets. In summary, our data provide the second complete mt genome sequence in the family Arecaceae, essential for further investigations on mitochondrial biology of seed plants.
Effects of GABA(A) Modulators on the Repeated Acquisition of Response Sequences in Squirrel Monkeys

ERIC Educational Resources Information Center

Campbell, Una C.; Winsauer, Peter J.; Stevenson, Michael W.; Moerschbaecher, Joseph M.

2004-01-01

The present study investigated the effects of positive and negative GABA(A) modulators under three different baselines of repeated acquisition in squirrel monkeys in which the monkeys acquired a three-response sequence on three keys under a second-order fixed-ratio (FR) schedule of food reinforcement. In two of these baselines, the…
[Use of multiple locus variable number tandem repeats analysis for the Brucella systematization].

PubMed

Kulakov, Iu K; Kovalev, D A; Misetova, E N; Golovneva, S I; Liapustina, L V; Zheludkov, M M

2012-01-01

Methods for molecular-genetic differentiation to the strain level are becoming increasingly important in current brucellosis control. MLVA (multiple locus variable number tandem repeats analysis) was selected for differentiation to the strain level and for simultaneously establishing the genetic relationships of the Brucella strains investigated. The goal of this work was MLVA typing of strains of three pathogenic Brucella species, with analysis of the stability of the chosen loci, discriminatory power, and concordance with conventional phenotypic methods of Brucella differentiation, for use in the systematization of the causative agents of brucellosis. Twenty-six Brucella strains representing reference (n = 15), vaccine (n = 2) and field strains of three pathogenic Brucella species were tested: B. melitensis (n = 3), B. abortus (n = 2), B. suis (n = 2), and isolates (n = 2) of unidentified taxonomic position, using MLVA with 9 primer pairs targeting known variable loci of the Brucella genome. The stability of the chosen loci, the discriminatory power according to the Hunter-Gaston discrimination index (HGDI), and consistency with phenotypic identification methods were analyzed. MLVA confirmed the results of the phenotypic identification methods and the stability of the chosen loci in the majority of reference and vaccine strains, with a high variability index (HGDI 0.9969 for all loci). A dendrogram plotted from the MLVA data distributed the Brucella strains into related clusters according to their taxonomic species and biovar positions and resolved 25 genotypes. B. melitensis strains formed a cluster related to the reference strain B. melitensis 63/9 biovar 2. The Australian isolates Brucella 83-4 and Brucella 83-6, isolated from rodents, formed a cluster distant from the other Brucella strains. MLVA is a promising method for differentiating Brucella strains of known and unresolved taxonomic status, for their systematization, and for creating an MLVA genotype catalogue that
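As a consistency check on the reported index, and assuming it was computed over all 26 strains: 25 genotypes among 26 strains implies 24 genotypes observed once and one genotype shared by two strains. With N the number of strains and n_j the number of strains of genotype j, the Hunter-Gaston index then works out as:

```latex
D = 1 - \frac{1}{N(N-1)} \sum_{j} n_j (n_j - 1)
  = 1 - \frac{2 \cdot 1}{26 \cdot 25}
  = 1 - \frac{2}{650}
  \approx 0.9969
```

This agrees with the HGDI of 0.9969 quoted in the abstract.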

The upstream Variable Number Tandem Repeat polymorphism of the monoamine oxidase type A gene influences trigeminal pain-related evoked responses.

PubMed

Di Lorenzo, Cherubino; Daverio, Andrea; Pasqualetti, Patrizio; Coppola, Gianluca; Giannoudas, Ioannis; Barone, Ylenia; Grieco, Gaetano S; Niolu, Cinzia; Pascale, Esterina; Santorelli, Filippo M; Nicoletti, Ferdinando; Pierelli, Francesco; Siracusano, Alberto; Seri, Stefano; Di Lorenzo, Giorgio

2014-02-01

Monoamines have an important role in neural plasticity, a key factor in cortical pain processing that promotes changes in neuronal network connectivity. Monoamine oxidase type A (MAOA) is an enzyme that, due to its modulating role in monoaminergic activity, could play a role in cortical pain processing. The X-linked MAOA gene is characterized by an allelic variant of length, the MAOA upstream Variable Number Tandem Repeat (MAOA-uVNTR) region polymorphism. Two allelic variants of this gene are known, the high-activity MAOA (HAM) and low-activity MAOA (LAM). We investigated the role of MAOA-uVNTR in cortical pain processing in a group of healthy individuals measured by the trigeminal electric pain-related evoked potential (tPREP) elicited by repeated painful stimulation. A group of healthy volunteers was genotyped to detect MAOA-uVNTR polymorphism. Electrical tPREPs were recorded by stimulating the right supraorbital nerve with a concentric electrode. The N2 and P2 component amplitude and latency as well as the N2-P2 inter-peak amplitude were measured. The recording was divided into three blocks, each containing 10 consecutive stimuli and the N2-P2 amplitude was compared between blocks. Of the 67 volunteers, 37 were HAM and 30 were LAM. HAM subjects differed from LAM subjects in terms of amplitude of the grand-averaged and first-block N2-P2 responses (HAM>LAM). The N2-P2 amplitude decreased between the first and third block in HAM subjects but not LAM subjects. The MAOA-uVNTR polymorphism seemed to influence the brain response in a repeated tPREP paradigm and suggested a role of the MAOA as a modulator of neural plasticity related to cortical pain processing. © 2014 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
First Worldwide Proficiency Study on Variable-Number Tandem-Repeat Typing of Mycobacterium tuberculosis Complex Strains

PubMed Central

de Beer, Jessica L.; Kremer, Kristin; Ködmön, Csaba; Supply, Philip

2012-01-01

Although variable-number tandem-repeat (VNTR) typing has gained recognition as the new standard for the DNA fingerprinting of Mycobacterium tuberculosis complex (MTBC) isolates, external quality control programs have not yet been developed. Therefore, we organized the first multicenter proficiency study on 24-locus VNTR typing. Sets of 30 DNAs of MTBC strains, including 10 duplicate DNA samples, were distributed among 37 participating laboratories in 30 different countries worldwide. Twenty-four laboratories used an in-house-adapted method with fragment sizing by gel electrophoresis or an automated DNA analyzer, nine laboratories used a commercially available kit, and four laboratories used other methods. The intra- and interlaboratory reproducibilities of VNTR typing varied from 0% to 100%, with averages of 72% and 60%, respectively. Twenty of the 37 laboratories failed to amplify particular VNTR loci; if these missing results were ignored, the number of laboratories with 100% interlaboratory reproducibility increased from 1 to 5. The average interlaboratory reproducibility of VNTR typing using a commercial kit was better (88%) than that of in-house-adapted methods using a DNA analyzer (70%) or gel electrophoresis (50%). Eleven laboratories using in-house-adapted manual typing or automated typing scored inter- and intralaboratory reproducibilities of 80% or higher, which suggests that these approaches can be used in a reliable way. In conclusion, this first multicenter study has documented the worldwide quality of VNTR typing of MTBC strains and highlights the importance of international quality control to improve genotyping in the future. PMID:22170917
Repeatability and Reproducibility in Proteomic Identifications by Liquid Chromatography—Tandem Mass Spectrometry

PubMed Central

Tabb, David L.; Vega-Montoto, Lorenzo; Rudnick, Paul A.; Variyath, Asokan Mulayath; Ham, Amy-Joan L.; Bunk, David M.; Kilpatrick, Lisa E.; Billheimer, Dean D.; Blackman, Ronald K.; Cardasis, Helene L.; Carr, Steven A.; Clauser, Karl R.; Jaffe, Jacob D.; Kowalski, Kevin A.; Neubert, Thomas A.; Regnier, Fred E.; Schilling, Birgit; Tegeler, Tony J.; Wang, Mu; Wang, Pei; Whiteaker, Jeffrey R.; Zimmerman, Lisa J.; Fisher, Susan J.; Gibson, Bradford W.; Kinsinger, Christopher R.; Mesri, Mehdi; Rodriguez, Henry; Stein, Steven E.; Tempst, Paul; Paulovich, Amanda G.; Liebler, Daniel C.; Spiegelman, Cliff

2009-01-01

The complexity of proteomic instrumentation for LC-MS/MS introduces many possible sources of variability. Data-dependent sampling of peptides constitutes a stochastic element at the heart of discovery proteomics. Although this variation impacts the identification of peptides, proteomic identifications are far from completely random. In this study, we analyzed interlaboratory data sets from the NCI Clinical Proteomic Technology Assessment for Cancer to examine repeatability and reproducibility in peptide and protein identifications. Included data spanned 144 LC-MS/MS experiments on four Thermo LTQ and four Orbitrap instruments. Samples included yeast lysate, the NCI-20 defined dynamic range protein mix, and the Sigma UPS 1 defined equimolar protein mix. Some of our findings reinforced conventional wisdom, such as repeatability and reproducibility being higher for proteins than for peptides. Most lessons from the data, however, were more subtle. Orbitraps proved capable of higher repeatability and reproducibility, but aberrant performance occasionally erased these gains. Even the simplest protein digestions yielded more peptide ions than LC-MS/MS could identify during a single experiment. We observed that peptide lists from pairs of technical replicates overlapped by 35–60%, giving a range for peptide-level repeatability in these experiments. Sample complexity did not appear to affect peptide identification repeatability, even as numbers of identified spectra changed by an order of magnitude. Statistical analysis of protein spectral counts revealed greater stability across technical replicates for Orbitraps, making them superior to LTQ instruments for biomarker candidate discovery. The most repeatable peptides were those corresponding to conventional tryptic cleavage sites, those that produced intense MS signals, and those that resulted from proteins generating many distinct peptides. Reproducibility among different instruments of the same type lagged behind
Complete chloroplast genome and 45S nrDNA sequences of the medicinal plant species Glycyrrhiza glabra and Glycyrrhiza uralensis.

PubMed

Kang, Sang-Ho; Lee, Jeong-Hoon; Lee, Hyun Oh; Ahn, Byoung Ohg; Won, So Youn; Sohn, Seong-Han; Kim, Jung Sun

2017-10-06

Glycyrrhiza uralensis and G. glabra, members of the Fabaceae, are medicinally important species that are native to Asia and Europe. Extracts from these plants are widely used as natural sweeteners because of their much greater sweetness than sucrose. In this study, the three complete chloroplast genomes and five 45S nuclear ribosomal (nr)DNA sequences of these two licorice species and an interspecific hybrid are presented. The chloroplast genomes of G. glabra, G. uralensis and G. glabra × G. uralensis were 127,895 bp, 127,716 bp and 127,939 bp, respectively. The three chloroplast genomes harbored 110 annotated genes, including 76 protein-coding genes, 30 tRNA genes and 4 rRNA genes. The 45S nrDNA sequences were either 5,947 or 5,948 bp in length. Glycyrrhiza glabra and G. glabra × G. uralensis showed two types of nrDNA, while G. uralensis contained a single type. The complete 45S nrDNA sequence unit contains 18S rRNA, ITS1, 5.8S rRNA, ITS2 and 26S rRNA. We identified simple sequence repeat and tandem repeat sequences. We also developed four reliable markers for analysis of Glycyrrhiza diversity authentication.
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.).

PubMed

Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A

2015-10-26

Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
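The step of scanning assembled sequences for SSRs can be illustrated with a regular-expression search for perfect repeats. This is only a sketch under stated assumptions: the abstract does not name the tool or the filtering criteria used, so the function name `find_ssrs` and the minimum copy numbers below are hypothetical.

```python
import re

# Assumed minimum copy numbers per motif length (illustrative thresholds only).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(sequence):
    """Return sorted (start, motif, copies) tuples for perfect SSRs with 2-6 bp motifs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_copies in MIN_REPEATS.items():
        # One captured motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AGAG").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical test sequence containing (AG)8 and (CTT)5.
demo = "GGC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "GA"
print(find_ssrs(demo))  # [(3, 'AG', 8), (23, 'CTT', 5)]
```

Start positions are 0-based. Imperfect or compound repeats are not detected by this pattern; dedicated SSR-mining tools handle those cases.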
Structural basis of DNA sequence recognition by the response regulator PhoP in Mycobacterium tuberculosis.

PubMed

He, Xiaoyuan; Wang, Liqin; Wang, Shuishu

2016-04-15

The transcriptional regulator PhoP is an essential virulence factor in Mycobacterium tuberculosis, and it presents a target for the development of new anti-tuberculosis drugs and attenuated tuberculosis vaccine strains. PhoP binds to DNA as a highly cooperative dimer by recognizing direct repeats of 7-bp motifs with a 4-bp spacer. To elucidate the PhoP-DNA binding mechanism, we determined the crystal structure of the PhoP-DNA complex. The structure revealed a tandem PhoP dimer that bound to the direct repeat. The surprising tandem arrangement of the receiver domains allowed the four domains of the PhoP dimer to form a compact structure, accounting for the strict requirement of a 4-bp spacer and the highly cooperative binding of the dimer. The PhoP-DNA interactions exclusively involved the effector domain. The sequence-recognition helix made contact with the bases of the 7-bp motif in the major groove, and the wing interacted with the adjacent minor groove. The structure provides a starting point for the elucidation of the mechanism by which PhoP regulates the virulence of M. tuberculosis and guides the design of screening platforms for PhoP inhibitors.
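The binding-site architecture described above, two copies of a 7-bp motif separated by a 4-bp spacer, can be searched for in a sequence with a sliding window. The sketch below is a generic direct-repeat finder, not the authors' method; the function name, the `max_mismatches` parameter and the example motif are hypothetical and do not represent the actual PhoP recognition sequence.

```python
MOTIF_LEN = 7   # motif length stated in the abstract
SPACER_LEN = 4  # spacer length stated in the abstract


def find_direct_repeats(sequence, max_mismatches=0):
    """Return (start, first_copy, second_copy) for 7-bp motifs repeated after a 4-bp spacer."""
    sequence = sequence.upper()
    span = 2 * MOTIF_LEN + SPACER_LEN
    hits = []
    for start in range(len(sequence) - span + 1):
        first = sequence[start:start + MOTIF_LEN]
        second = sequence[start + MOTIF_LEN + SPACER_LEN:start + span]
        mismatches = sum(a != b for a, b in zip(first, second))
        if mismatches <= max_mismatches:
            hits.append((start, first, second))
    return hits


# Hypothetical sequence: an arbitrary 7-mer repeated after a 4-bp spacer.
demo = "CC" + "GATTACA" + "TTGG" + "GATTACA" + "CC"
print(find_direct_repeats(demo))  # [(2, 'GATTACA', 'GATTACA')]
```

Allowing mismatches also reports overlapping windows that are shifted by one base, so hits from a relaxed search usually need to be merged before interpretation.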
Independent movement, dimerization and stability of tandem repeats of chicken brain alpha-spectrin

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kusunoki, H.; Minasov, G.; Macdonald, R.I.

Previous X-ray crystal structures have shown that linkers of five amino acid residues connecting pairs of chicken brain α-spectrin and human erythroid β-spectrin repeats can undergo bending without losing their α-helical structure. To test whether bending at one linker can influence bending at an adjacent linker, the structures of two- and three-repeat fragments of chicken brain α-spectrin have been determined by X-ray crystallography. The structure of the three-repeat fragment clearly shows that bending at one linker can occur independently of bending at an adjacent linker. This observation increases the possible trajectories of modeled chains of spectrin repeats. Furthermore, the three-repeat molecule crystallized as an antiparallel dimer with a significantly smaller buried interfacial area than that of α-actinin, a spectrin-related molecule, but large enough and of a type indicating biological specificity. Comparison of the structures of the spectrin and α-actinin dimers supports weak association of the former, which could not be detected by analytical ultracentrifugation, versus strong association of the latter, which has been observed by others. To correlate features of the structure with solution properties and to test a previous model of stable spectrin and dystrophin repeats, the number of inter-helical interactions in each repeat of several spectrin structures was counted and compared to their thermal stabilities. Inter-helical interactions, but not all interactions, increased in parallel with measured thermal stabilities of each repeat and in agreement with the thermal stabilities of two and three repeats and also partial repeats of spectrin.
A Mitochondrial Genome of Rhyparochromidae (Hemiptera: Heteroptera) and a Comparative Analysis of Related Mitochondrial Genomes.

PubMed

Li, Teng; Yang, Jie; Li, Yinwan; Cui, Ying; Xie, Qiang; Bu, Wenjun; Hillis, David M

2016-10-19

The Rhyparochromidae, the largest family of Lygaeoidea, encompasses more than 1,850 described species, but no mitochondrial genome has been sequenced to date. Here we describe the first mitochondrial genome for Rhyparochromidae: a complete mitochondrial genome of Panaorus albomaculatus (Scott, 1874). This mitochondrial genome is comprised of 16,345 bp, and contains the expected 37 genes and control region. The majority of the control region is made up of a large tandem-repeat region, which has a novel pattern not previously observed in other insects. The tandem-repeats region of P. albomaculatus consists of 53 tandem duplications (including one partial repeat), which is the largest number of tandem repeats among all the known insect mitochondrial genomes. Slipped-strand mispairing during replication is likely to have generated this novel pattern of tandem repeats. Comparative analysis of tRNA gene families in sequenced Pentatomomorpha and Lygaeoidea species shows that the pattern of nucleotide conservation is markedly higher on the J-strand. Phylogenetic reconstruction based on mitochondrial genomes suggests that Rhyparochromidae is not the sister group to all the remaining Lygaeoidea, and supports the monophyly of Lygaeoidea.
New paradigm in ankyrin repeats: Beyond protein-protein interaction module.

PubMed

Islam, Zeyaul; Nagampalli, Raghavendra Sashi Krishna; Fatima, Munazza Tamkeen; Ashraf, Ghulam Md

2018-04-01

Classically, ankyrin repeat (ANK) proteins are built from tandem arrays of two or more repeats and form curved solenoid structures that are associated with protein-protein interactions. The ankyrin repeat is a short, widespread structural motif of around 33 amino acids that occurs in tandem, has a canonical helix-loop-helix fold, and is found alone or in combination with other domains. The multiplicity of structural patterns enables these repeats to form assemblies of diverse sizes, which underlies their ability to confer multiple binding and structural roles on proteins. Three-dimensional structures of these repeats determined to date reveal a degree of structural variability that translates into the considerable functional versatility of this protein superfamily. Recent work on ANK proteins has provided novel structural information, especially on protein-lipid, protein-sugar and protein-protein interactions. Self-assembly of these repeats was also shown to prevent the associated protein from forming filaments. In this review, we summarize the latest findings and how the new structural information has increased our understanding of the structural determinants of ANK proteins. We discuss the latest findings on how these proteins participate in various interactions to diversify the roles of ANK in numerous biological processes, and explore the emerging and evolving field of designer ankyrins and its framework for protein engineering, with an emphasis on biotechnological applications. Copyright © 2017 Elsevier B.V. All rights reserved.
Evaluating the Use of Multilocus Variable Number Tandem Repeat Analysis of Shiga Toxin-Producing Escherichia coli O157 as a Routine Public Health Tool in England

PubMed Central

Byrne, Lisa; Elson, Richard; Dallman, Timothy J.; Perry, Neil; Ashton, Philip; Wain, John; Adak, Goutam K.; Grant, Kathie A.; Jenkins, Claire

2014-01-01

Multilocus variable number tandem repeat analysis (MLVA) provides microbiological support for investigations of clusters of cases of infection with Shiga toxin-producing E. coli (STEC) O157. All confirmed STEC O157 isolated in England and submitted to the Gastrointestinal Bacteria Reference Unit (GBRU) during a six month period were typed using MLVA, with the aim of assessing the impact of this approach on epidemiological investigations. Of 539 cases investigated, 341 (76%) had unique (>2 single locus variants) MLVA profiles, 12% of profiles occurred more than once due to known household transmission and 12% of profiles occurred as part of 41 clusters, 21 of which were previously identified through routine public health investigation of cases. The remaining 20 clusters were not previously detected and STEC enhanced surveillance data for associated cases were retrospectively reviewed for epidemiological links including shared exposures, geography and/or time. Additional evidence of a link between cases was found in twelve clusters. Compared to phage typing, the number of sporadic cases was reduced from 69% to 41% and the diversity index for MLVA was 0.996 versus 0.782 for phage typing. Using MLVA generates more data on the spatial and temporal dispersion of cases, better defining the epidemiology of STEC infection than phage typing. The increased detection of clusters through MLVA typing highlights the challenges to health protection practices, providing a forerunner to the advent of whole genome sequencing as a diagnostic tool. PMID:24465775
Evaluating the use of multilocus variable number tandem repeat analysis of Shiga toxin-producing Escherichia coli O157 as a routine public health tool in England.

PubMed

Byrne, Lisa; Elson, Richard; Dallman, Timothy J; Perry, Neil; Ashton, Philip; Wain, John; Adak, Goutam K; Grant, Kathie A; Jenkins, Claire

2014-01-01

Multilocus variable number tandem repeat analysis (MLVA) provides microbiological support for investigations of clusters of cases of infection with Shiga toxin-producing E. coli (STEC) O157. All confirmed STEC O157 isolated in England and submitted to the Gastrointestinal Bacteria Reference Unit (GBRU) during a six month period were typed using MLVA, with the aim of assessing the impact of this approach on epidemiological investigations. Of 539 cases investigated, 341 (76%) had unique (>2 single locus variants) MLVA profiles, 12% of profiles occurred more than once due to known household transmission and 12% of profiles occurred as part of 41 clusters, 21 of which were previously identified through routine public health investigation of cases. The remaining 20 clusters were not previously detected and STEC enhanced surveillance data for associated cases were retrospectively reviewed for epidemiological links including shared exposures, geography and/or time. Additional evidence of a link between cases was found in twelve clusters. Compared to phage typing, the number of sporadic cases was reduced from 69% to 41% and the diversity index for MLVA was 0.996 versus 0.782 for phage typing. Using MLVA generates more data on the spatial and temporal dispersion of cases, better defining the epidemiology of STEC infection than phage typing. The increased detection of clusters through MLVA typing highlights the challenges to health protection practices, providing a forerunner to the advent of whole genome sequencing as a diagnostic tool.
[Analysis on genetic polymorphism of 5 STR loci selected from X chromosome].

PubMed

Liu, Qi-ji; Gong, Yao-qin; Zhang, Xi-yu; Gao, Gui-min; Li, Jiang-xia; Guo, Yi-shou

2005-02-01

To select short tandem repeats (STRs) from the X chromosome. The STR is a universal genetic marker in the human genome that is polymorphic and stably inherited. It is a specific DNA segment with a core sequence of 2-6 base pairs, and it is an ideal DNA marker for linkage analysis and gene mapping. In this study, 8 short tandem repeats were selected from two genomic clones on the X chromosome by using BCM Search Launcher. Primers amplifying the STR loci were designed by using Primer 3.0 according to the unique sequences flanking the STRs. Polymorphisms of the short tandem repeats in a Chinese population were evaluated by PCR amplification and PAGE. Five of these STRs were polymorphic. The chi-square test indicated that the distribution of genotypes agreed with Hardy-Weinberg equilibrium (P>0.05). Five polymorphic short tandem repeats have been identified on chromosome X and will be useful for linkage analysis and gene mapping.
DNA CTG triplet repeats involved in dynamic mutations of neurologically related gene sequences form stable duplexes

NASA Technical Reports Server (NTRS)

Smith, G. K.; Jie, J.; Fox, G. E.; Gao, X.

1995-01-01

DNA triplet repeats, 5'-d(CTG)n and 5'-d(CAG)n, are present in genes which have been implicated in several neurodegenerative disorders. To investigate possible stable structures formed by these repeating sequences, we have examined d(CTG)n, d(CAG)n and d(CTG).d(CAG)n (n = 2 and 3) using NMR and UV optical spectroscopy. These studies reveal that single stranded (CTG)n (n ≥ 2) forms stable, antiparallel helical duplexes, while the single stranded (CAG)n requires at least three repeating units to form a duplex. NMR and UV melting experiments show that the Tm increases in the order of [(CAG)3]2 < [(CTG)3]2 << (CAG)3.(CTG)3. The (CTG)3 duplex is stable and exhibits similar NMR spectra in solutions containing 0.1-4 M NaCl and at a pH range from 4.6 to 8.8. The (CTG)3 duplex, which contains multiple T.T mismatches, displays many NMR spectral characteristics similar to those of B-form DNA. However, unique NOE and 1H-31P coupling patterns associated with the repetitive T.T mismatches in the CTG repeats are discerned. These results, in conjunction with recent in vitro studies, suggest that longer CTG repeats may form hairpin structures, which can potentially cause interruption in replication, leading to dynamic expansion or deletion of triplet repeats.
Genetic characterization of UCS region of Pneumocystis jirovecii and construction of allelic profiles of Indian isolates based on sequence typing at three regions.

PubMed

Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Kumar, Lalit; Luthra, Kalpana; Agarwal, Sanjay Kumar; Sreenivas, Vishnubhatla

2013-01-01

Pneumocystis jirovecii is an opportunistic pathogen that causes severe pneumonia in immunocompromised patients. To study the genetic diversity of P. jirovecii in India the upstream conserved sequence (UCS) region of Pneumocystis genome was amplified, sequenced and genotyped from a set of respiratory specimens obtained from 50 patients with a positive result for nested mitochondrial large subunit ribosomal RNA (mtLSU rRNA) PCR during the years 2005-2008. Of these 50 cases, 45 showed a positive PCR for UCS region. Variations in the tandem repeats in UCS region were characterized by sequencing all the positive cases. Of the 45 cases, one case showed five repeats, 11 cases showed four repeats, 29 cases showed three repeats and four cases showed two repeats. By running amplified DNA from all these cases on a high-resolution gel, mixed infection was observed in 12 cases (26.7%, 12/45). Forty three of 45 cases included in this study had previously been typed at mtLSU rRNA and internal transcribed spacer (ITS) region by our group. In the present study, the genotypes at those two regions were combined with UCS repeat patterns to construct allelic profiles of 43 cases. A total of 36 allelic profiles were observed in 43 isolates indicating high genetic variability. A statistically significant association was observed between mtLSU rRNA genotype 1, ITS type Ea and UCS repeat pattern 4. Copyright © 2012 Elsevier B.V. All rights reserved.
Sequence of a second gene encoding bovine submaxillary mucin: implication for mucin heterogeneity and cloning.

PubMed

Jiang, W; Woitach, J T; Gupta, D; Bhavanandan, V P

1998-10-20

Secreted epithelial mucins are extremely large and heterogeneous glycoproteins. We report the 5 kilobase DNA sequence of a second gene, BSM2, which encodes bovine submaxillary mucin. The determined nucleotide and deduced amino acid sequences of BSM2 are 95.2% and 92.2% identical, respectively, to those of the previously described BSM1 gene isolated from the same cow. Further, the five predicted protein domains of the two genes are 100%, 94%, 93%, 77%, and 88% identical. Based on the above results, we propose that expression of multiple homologous core proteins from a single animal is a factor in generating diversity of saccharides in mucins and in providing resistance of the molecules to proteolysis. In addition, this work raises several important issues in mucin cloning such as assembling sequences from seemingly overlapping clones and deducing consensus sequences for nearly identical tandem repeats. Copyright 1998 Academic Press.
Rapid Identification of Laboratory Contamination with Mycobacterium tuberculosis Using Variable Number Tandem Repeat Analysis

PubMed Central

Gascoyne-Binzi, Deborah M.; Barlow, Rachael E. L.; Frothingham, Richard; Robinson, Grant; Collyns, Timothy A.; Gelletlie, Ruth; Hawkey, Peter M.

2001-01-01

Compared with solid media, broth-based mycobacterial culture systems have increased sensitivity but also have higher false-positive rates due to cross-contamination. Systematic strain typing is rarely undertaken because the techniques are technically demanding and the data are difficult to organize. Variable number tandem repeat (VNTR) analysis by PCR is rapid and reproducible. The digital profile is easily manipulated in a database. We undertook a retrospective study of Mycobacterium tuberculosis isolates collected over an 18-month period following the introduction of the BACTEC MGIT 960 system. VNTR allele profiles were determined with early positive broth cultures and entered into a database with the specimen processing date and other specimen data. We found 36 distinct VNTR profiles in cultures from 144 patients. Three common VNTR profiles accounted for 45% of true-positive cases. By combining VNTR results with specimen data, we identified nine cross-contamination incidents, six of which were previously unsuspected. These nine incidents resulted in 34 false-positive cultures for 29 patients. False-positive cultures were identified for three patients who had previously been culture positive for tuberculosis and were receiving treatment. Identification of cross-contamination incidents requires careful documentation of specimen data and good communication between clinical and laboratory staff. Automated broth culture systems should be supplemented with molecular analysis to identify cross-contamination events. VNTR analysis is reproducible and provides timely results when applied to early positive broth cultures. This method should ensure that patients are not placed on unnecessary tuberculosis therapy or that cases are not falsely identified as treatment failures. In addition, areas where existing procedures may be improved can be identified. PMID:11136751
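The abstract describes combining VNTR profiles with specimen processing dates to find cross-contamination. A minimal sketch of that idea follows, assuming a hypothetical record layout (`Culture`) and a simple rule: flag any VNTR profile that appears in cultures from two or more different patients processed on the same day. The field names, profile strings and the same-day rule are assumptions for illustration, not the authors' database schema or criteria.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Culture(NamedTuple):
    """One positive broth culture (hypothetical record layout)."""
    specimen_id: str
    patient_id: str
    processed: date      # specimen processing date
    vntr_profile: str    # digital VNTR allele profile, e.g. "42235"


def candidate_incidents(cultures):
    """Group cultures by (profile, processing date) and keep the groups
    that contain at least two different patients. These are candidates
    for review only, since common profiles can also match by chance."""
    groups = defaultdict(list)
    for culture in cultures:
        groups[(culture.vntr_profile, culture.processed)].append(culture)
    return {
        key: members
        for key, members in groups.items()
        if len({c.patient_id for c in members}) >= 2
    }
```

Because the study found that three common profiles accounted for 45% of true-positive cases, a profile match alone is weak evidence; that is why the grouping key here includes the processing date, and why flagged groups would still need review against clinical data.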
Formation and Repair of Mismatches Containing Ribonucleotides and Oxidized Bases at Repeated DNA Sequences

PubMed Central

Cilli, Piera; Minoprio, Anna; Bossa, Cecilia; Bignami, Margherita; Mazzei, Filomena

2015-01-01

The cellular pool of ribonucleotide triphosphates (rNTPs) is higher than that of deoxyribonucleotide triphosphates. To ensure genome stability, DNA polymerases must discriminate against rNTPs and incorporated ribonucleotides must be removed by ribonucleotide excision repair (RER). We investigated DNA polymerase β (POL β) capacity to incorporate ribonucleotides into trinucleotide repeated DNA sequences and the efficiency of base excision repair (BER) and RER enzymes (OGG1, MUTYH, and RNase H2) when presented with an incorrect sugar and an oxidized base. POL β incorporated rAMP and rCMP opposite 7,8-dihydro-8-oxoguanine (8-oxodG) and extended both mispairs. In addition, POL β was able to insert and elongate an oxidized rGMP when paired with dA. We show that RNase H2 always preserves the capacity to remove a single ribonucleotide when paired to an oxidized base or to incise an oxidized ribonucleotide in a DNA duplex. In contrast, BER activity is affected by the presence of a ribonucleotide opposite an 8-oxodG. In particular, MUTYH activity on 8-oxodG:rA mispairs is fully inhibited, although its binding capacity is retained. This results in the reduction of RNase H2 incision capability of this substrate. Thus complex mispairs formed by an oxidized base and a ribonucleotide can compromise BER and RER in repeated sequences. PMID:26338705
Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.
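An in silico SSR survey of the kind described above can be illustrated with a small regular-expression scan. This is a sketch of perfect-SSR detection in general, not a reimplementation of MISA; the minimum-repeat thresholds in `MIN_REPEATS` are assumed values chosen for illustration, and the abstract does not state the thresholds the authors used.

```python
import re

# Assumed thresholds: repeat-unit length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit
    (for example, 'AA' and 'ACAC' are not primitive)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str):
    """Return a sorted list of (start, motif, copies) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # A lookahead lets the scan test every start position, so a skipped
        # non-primitive match cannot hide a real tract that begins inside it.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (unit, min_rep - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            tract, motif = m.group(1), m.group(2)
            if m.start() < last_end or not is_primitive(motif):
                continue  # inside a tract already reported, or a shorter unit
            hits.append((m.start(), motif, len(tract) // unit))
            last_end = m.start() + len(tract)
    return sorted(hits)


def ssr_density_per_mb(n_ssrs: int, length_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return n_ssrs / (length_bp / 1_000_000)
```

For example, `find_ssrs("TTGG" + "AC" * 7 + "GGT" + "CAG" * 5 + "TT" + "A" * 12 + "G")` reports one dinucleotide, one trinucleotide and one mononucleotide tract. Each tract is reported in the phase at which it is first encountered from the left, and start positions are 0-based.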
Multivalent Binding of Formin-binding Protein 21 (FBP21)-Tandem-WW Domains Fosters Protein Recognition in the Pre-spliceosome

PubMed Central

Klippel, Stefan; Wieczorek, Marek; Schümann, Michael; Krause, Eberhard; Marg, Berenice; Seidel, Thorsten; Meyer, Tim; Knapp, Ernst-Walter; Freund, Christian

2011-01-01

The high abundance of repetitive but nonidentical proline-rich sequences in spliceosomal proteins raises the question of how these known interaction motifs recruit their interacting protein domains. Whereas complex formation of these adaptors with individual motifs has been studied in great detail, little is known about the binding mode of domains arranged in tandem repeats and long proline-rich sequences including multiple motifs. Here we studied the interaction of the two adjacent WW domains of spliceosomal protein FBP21 with several ligands of different lengths and composition to elucidate the hallmarks of multivalent binding for this class of recognition domains. First, we show that many of the proteins that define the cellular proteome interacting with FBP21-WW1-WW2 contain multiple proline-rich motifs. Among these is the newly identified binding partner SF3B4. Fluorescence resonance energy transfer (FRET) analysis reveals that the tandem-WW domains of FBP21 interact with splicing factor 3B4 (SF3B4) in nuclear speckles where splicing takes place. Isothermal titration calorimetry and NMR show that the tandem arrangement of WW domains and the multivalency of the proline-rich ligands both contribute to affinity enhancement. However, ligand exchange remains fast compared with the NMR time scale. Surprisingly, an N-terminal spin label attached to a bivalent ligand induces NMR line broadening of signals corresponding to both WW domains of the FBP21-WW1-WW2 protein. This suggests that distinct orientations of the ligand contribute to a delocalized and semispecific binding mode that should facilitate search processes within the spliceosome. PMID:21917930
A variable number of tandem repeats in the 3'-untranslated region of the dopamine transporter modulates striatal function during working memory updating across the adult age span.

PubMed

Sambataro, Fabio; Podell, Jamie E; Murty, Vishnu P; Das, Saumitra; Kolachana, Bhaskar; Goldberg, Terry E; Weinberger, Daniel R; Mattay, Venkata S

2015-08-01

Dopamine modulation of striatal function is critical for executive functions such as working memory (WM) updating. The dopamine transporter (DAT) regulates striatal dopamine signaling via synaptic reuptake. A variable number of tandem repeats in the 3'-untranslated region of SLC6A3 (DAT1-3'-UTR-VNTR) is associated with DAT expression, such that 9-repeat allele carriers tend to express lower levels (associated with higher extracellular dopamine concentrations) than 10-repeat homozygotes. Aging is also associated with decline of the dopamine system. The goal of the present study was to investigate the effects of aging and DAT1-3'-UTR-VNTR on the neural activity and functional connectivity of the striatum during WM updating. Our results showed both an age-related decrease in striatal activity and an effect of DAT1-3'-UTR-VNTR. Ten-repeat homozygotes showed reduced striatal activity and increased striatal-hippocampal connectivity during WM updating relative to the 9-repeat carriers. There was no age by DAT1-3'-UTR-VNTR interaction. These results suggest that, whereas striatal function during WM updating is modulated by both age and genetically determined DAT levels, the rate of the age-related decline in striatal function is similar across both DAT1-3'-UTR-VNTR genotype groups. They further suggest that, because of the baseline difference in striatal function based on DAT1-3'-UTR-VNTR polymorphism, 10-repeat homozygotes, who have lower levels of striatal function throughout the adult life span, may reach a threshold of decreased striatal function and manifest impairments in cognitive processes mediated by the striatum earlier in life than the 9-repeat carriers. Our data suggest that age and DAT1-3'-UTR-VNTR polymorphism independently modulate striatal function. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.

Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

PubMed Central

2010-01-01

Background Expressed sequence tags (ESTs) are a cost-effective tool in molecular biology and represent an abundant, valuable resource for genome annotation, gene expression analysis, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 high-quality EST sequences. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047); of these, 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. Marker polymorphism was high in plum (65%) and lower in peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further studies such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species. PMID:20626882
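The counts in this abstract can be cross-checked against one another. The sketch below uses only numbers stated above; the variable names are illustrative, and treating the accession range as one accession per unigene is an assumption that happens to agree with the reported total.

```python
# Figures reported in the abstract.
clones_sequenced = 10_123
high_quality_ests = 8_656
contigs = 1_492
singletons = 2_981

# Unigenes are the contigs plus the singletons.
unigenes = contigs + singletons

# Size of the accession range GW868575 - GW873047, endpoints included.
accessions_in_range = 873_047 - 868_575 + 1

# Share of sequenced clones that yielded a high-quality EST.
high_quality_pct = round(100 * high_quality_ests / clones_sequenced, 1)

# Share of the 42 tested primer pairs that gave true-to-type SSR loci.
validated_primer_pct = round(100 * 14 / 42, 1)
```

The contig and singleton counts sum to the 4,473 unigenes reported, and the accession range spans the same number of identifiers.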
Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra)

PubMed Central

2012-01-01

Background Chinese bayberry (Myrica rubra Sieb. and Zucc.) is a subtropical evergreen tree originating in China. It has been cultivated in southern China for several thousand years, and annual production has reached 1.1 million tons. The taste and high level of health promoting characters identified in the fruit in recent years has stimulated its extension in China and introduction to Australia. A limited number of co-dominant markers have been developed and applied in genetic diversity and identity studies. Here we report, for the first time, a survey of whole genome shotgun data to develop a large number of simple sequence repeat (SSR) markers to analyse the genetic diversity of the common cultivated Chinese bayberry and the relationship with three other Myrica species. Results The whole genome shotgun survey of Chinese bayberry produced 9.01Gb of sequence data, about 26x coverage of the estimated genome size of 323 Mb. The genome sequences were highly heterozygous, but with little duplication. From the initial assembled scaffold covering 255 Mb sequence data, 28,602 SSRs (≥5 repeats) were identified. Dinucleotide was the most common repeat motif with a frequency of 84.73%, followed by 13.78% trinucleotide, 1.34% tetranucleotide, 0.12% pentanucleotide and 0.04% hexanucleotide. From 600 primer pairs, 186 polymorphic SSRs were developed. Of these, 158 were used to screen 29 Chinese bayberry accessions and three other Myrica species: 91.14%, 89.87% and 46.84% SSRs could be used in Myrica adenophora, Myrica nana and Myrica cerifera, respectively. The UPGMA dendrogram tree showed that cultivated Myrica rubra is closely related to Myrica adenophora and Myrica nana, originating in southwest China, and very distantly related to Myrica cerifera, originating in America. These markers can be used in the construction of a linkage map and for genetic diversity studies in Myrica species. Conclusion Myrica rubra has a small genome of about 323 Mb with a high level of
SINE sequences detect DNA fingerprints in salmonid fishes.

PubMed

Spruell, P; Thorgaard, G H

1996-04-01

DNA probes homologous to two previously described salmonid short interspersed nuclear elements (SINEs) detected DNA fingerprint patterns in 14 species of salmonid fishes. The probes showed more homology to some species than to others and little homology to three nonsalmonid fishes. The DNA fingerprint patterns derived from the SINE probes are individual-specific and inherited in a Mendelian manner. Probes derived from different regions of the same SINE detect only partially overlapping banding patterns, reflecting a more complex SINE structure than has been previously reported. Like the human Alu sequence, the SINEs found in salmonids could provide useful genetic markers and primer sites for PCR-based techniques. These elements may be more desirable for some applications than traditional DNA fingerprinting probes that detect tandemly repeated arrays.
The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Ruiying; Zheng, Han; Preamplume, Gan

The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.
Plant chromosomes from end to end: telomeres, heterochromatin and centromeres.

PubMed

Lamb, Jonathan C; Yu, Weichang; Han, Fangpu; Birchler, James A

2007-04-01

Recent evidence indicates that heterochromatin in plants is composed of heterogeneous sequences, which are usually composed of transposable elements or tandem repeat arrays. These arrays are associated with chromatin modifications that produce a closed configuration that limits transcription. Centromere sequences in plants are usually composed of tandem repeat arrays that are homogenized across the genome. Analysis of such arrays in closely related taxa suggests a rapid turnover of the repeat unit that is typical of a particular species. In addition, two lines of evidence for an epigenetic component of centromere specification have been reported, namely an example of a neocentromere formed over sequences without the typical repeat array and examples of centromere inactivation. Although the telomere repeat unit is quite prevalent in the plant kingdom, unusual repeats have been found in some families. Recently, it was demonstrated that the introduction of telomere sequences into plants cells causes truncation of the chromosomes, and that this technique can be used to produce artificial chromosome platforms.
Tracking neural correlates of successful learning over repeated sequence observations

PubMed Central

Steinemann, Natalie A.; Moisello, Clara; Ghilardi, M. Felice; Kelly, Simon P.

2016-01-01

The neural correlates of memory formation in humans have long been investigated by exposing subjects to diverse material and comparing responses to items later remembered to those forgotten. Tasks requiring memorization of sensory sequences afford unique possibilities for linking neural memorization processes to behavior, because, rather than comparing across different items of varying content, each individual item can be examined across the successive learning states of being initially unknown, newly learned, and eventually, fully known. Sequence learning paradigms have not yet been exploited in this way, however. Here, we analyze the event-related potentials of subjects attempting to memorize sequences of visual locations over several blocks of repeated observation, with respect to pre- and post-block recall tests. Over centro-parietal regions, we observed a rapid P300 component superimposed on a broader positivity, which exhibited distinct modulations across learning states that were replicated in two separate experiments. Consistent with its well-known encoding of surprise, the P300 deflection monotonically decreased over blocks as locations became better learned and hence more expected. In contrast, the broader positivity was especially elevated at the point when a given item was newly learned, i.e., started being successfully recalled. These results implicate the Broad Positivity in endogenously-driven, intentional memory formation, whereas the P300, in processing the current stimulus to the degree that it was previously uncertain, indexes the cumulative knowledge thereby gained. The decreasing surprise/P300 effect significantly predicted learning success both across blocks and across subjects. This presents a new, neural-based means to evaluate learning capabilities independent of verbal reports, which could have considerable value in distinguishing genuine learning disabilities from difficulties to communicate the outcomes of learning, or perceptual
47 CFR 69.111 - Tandem-switched transport and tandem charge.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 47 Telecommunication 3 2011-10-01 2011-10-01 false Tandem-switched transport and tandem charge. 69... SERVICES (CONTINUED) ACCESS CHARGES Computation of Charges § 69.111 Tandem-switched transport and tandem...-switched transport shall consist of two rate elements, a transmission charge and a tandem switching charge...
Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

PubMed Central

2012-01-01

Background Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding
Multilocus Variable-Number-Tandem-Repeats Analysis (MLVA) distinguishes a clonal complex of Clavibacter michiganensis subsp. michiganensis strains isolated from recent outbreaks of bacterial wilt and canker in Belgium

PubMed Central

2013-01-01

Background Clavibacter michiganensis subsp. michiganensis (Cmm) causes bacterial wilt and canker in tomato. Cmm is present in nearly all European countries. During the last three years several local outbreaks were detected in Belgium. The lack of a convenient high-resolution strain-typing method has hampered the study of the routes of transmission of Cmm and epidemiology in tomato cultivation. In this study the genetic relatedness among a worldwide collection of Cmm strains and their relatives was assessed by gyrB and dnaA gene sequencing. Further, we developed and applied a multilocus variable number of tandem repeats analysis (MLVA) scheme to discriminate among Cmm strains. Results A phylogenetic analysis of gyrB and dnaA gene sequences of 56 Cmm strains demonstrated that Belgian Cmm strains from recent outbreaks of 2010–2012 form a genetically uniform group within the Cmm clade, and Cmm is phylogenetically distinct from other Clavibacter subspecies and from non-pathogenic Clavibacter-like strains. MLVA conducted with eight minisatellite loci detected 25 haplotypes within Cmm. All strains from Belgian outbreaks, isolated between 2010 and 2012, together with two French strains from 2010 seem to form one monomorphic group. Regardless of the isolation year, location or tomato cultivar, Belgian strains from recent outbreaks belonged to the same haplotype. In contrast, strains from diverse geographical locations or isolated over longer periods of time formed mostly singletons. Conclusions We hypothesise that the introduction might have originated from one lot of seeds or contaminated tomato seedlings that was the source of the outbreak in 2010 and that these Cmm strains persisted and induced infection in 2011 and 2012. Our results demonstrate that MLVA is a promising typing technique for local surveillance and outbreak investigation in epidemiological studies of Cmm. PMID:23738754
Multilocus variable-number-tandem-repeats analysis (MLVA) distinguishes a clonal complex of Clavibacter michiganensis subsp. michiganensis strains isolated from recent outbreaks of bacterial wilt and canker in Belgium.

PubMed

Zaluga, Joanna; Stragier, Pieter; Van Vaerenbergh, Johan; Maes, Martine; De Vos, Paul

2013-06-05

Clavibacter michiganensis subsp. michiganensis (Cmm) causes bacterial wilt and canker in tomato. Cmm is present in nearly all European countries. During the last three years several local outbreaks were detected in Belgium. The lack of a convenient high-resolution strain-typing method has hampered the study of the routes of transmission of Cmm and epidemiology in tomato cultivation. In this study the genetic relatedness among a worldwide collection of Cmm strains and their relatives was assessed by gyrB and dnaA gene sequencing. Further, we developed and applied a multilocus variable number of tandem repeats analysis (MLVA) scheme to discriminate among Cmm strains. A phylogenetic analysis of gyrB and dnaA gene sequences of 56 Cmm strains demonstrated that Belgian Cmm strains from recent outbreaks of 2010-2012 form a genetically uniform group within the Cmm clade, and Cmm is phylogenetically distinct from other Clavibacter subspecies and from non-pathogenic Clavibacter-like strains. MLVA conducted with eight minisatellite loci detected 25 haplotypes within Cmm. All strains from Belgian outbreaks, isolated between 2010 and 2012, together with two French strains from 2010 seem to form one monomorphic group. Regardless of the isolation year, location or tomato cultivar, Belgian strains from recent outbreaks belonged to the same haplotype. In contrast, strains from diverse geographical locations or isolated over longer periods of time formed mostly singletons. We hypothesise that the introduction might have originated from one lot of seeds or contaminated tomato seedlings that was the source of the outbreak in 2010 and that these Cmm strains persisted and induced infection in 2011 and 2012. Our results demonstrate that MLVA is a promising typing technique for local surveillance and outbreak investigation in epidemiological studies of Cmm.
Accurate quantification of chromosomal lesions via short tandem repeat analysis using minimal amounts of DNA

PubMed Central

Jann, Johann-Christoph; Nowak, Daniel; Nolte, Florian; Fey, Stephanie; Nowak, Verena; Obländer, Julia; Pressler, Jovita; Palme, Iris; Xanthopoulos, Christina; Fabarius, Alice; Platzbecker, Uwe; Giagounidis, Aristoteles; Götze, Katharina; Letsch, Anne; Haase, Detlef; Schlenk, Richard; Bug, Gesine; Lübbert, Michael; Ganser, Arnold; Germing, Ulrich; Haferlach, Claudia; Hofmann, Wolf-Karsten; Mossner, Maximilian

2017-01-01

Background Cytogenetic aberrations such as deletion of chromosome 5q (del(5q)) represent key elements in routine clinical diagnostics of haematological malignancies. Currently established methods such as metaphase cytogenetics, FISH or array-based approaches have limitations due to their dependency on viable cells, high costs or semi-quantitative nature. Importantly, they cannot be used on low abundance DNA. We therefore aimed to establish a robust and quantitative technique that overcomes these shortcomings. Methods For precise determination of del(5q) cell fractions, we developed an inexpensive multiplex-PCR assay requiring only nanograms of DNA that simultaneously measures allelic imbalances of 12 independent short tandem repeat markers. Results Application of this method to n=1142 samples from n=260 individuals revealed strong intermarker concordance (R²=0.77–0.97) and reproducibility (mean SD: 1.7%). Notably, the assay showed accurate quantification via standard curve assessment (R²>0.99) and high concordance with paired FISH measurements (R²=0.92) even with subnanogram amounts of DNA. Moreover, cytogenetic response was reliably confirmed in del(5q) patients with myelodysplastic syndromes treated with lenalidomide. While the assay demonstrated good diagnostic accuracy in receiver operating characteristic analysis (area under the curve: 0.97), we further observed robust correlation between bone marrow and peripheral blood samples (R²=0.79), suggesting its potential suitability for less-invasive clonal monitoring. Conclusions In conclusion, we present an adaptable tool for quantification of chromosomal aberrations, particularly in problematic samples, which should be easily applicable to further tumour entities. PMID:28600436
Recommendation of short tandem repeat profiling for authenticating human cell lines, stem cells, and tissues.

PubMed

Barallon, Rita; Bauer, Steven R; Butler, John; Capes-Davis, Amanda; Dirks, Wilhelm G; Elmore, Eugene; Furtado, Manohar; Kline, Margaret C; Kohara, Arihiro; Los, Georgyi V; MacLeod, Roderick A F; Masters, John R W; Nardone, Mark; Nardone, Roland M; Nims, Raymond W; Price, Paul J; Reid, Yvonne A; Shewale, Jaiprakash; Sykes, Gregory; Steuer, Anton F; Storts, Douglas R; Thomson, Jim; Taraporewala, Zenobia; Alston-Roberts, Christine; Kerrigan, Liz

2010-10-01

Cell misidentification and cross-contamination have plagued biomedical research for as long as cells have been employed as research tools. Examples of misidentified cell lines continue to surface to this day. Efforts to eradicate the problem by raising awareness of the issue and by asking scientists voluntarily to take appropriate actions have not been successful. Unambiguous cell authentication is an essential step in the scientific process and should be an inherent consideration during peer review of papers submitted for publication or during review of grants submitted for funding. In order to facilitate proper identity testing, accurate, reliable, inexpensive, and standardized methods for authentication of cells and cell lines must be made available. To this end, an international team of scientists is, at this time, preparing a consensus standard on the authentication of human cells using short tandem repeat (STR) profiling. This standard, which will be submitted for review and approval as an American National Standard by the American National Standards Institute, will provide investigators guidance on the use of STR profiling for authenticating human cell lines. Such guidance will include methodological detail on the preparation of the DNA sample, the appropriate numbers and types of loci to be evaluated, and the interpretation and quality control of the results. Associated with the standard itself will be the establishment and maintenance of a public STR profile database under the auspices of the National Center for Biotechnology Information. The consensus standard is anticipated to be adopted by granting agencies and scientific journals as appropriate methodology for authenticating human cell lines, stem cells, and tissues.
Recommendation of short tandem repeat profiling for authenticating human cell lines, stem cells, and tissues

PubMed Central

Barallon, Rita; Bauer, Steven R.; Butler, John; Capes-Davis, Amanda; Dirks, Wilhelm G.; Furtado, Manohar; Kline, Margaret C.; Kohara, Arihiro; Los, Georgyi V.; MacLeod, Roderick A. F.; Masters, John R. W.; Nardone, Mark; Nardone, Roland M.; Nims, Raymond W.; Price, Paul J.; Reid, Yvonne A.; Shewale, Jaiprakash; Sykes, Gregory; Steuer, Anton F.; Storts, Douglas R.; Thomson, Jim; Taraporewala, Zenobia; Alston-Roberts, Christine; Kerrigan, Liz

2010-01-01

Cell misidentification and cross-contamination have plagued biomedical research for as long as cells have been employed as research tools. Examples of misidentified cell lines continue to surface to this day. Efforts to eradicate the problem by raising awareness of the issue and by asking scientists voluntarily to take appropriate actions have not been successful. Unambiguous cell authentication is an essential step in the scientific process and should be an inherent consideration during peer review of papers submitted for publication or during review of grants submitted for funding. In order to facilitate proper identity testing, accurate, reliable, inexpensive, and standardized methods for authentication of cells and cell lines must be made available. To this end, an international team of scientists is, at this time, preparing a consensus standard on the authentication of human cells using short tandem repeat (STR) profiling. This standard, which will be submitted for review and approval as an American National Standard by the American National Standards Institute, will provide investigators guidance on the use of STR profiling for authenticating human cell lines. Such guidance will include methodological detail on the preparation of the DNA sample, the appropriate numbers and types of loci to be evaluated, and the interpretation and quality control of the results. Associated with the standard itself will be the establishment and maintenance of a public STR profile database under the auspices of the National Center for Biotechnology Information. The consensus standard is anticipated to be adopted by granting agencies and scientific journals as appropriate methodology for authenticating human cell lines, stem cells, and tissues. PMID:20614197
The complete chloroplast genome sequence of Epipremnum aureum and its comparative analysis among eight Araceae species

PubMed Central

Han, Limin; Chen, Chen; Wang, Zhezhi

2018-01-01

Epipremnum aureum is an important foliage plant in the Araceae family. In this study, we have sequenced the complete chloroplast genome of E. aureum by using Illumina Hiseq sequencing platforms. This genome is a double-stranded circular DNA sequence of 164,831 bp that contains 35.8% GC. The two inverted repeats (IRa and IRb; 26,606 bp) are spaced by a small single-copy region (22,868 bp) and a large single-copy region (88,751 bp). The chloroplast genome has 131 (113 unique) functional genes, including 86 (79 unique) protein-coding genes, 37 (30 unique) tRNA genes, and eight (four unique) rRNA genes. Tandem repeats comprise the majority of the 43 long repetitive sequences. In addition, 111 simple sequence repeats are present, with mononucleotides being the most common type and di- and tetranucleotides being infrequent events. Positive selection pressure on rps12 in the E. aureum chloroplast has been demonstrated via synonymous and nonsynonymous substitution rates and selection pressure sites analyses. Ycf15 and infA are pseudogenes in this species. We constructed a Maximum Likelihood phylogenetic tree based on the complete chloroplast genomes of 38 species from 13 families. Those results strongly indicated that E. aureum is positioned as the sister of Colocasia esculenta within the Araceae family. This work may provide information for further study of the molecular phylogenetic relationships within Araceae, as well as molecular markers and breeding novel varieties by chloroplast genetic-transformation of E. aureum in particular. PMID:29529038
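The region lengths quoted in this abstract can be checked against the reported genome size. The short sketch below assumes the usual quadripartite plastome layout (one large single-copy region, one small single-copy region, and two copies of the inverted repeat); the function and variable names are illustrative and do not come from the paper.

```python
# Region lengths in bp, as reported in the abstract above.
LSC = 88_751             # large single-copy region
SSC = 22_868             # small single-copy region
IR = 26_606              # length of each inverted repeat (IRa and IRb)
REPORTED_TOTAL = 164_831


def genome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total = genome_length(LSC, SSC, IR)
ir_fraction = 2 * IR / total  # share of the genome occupied by the two IRs

print(f"computed total: {total} bp, matches reported: {total == REPORTED_TOTAL}")
print(f"inverted repeats cover {ir_fraction:.1%} of the genome")
```

The computed sum agrees with the 164,831 bp stated in the abstract, so the four region lengths are internally consistent.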
Investigation into the sequence structure of 23 Y chromosomal STR loci using massively parallel sequencing.

PubMed

Kwon, So Yeun; Lee, Hwan Young; Kim, Eun Hye; Lee, Eun Young; Shin, Kyoung-Jin

2016-11-01

Next-generation sequencing (NGS) can produce massively parallel sequencing (MPS) data for many targeted regions with a high depth of coverage, suggesting its successful application to the amplicons of forensic genetic markers. In the present study, we evaluated the practical utility of MPS in Y-chromosome short tandem repeat (Y-STR) analysis using a multiplex polymerase chain reaction (PCR) system. The multiplex PCR system simultaneously amplified 24 Y-chromosomal markers, including the PowerPlex® Y23 loci (DYS19, DYS385ab, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS481, DYS533, DYS549, DYS570, DYS576, DYS635, DYS643, and YGATAH4) and the M175 marker, with small amplicons ranging from 85 to 253 bp. The barcoded libraries for the amplicons of the 24 Y-chromosomal markers were produced using a simplified PCR-based library preparation method and successfully sequenced using MPS on a MiSeq® System with samples from 250 unrelated Korean males. The genotyping concordance between MPS and the capillary electrophoresis (CE) method and the sequence structure of the 23 Y-STRs were investigated. Three samples exhibited discordance between the MPS and CE results at DYS385, DYS439, and DYS576. There were 12 Y-STR loci that showed sequence variations in the alleles by a fragment size determination, and the most varied alleles occurred in DYS389II with a different sequence structure in the repeat region. The largest increase in gene diversity between the CE and MPS results was in DYS437 at +34.41%. Single nucleotide polymorphisms (SNPs), insertions, and deletions (indels) were observed in the flanking regions of DYS481, DYS576, and DYS385, respectively. Stutter and noise ratios of the 23 Y-STRs using the developed MPS system were also investigated. Based on these results, the MPS analysis system used in this study could facilitate the investigation into the sequences of the 23 Y-STRs in forensic
Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Haider, Nadia

2017-01-01

Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Variable number of tandem repeat profiles and antimicrobial resistance patterns of Staphylococcus haemolyticus strains isolated from blood cultures in children.

PubMed

Hosseinkhani, Faride; Jabalameli, Fereshteh; Nodeh Farahani, Narges; Taherikalani, Morovat; van Leeuwen, Willem B; Emaneini, Mohammad

2016-03-01

Staphylococcus haemolyticus is a healthcare-associated pathogen that can cause a variety of life-threatening infections. Additionally, multi-drug resistant (MDR) isolates, in particular methicillin-resistant S. haemolyticus (MRSH), have emerged. Dissemination of such strains can be of great concern in the hospital environment. A total of 20 S. haemolyticus isolates from blood cultures obtained from children were included in this study. A high prevalence of MDR-MRSH isolates with high MIC values to vancomycin was found, and 35% of the isolates were intermediately resistant to vancomycin. Multilocus variable number of tandem repeats analysis (MLVF) revealed 5 MLVF types among the 20 isolates of S. haemolyticus. Twelve isolates shared the same MLVF type and were isolated from different wards in a pediatric hospital in Iran. This is a serious warning for infection control: in the absence of adequate infection diagnostics and infection control guidelines, these resistant strains can spread to other sectors of a hospital and possibly into the community. Copyright © 2015 Elsevier B.V. All rights reserved.
Occurrence and Nature of Double Alleles in Variable-Number Tandem-Repeat Patterns of More than 8,000 Mycobacterium tuberculosis Complex Isolates in The Netherlands

PubMed Central

Kamst, Miranda; van Hunen, Rianne; de Zwaan, Carolina Catherina; Mulder, Arnout; Supply, Philip; Anthony, Richard; van der Hoek, Wim; van Soolingen, Dick

2017-01-01

ABSTRACT Since 2004, variable-number tandem-repeat (VNTR) typing of Mycobacterium tuberculosis complex isolates has been applied on a structural basis in The Netherlands to study the epidemiology of tuberculosis (TB). Although this technique is faster and technically less demanding than the previously used restriction fragment length polymorphism (RFLP) typing, reproducibility remains a concern. In the period from 2004 to 2015, 8,532 isolates were subjected to VNTR typing in The Netherlands, with 186 (2.2%) of these exhibiting double alleles at one locus. Double alleles were most common in loci 4052 and 2163b. The variables significantly associated with double alleles were urban living (odds ratio [OR], 1.503; 95% confidence interval [CI], 1.084 to 2.084; P = 0.014) and pulmonary TB (OR, 1.703; 95% CI, 1.216 to 2.386; P = 0.002). Single-colony cultures of double-allele strains were produced and revealed single-allele profiles; a maximum of five single nucleotide polymorphisms (SNPs) was observed between the single- and double-allele isolates from the same patient when whole-genome sequencing (WGS) was applied. This indicates the presence of two bacterial populations with slightly different VNTR profiles in the parental population, related to genetic drift. This observation is confirmed by the fact that secondary cases from TB source cases with double-allele isolates sometimes display only one of the two alleles present in the source case. Double alleles occur at a frequency of 2.2% in VNTR patterns in The Netherlands. They are caused by biological variation rather than by technical aberrations and can be transmitted either as single- or double-allele variants. PMID:29142049
Mutation rates at 42 Y chromosomal short tandem repeats in Chinese Han population in Eastern China.

PubMed

Wu, Weiwei; Ren, Wenyan; Hao, Honglei; Nan, Hailun; He, Xin; Liu, Qiuling; Lu, Dejian

2018-01-31

Mutation analysis of 42 Y-chromosomal short tandem repeat (Y-STR) loci was performed using a sample of 1160 father-son pairs from the Chinese Han population in Eastern China. The results showed that the average mutation rate across the 42 Y-STR loci was 0.0041 (95% CI 0.0036-0.0047) per locus per generation. The locus-specific mutation rates varied from 0.000 to 0.0190. No mutation was found at DYS388, DYS437, DYS448, DYS531, and GATA_H4. DYS627, DYS570, DYS576, and DYS449 could be classified as rapidly mutating Y-STRs, with mutation rates higher than 1.0 × 10⁻². DYS458, DYS630, and DYS518 were moderately mutating Y-STRs, with mutation rates ranging from 8 × 10⁻³ to 1 × 10⁻². Although the characteristics of the Y-STR mutations were consistent with those in previous studies, mutation rate differences between our data and previously published data were found at some rapidly mutating Y-STRs. The single-copy loci located on the short arm of the Y chromosome (Yp) showed relatively higher mutation rates more frequently than the multi-copy loci. These results will not only extend the data for Y-STR mutations but also be important for kinship analysis, paternal lineage identification, and family relationship reconstruction in forensic Y-STR analysis.
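As a rough illustration of how an average per-locus, per-generation mutation rate and its confidence interval can be derived, the sketch below treats each locus in each father-son pair as one meiosis (42 × 1160 = 48,720 in total). The abstract does not state the number of mutations observed; the count of 200 used here is an assumption back-calculated from the reported rate of 0.0041. The Wilson score interval is used for convenience and is not necessarily the method the authors applied.

```python
import math


def wilson_interval(events: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    p = events / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width


LOCI = 42
FATHER_SON_PAIRS = 1160
MEIOSES = LOCI * FATHER_SON_PAIRS   # 48,720 locus transmissions
ASSUMED_MUTATIONS = 200             # hypothetical count, not given in the abstract

rate = ASSUMED_MUTATIONS / MEIOSES
low, high = wilson_interval(ASSUMED_MUTATIONS, MEIOSES)
print(f"rate = {rate:.4f} per locus per generation (95% CI {low:.4f}-{high:.4f})")
```

Under these assumptions the result is close to the figures quoted in the abstract, which suggests the reported interval is a binomial interval over all locus transmissions.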
Formation and Repair of Mismatches Containing Ribonucleotides and Oxidized Bases at Repeated DNA Sequences.

PubMed

Cilli, Piera; Minoprio, Anna; Bossa, Cecilia; Bignami, Margherita; Mazzei, Filomena

2015-10-23

The cellular pool of ribonucleotide triphosphates (rNTPs) is higher than that of deoxyribonucleotide triphosphates. To ensure genome stability, DNA polymerases must discriminate against rNTPs and incorporated ribonucleotides must be removed by ribonucleotide excision repair (RER). We investigated DNA polymerase β (POL β) capacity to incorporate ribonucleotides into trinucleotide repeated DNA sequences and the efficiency of base excision repair (BER) and RER enzymes (OGG1, MUTYH, and RNase H2) when presented with an incorrect sugar and an oxidized base. POL β incorporated rAMP and rCMP opposite 7,8-dihydro-8-oxoguanine (8-oxodG) and extended both mispairs. In addition, POL β was able to insert and elongate an oxidized rGMP when paired with dA. We show that RNase H2 always preserves the capacity to remove a single ribonucleotide when paired to an oxidized base or to incise an oxidized ribonucleotide in a DNA duplex. In contrast, BER activity is affected by the presence of a ribonucleotide opposite an 8-oxodG. In particular, MUTYH activity on 8-oxodG:rA mispairs is fully inhibited, although its binding capacity is retained. This results in the reduction of RNase H2 incision capability of this substrate. Thus complex mispairs formed by an oxidized base and a ribonucleotide can compromise BER and RER in repeated sequences. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.

Linking Y-chromosomal short tandem repeat loci to human male impulsive aggression.

PubMed

Yang, Chun; Ba, Huajie; Cao, Yin; Dong, Guoying; Zhang, Shuyou; Gao, Zhiqin; Zhao, Hanqing; Zhou, Xianju

2017-11-01

Men are more susceptible to impulsive behavior than women. Epidemiological studies revealed that impulsive aggressive behavior is affected by genetic factors, and the male-specific Y chromosome plays an important role in this behavior. In this study, we investigated the association between impulsive aggressive behavior and Y-chromosomal short tandem repeat (Y-STR) loci. The biological samples collected from 271 offenders with impulsive aggressive behavior and 492 healthy individuals without impulsive aggressive behavior were amplified by the PowerPlex® Y23 PCR System, and the resultant products were separated by electrophoresis and further genotyped. Then, comparisons in allele and haplotype frequencies of the selected 22 Y-STRs were made in the two groups. Our results showed that there were significant differences in allele frequencies at DYS448 and DYS456 between offenders and controls (p < .05). Univariate analysis further revealed significant frequency differences for alleles 18 and 22 at DYS448 (0.18 vs 0.27, compared to the controls, p = .003, OR = 0.57, 95% CI = 0.39-0.82; 0.03 vs 0.01, compared to the controls, p = .003, OR = 7.45, 95% CI = 1.57-35.35, respectively) and for allele 17 at DYS456 (0.07 vs 0.14, compared to the controls, p = .006, OR = 0.48, 95% CI = 0.28-0.82) between two groups. Interestingly, the frequency of haploid haplotype 22-15 on the DYS448-DYS456 (DYS448-DYS456-22-15) was significantly higher in offenders than in controls (0.033 vs 0.004, compared to the control, p = .001, OR = 8.42, 95% CI = 1.81-39.24). Moreover, there were no significant differences in allele frequencies of other Y-STR loci between two groups. Furthermore, the unconditional logistic regression analysis confirmed that alleles 18 and 22 at DYS448 and allele 17 at DYS456 are associated with male impulsive aggression. However, the DYS448-DYS456-22-15 is less related to impulsive aggression. Our results suggest a link between Y-chromosomal allele types and male
How proteins bind to DNA: target discrimination and dynamic sequence search by the telomeric protein TRF1

PubMed Central

2017-01-01

Abstract Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355
Molecular Identification of Sex in Phoenix dactylifera Using Inter Simple Sequence Repeat Markers.

PubMed

Al-Ameri, Abdulhafed A; Al-Qurainy, Fahad; Gaafar, Abdel-Rhman Z; Khan, Salim; Nadeem, M

2016-01-01

Early sex identification of date palm (Phoenix dactylifera L.) at the seedling stage is an economically desirable objective that would significantly increase the profits of seed-based cultivation. The use of molecular markers at this stage for early and rapid identification of sex is important because morphological markers are lacking. In this study, a total of two hundred Inter Simple Sequence Repeat (ISSR) primers were screened among male and female date palm plants to identify putative sex-specific markers, of which only two primers (IS_A02 and IS_A71) were found to be associated with sex. The primer IS_A02 produced a unique band of 390 bp that was clearly present in all female plants and absent in all male plants. In contrast, the primer IS_A71 produced a unique band of 380 bp that was clearly present in all male plants and absent in all female plants. Subsequently, these specific fragments were excised, purified, and sequenced for the future development of sequence-specific markers for sex determination in dioecious date palm. These markers are efficient, highly reliable, and reproducible for sex identification at the early seedling stage.
Fifteen non-CODIS autosomal short tandem repeat loci multiplex data from nine population groups living in Taiwan.

PubMed

Hwa, Hsiao-Lin; Chang, Yih-Yuan; Lee, James Chun-I; Lin, Chun-Yen; Yin, Hsiang-Yi; Tseng, Li-Hui; Su, Yi-Ning; Ko, Tsang-Ming

2012-07-01

The analysis of autosomal short tandem repeat (STR) loci is a powerful tool in forensic genetics. We developed a multiplex system in which 15 non-Combined DNA Index System autosomal STRs (D3S1744, D4S2366, D8S1110, D10S2325, D12S1090, D13S765, D14S608, Penta E, D17S1294, D18S536, D18S1270, D20S470, D21S1437, Penta D, and D22S683) could be amplified in one single polymerase chain reaction. DNA samples from 1,098 unrelated subjects of nine population groups living in Taiwan, including Taiwanese Han, indigenous Taiwanese of Taiwan Island, Tao, mainland Chinese, Filipinos, Thais, Vietnamese, Indonesians, and Caucasians, were collected and analyzed using this system. The distributions of the allelic frequencies and the forensic parameters of each population group were presented. The combined discrimination power and the combined power of exclusion were high in all population groups tested in this study. A multidimensional scaling plot of these nine population groups based on the Reynolds' genetic distances calculated from 15 autosomal STRs was constructed, and the genetic substructure in this area was presented. In conclusion, this 15 autosomal STR multiplex system provides highly informative STR data and appears useful in forensic casework and parentage testing in different populations.
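The abstract reports a combined discrimination power and a combined power of exclusion without giving the formulas. Under the common assumption that the loci are independent, both are obtained as one minus the product of the per-locus complements. The sketch below illustrates that calculation; the per-locus values are hypothetical placeholders, not data from the study.

```python
import math
from typing import Iterable


def combined_power(per_locus: Iterable[float]) -> float:
    """Combine per-locus powers (discrimination or exclusion), assuming
    independent loci: 1 - product(1 - p_i)."""
    return 1.0 - math.prod(1.0 - p for p in per_locus)


# Hypothetical per-locus values for a small panel (not taken from the paper).
example_pd = [0.92, 0.88, 0.95, 0.90]   # power of discrimination per locus
example_pe = [0.55, 0.48, 0.62, 0.51]   # power of exclusion per locus

print(f"combined PD: {combined_power(example_pd):.6f}")
print(f"combined PE: {combined_power(example_pe):.6f}")
```

Because each additional locus multiplies the residual probability by a factor below one, a 15-locus panel can reach very high combined values even when individual loci are only moderately informative.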
Highly Effective DNA Extraction Method for Nuclear Short Tandem Repeat Testing of Skeletal Remains from Mass Graves

PubMed Central

Davoren, Jon; Vanek, Daniel; Konjhodzić, Rijad; Crews, John; Huffine, Edwin; Parsons, Thomas J.

2007-01-01

Aim To quantitatively compare a silica extraction method with a commonly used phenol/chloroform extraction method for DNA analysis of specimens exhumed from mass graves. Methods DNA was extracted from twenty randomly chosen femur samples, using the International Commission on Missing Persons (ICMP) silica method, based on Qiagen Blood Maxi Kit, and compared with the DNA extracted by the standard phenol/chloroform-based method. The efficacy of extraction methods was compared by real time polymerase chain reaction (PCR) to measure DNA quantity and the presence of inhibitors and by amplification with the PowerPlex 16 (PP16) multiplex nuclear short tandem repeat (STR) kit. Results DNA quantification results showed that the silica-based method extracted on average 1.94 ng of DNA per gram of bone (range 0.25-9.58 ng/g), compared with only 0.68 ng/g extracted by the organic method (range 0.0016-4.4880 ng/g). Inhibition tests showed that there were on average significantly lower levels of PCR inhibitors in DNA isolated by the organic method. When amplified with PP16, all samples extracted by the silica-based method produced full 16-locus profiles, while only 75% of the DNA extracts obtained by the organic technique yielded full 16-locus profiles. Conclusions The silica-based extraction method showed better results in nuclear STR typing from degraded bone samples than a commonly used phenol/chloroform method. PMID:17696302
Identification of multiple binding sites for the THAP domain of the Galileo transposase in the long terminal inverted-repeats.

PubMed

Marzo, Mar; Liu, Danxu; Ruiz, Alfredo; Chalmers, Ronald

2013-08-01

Galileo is a DNA transposon responsible for the generation of several chromosomal inversions in Drosophila. In contrast to other members of the P-element superfamily, it has unusually long terminal inverted-repeats (TIRs) that resemble those of Foldback elements. To investigate the function of the long TIRs we derived consensus and ancestral sequences for the Galileo transposase in three species of Drosophilids. Following gene synthesis, we expressed and purified their constituent THAP domains and tested their binding activity towards the respective Galileo TIRs. DNase I footprinting located the most proximal DNA binding site about 70 bp from the transposon end. Using this sequence we identified further binding sites in the tandem repeats that are found within the long TIRs. This suggests that the synaptic complex between Galileo ends may be a complicated structure containing higher-order multimers of the transposase. We also attempted to reconstitute Galileo transposition in Drosophila embryos but no events were detected. Thus, although the limited numbers of Galileo copies in each genome were sufficient to provide functional consensus sequences for the THAP domains, they do not specify a fully active transposase. Since the THAP recognition sequence is short, and will occur many times in a large genome, it seems likely that the multiple binding sites within the long, internally repetitive, TIRs of Galileo and other Foldback-like elements may provide the transposase with its binding specificity. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
An Ultra-High Discrimination Y Chromosome Short Tandem Repeat Multiplex DNA Typing System

PubMed Central

Hanson, Erin K.; Ballantyne, Jack

2007-01-01

In forensic casework, Y chromosome short tandem repeat markers (Y-STRs) are often used to identify a male donor DNA profile in the presence of excess quantities of female DNA, such as is found in many sexual assault investigations. Commercially available Y-STR multiplexes incorporating 12–17 loci are currently used in forensic casework (Promega's PowerPlex® Y and Applied Biosystems' AmpFlSTR® Yfiler®). Despite the robustness of these commercial multiplex Y-STR systems and the ability to discriminate two male individuals in most cases, the coincidence match probabilities between unrelated males are modest compared with the standard set of autosomal STR markers. Hence there is still a need to develop new multiplex systems to supplement these for those cases where additional discriminatory power is desired or where there is a coincidental Y-STR match between potential male participants. Over 400 Y-STR loci have been identified on the Y chromosome. While these have the potential to increase the discrimination potential afforded by the commercially available kits, many have not been well characterized. In the present work, 91 loci were tested for their relative ability to increase the discrimination potential of the commonly used ‘core’ Y-STR loci. The result of this extensive evaluation was the development of an ultra high discrimination (UHD) multiplex DNA typing system that allows for the robust co-amplification of 14 non-core Y-STR loci. Population studies with a mixed African American and American Caucasian sample set (n = 572) indicated that the overall discriminatory potential of the UHD multiplex was superior to all commercial kits tested. The combined use of the UHD multiplex and the Applied Biosystems' AmpFlSTR® Yfiler® kit resulted in 100% discrimination of all individuals within the sample set, which presages its potential to maximally augment currently available forensic casework markers. It could also find applications in human evolutionary
Multiple-locus variable-number tandem repeat analysis for molecular typing of Aspergillus fumigatus

PubMed Central

2010-01-01

Background Multiple-locus variable-number tandem repeat (VNTR) analysis (MLVA) is a prominent subtyping method to resolve closely related microbial isolates to provide information for establishing genetic patterns among isolates and to investigate disease outbreaks. The usefulness of MLVA was recently demonstrated for the avian major pathogen Chlamydophila psittaci. In the present study, we developed a similar method for another pathogen of birds: the filamentous fungus Aspergillus fumigatus. Results We selected 10 VNTR markers located on 4 different chromosomes (1, 5, 6 and 8) of A. fumigatus. These markers were tested with 57 unrelated isolates from different hosts or their environment (53 isolates from avian species in France, China or Morocco, 3 isolates from humans collected at CHU Henri Mondor hospital in France and the reference strain CBS 144.89). The Simpson index for individual markers ranged from 0.5771 to 0.8530. A combined loci index calculated with all the markers yielded an index of 0.9994. In a second step, the panel of 10 markers was used in different epidemiological situations and tested on 277 isolates, including 62 isolates from birds in Guangxi province in China, 95 isolates collected in two duck farms in France and 120 environmental isolates from a turkey hatchery in France. A database was created with the results of the present study http://minisatellites.u-psud.fr/MLVAnet/. Three major clusters of isolates were defined by using the graphing algorithm termed Minimum Spanning Tree (MST). The first cluster comprised most of the avian isolates collected in the two duck farms in France, the second cluster comprised most of the avian isolates collected in poultry farms in China and the third one comprised most of the isolates collected in the turkey hatchery in France. Conclusions MLVA displayed excellent discriminatory power. The method showed a good reproducibility. MST analysis revealed an interesting clustering with a clear separation between
Two DNA-binding factors recognize specific sequences at silencers, upstream activating sequences, autonomously replicating sequences, and telomeres in Saccharomyces cerevisiae

DOE Office of Scientific and Technical Information (OSTI.GOV)

Buchman, A.R.; Kimmerly, W.J.; Rine, J.

1988-01-01

Two DNA-binding factors from Saccharomyces cerevisiae have been characterized, GRFI (general regulatory factor I) and ABFI (ARS-binding factor I), that recognize specific sequences within diverse genetic elements. GRFI bound to sequences at the negative regulatory elements (silencers) of the silent mating type loci HML E and HMR E and to the upstream activating sequence (UAS) required for transcription of the MATα genes. A putative conserved UAS located at genes involved in translation (RPG box) was also recognized by GRFI. In addition, GRFI bound with high affinity to sequences within the (C1-3A)-repeat region at yeast telomeres. Binding sites for GRFI with the highest affinity appeared to be of the form 5'-(A/G)(A/C)ACCCANNCA(T/C)(T/C)-3', where N is any nucleotide. ABFI-binding sites were located next to autonomously replicating sequences (ARSs) at controlling elements of the silent mating type loci HMR E, HMR I, and HML I and were associated with ARS1, ARS2, and the 2μm plasmid ARS. Two tandem ABFI binding sites were found between the HIS3 and DED1 genes, several kilobase pairs from any ARS, indicating that ABFI-binding sites are not restricted to ARSs. The sequences recognized by ABFI showed partial dyad-symmetry and appeared to be variations of the consensus 5'-TATCATTNNNNACGA-3'. GRFI and ABFI were both abundant DNA-binding factors and did not appear to be encoded by the SIR genes, whose products are required for repression of the silent mating type loci. Together, these results indicate that both GRFI and ABFI play multiple roles within the cell.
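The two consensus sequences quoted in this abstract translate directly into regular expressions, which gives a simple way to scan a sequence for candidate sites. The sketch below is illustrative only: it reads the two N positions of the GRFI consensus as adjacent, searches the forward strand only, reports non-overlapping matches, and uses a made-up test sequence. A consensus match is not evidence of binding.

```python
import re

# Consensus motifs from the abstract, with N written as [ACGT]
# and (X/Y) alternatives written as character classes.
GRFI_SITE = re.compile(r"[AG][AC]ACCCA[ACGT]{2}CA[TC][TC]")
ABFI_SITE = re.compile(r"TATCATT[ACGT]{4}ACGA")


def find_sites(pattern: re.Pattern, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match
    on the forward strand of an upper-cased DNA sequence."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence built to contain one site of each type.
test_sequence = "GG" + "ACACCCATGCATC" + "TTT" + "TATCATTGGCCACGA" + "AA"

print("GRFI-like sites:", find_sites(GRFI_SITE, test_sequence))
print("ABFI-like sites:", find_sites(ABFI_SITE, test_sequence))
```

Scanning the reverse complement as well, or scoring partial matches, would be natural extensions, since the abstract describes the binding sites as variations of the consensus rather than exact copies.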
Molecular characterization of Shiga-toxigenic Escherichia coli isolated from diverse sources from India by multi-locus variable number tandem repeat analysis (MLVA).

PubMed

Kumar, A; Taneja, N; Sharma, R K; Sharma, H; Ramamurthy, T; Sharma, M

2014-12-01

In a first study from India, a diverse collection of 140 environmental and clinical non-O157 Shiga-toxigenic Escherichia coli strains from a large geographical area in north India was typed by multi-locus variable number tandem repeat analysis (MLVA). The distribution of major virulence genes stx1, stx2 and eae was found to be 78%, 70% and 10%, respectively; 15 isolates were enterohaemorrhagic E. coli (stx1+/stx2+ and eae+). By MLVA analysis, 44 different alleles were obtained. Dendrogram analysis revealed 104 different genotypes and 19 MLVA-type complexes divided into two main lineages, i.e. mutton and animal stool. Human isolates presented a statistically significant greater odds ratio for clustering with mutton samples compared to animal stool isolates. Five human isolates clustered with animal stool strains suggesting that some of the human infections may be from cattle, perhaps through milk, contact or the environment. Further epidemiological studies are required to explore these sources in context with occurrence of human cases.
A multiple-locus variable-number tandem repeat analysis (MLVA) of Listeria monocytogenes isolated from Norwegian salmon-processing factories and from listeriosis patients.

PubMed

Lunestad, B T; Truong, T T T; Lindstedt, B-A

2013-10-01

The objective of this study was to characterize Listeria monocytogenes isolated from farmed Atlantic salmon (Salmo salar) and the processing environment in three different Norwegian factories, and compare these to clinical isolates by multiple-locus variable-number tandem repeat analysis (MLVA). The 65 L. monocytogenes isolates obtained gave 15 distinct MLVA profiles. There was great heterogeneity in the distribution of MLVA profiles in factories and within each factory. Nine of the 15 MLVA profiles found in the fish-associated isolates were found to match human profiles. The MLVA profile 07-07-09-10-06 was the most common strain in Norwegian listeriosis patients. L. monocytogenes with this profile has previously been associated with at least two known listeriosis outbreaks in Norway, neither determined to be due to fish consumption. However, since this profile was also found in fish and in the processing environment, fish should be considered as a possible food vehicle during sporadic cases and outbreaks of listeriosis.
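An MLVA profile such as 07-07-09-10-06 is a list of repeat counts, one per locus, so comparing isolates from two sources reduces to comparing tuples of integers. The sketch below shows one way to find profiles shared between fish-associated and clinical isolates. Only the profile 07-07-09-10-06 appears in the abstract; the isolate names and the other profiles are hypothetical.

```python
def parse_profile(text: str) -> tuple[int, ...]:
    """Turn an MLVA profile string like '07-07-09-10-06' into repeat counts."""
    return tuple(int(part) for part in text.split("-"))


def shared_profiles(group_a: dict[str, str], group_b: dict[str, str]) -> set[tuple[int, ...]]:
    """Profiles that occur in at least one isolate of each group."""
    profiles_a = {parse_profile(p) for p in group_a.values()}
    profiles_b = {parse_profile(p) for p in group_b.values()}
    return profiles_a & profiles_b


# Hypothetical isolates keyed by name; values are MLVA profile strings.
fish_isolates = {
    "fish_01": "07-07-09-10-06",
    "fish_02": "06-07-14-10-06",
    "env_01": "07-07-09-10-06",
}
clinical_isolates = {
    "patient_01": "07-07-09-10-06",
    "patient_02": "05-08-15-10-06",
}

for profile in sorted(shared_profiles(fish_isolates, clinical_isolates)):
    print("shared profile:", "-".join(f"{count:02d}" for count in profile))
```

An exact match at all loci is the strictest comparison; as the abstract notes, a shared profile indicates a possible link rather than a confirmed source.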
The paradox of MHC-DRB exon/intron evolution: alpha-helix and beta-sheet encoding regions diverge while hypervariable intronic simple repeats coevolve with beta-sheet codons.

PubMed

Schwaiger, F W; Weyers, E; Epplen, C; Brün, J; Ruff, G; Crawford, A; Epplen, J T

1993-09-01

Twenty-one different caprine and 13 ovine MHC-DRB exon 2 sequences were determined including part of the adjacent introns containing simple repetitive (gt)n(ga)m elements. The positions for highly polymorphic DRB amino acids vary slightly among ungulates and other mammals. From man and mouse to ungulates the basic (gt)n(ga)m structure is fixed in evolution for 7 × 10⁷ years whereas ample variations exist in the tandem (gt)n and (ga)m dinucleotides and especially their "degenerated" derivatives. Phylogenetic trees for the alpha-helices and beta-pleated sheets of the ungulate DRB sequences suggest different evolutionary histories. In hoofed animals as well as in humans DRB beta-sheet encoding sequences and adjacent intronic repeats can be assembled into virtually identical groups suggesting coevolution of noncoding as well as coding DNA. In contrast alpha-helices and C-terminal parts of the first DRB domain evolve distinctly. In the absence of a defined mechanism causing specific, site-directed mutations, double-recombination or gene-conversion-like events would readily explain this fact. The role of the intronic simple (gt)n(ga)m repeat is discussed with respect to these genetic exchange mechanisms during evolution.
Accurate quantification of chromosomal lesions via short tandem repeat analysis using minimal amounts of DNA.

PubMed

Jann, Johann-Christoph; Nowak, Daniel; Nolte, Florian; Fey, Stephanie; Nowak, Verena; Obländer, Julia; Pressler, Jovita; Palme, Iris; Xanthopoulos, Christina; Fabarius, Alice; Platzbecker, Uwe; Giagounidis, Aristoteles; Götze, Katharina; Letsch, Anne; Haase, Detlef; Schlenk, Richard; Bug, Gesine; Lübbert, Michael; Ganser, Arnold; Germing, Ulrich; Haferlach, Claudia; Hofmann, Wolf-Karsten; Mossner, Maximilian

2017-09-01

Cytogenetic aberrations such as deletion of chromosome 5q (del(5q)) represent key elements in routine clinical diagnostics of haematological malignancies. Currently established methods such as metaphase cytogenetics, FISH or array-based approaches have limitations due to their dependency on viable cells, high costs or semi-quantitative nature. Importantly, they cannot be used on low abundance DNA. We therefore aimed to establish a robust and quantitative technique that overcomes these shortcomings. For precise determination of del(5q) cell fractions, we developed an inexpensive multiplex-PCR assay requiring only nanograms of DNA that simultaneously measures allelic imbalances of 12 independent short tandem repeat markers. Application of this method to n=1142 samples from n=260 individuals revealed strong intermarker concordance (R²=0.77-0.97) and reproducibility (mean SD: 1.7%). Notably, the assay showed accurate quantification via standard curve assessment (R²>0.99) and high concordance with paired FISH measurements (R²=0.92) even with subnanogram amounts of DNA. Moreover, cytogenetic response was reliably confirmed in del(5q) patients with myelodysplastic syndromes treated with lenalidomide. While the assay demonstrated good diagnostic accuracy in receiver operating characteristic analysis (area under the curve: 0.97), we further observed robust correlation between bone marrow and peripheral blood samples (R²=0.79), suggesting its potential suitability for less-invasive clonal monitoring. In conclusion, we present an adaptable tool for quantification of chromosomal aberrations, particularly in problematic samples, which should be easily applicable to further tumour entities.
Population genetic study of 10 short tandem repeat loci from 600 domestic dogs in Korea.

PubMed

Moon, Seo Hyun; Jang, Yoon-Jeong; Han, Myun Soo; Cho, Myung-Haing

2016-09-30

Dogs have long shared close relationships with many humans. Due to the large number of dogs in human populations, they are often involved in crimes. Occasionally, canine biological evidence such as saliva, bloodstains and hairs can be found at crime scenes. Accordingly, canine DNA can be used as forensic evidence. The use of short tandem repeat (STR) loci from biological evidence is valuable for forensic investigations. In Korea, canine STR profiling has been applied successfully to crime analysis, helping to resolve diverse cases such as animal cruelty, dog attacks, murder, robbery, and missing and abandoned dogs. However, the probability of random DNA profile matches cannot be analyzed because of a lack of canine STR data. Therefore, in this study, 10 STR loci were analyzed in 600 dogs in Korea (344 dogs belonging to 30 different purebreds and 256 crossbred dogs) to estimate canine forensic genetic parameters. Among purebred dogs, a separate statistical analysis was conducted for five major subgroups: 97 Maltese, 47 Poodles, 31 Shih Tzus, 32 Yorkshire Terriers, and 25 Pomeranians. Allele frequencies, expected (Hexp) and observed heterozygosity (Hobs), fixation index (F), probability of identity (P(ID)), probability of sibling identity (P(ID)sib) and probability of exclusion (PE) were then calculated. The Hexp values ranged from 0.901 (PEZ12) to 0.634 (FHC2079), while the P(ID)sib values were between 0.481 (FHC2079) and 0.304 (PEZ12) and the P(ID)sib was about 3.35 × 10⁻⁵ for the combination of all 10 loci. The results presented herein will strengthen the value of canine DNA in solving dog-related crimes.
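The per-locus statistics listed in this abstract are usually computed from allele frequencies with standard population-genetic formulas. The sketch below is a minimal Python illustration under that assumption; it is not taken from the study, and the function names and the example frequencies are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """Hexp = 1 - sum(p_i^2) for the allele frequencies p_i at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def probability_of_identity(freqs):
    """P(ID) for unrelated individuals: sum(p_i^4) + sum over i<j of (2*p_i*p_j)^2."""
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def probability_of_identity_sibs(freqs):
    """P(ID)sib = 0.25 + 0.5*S2 + 0.5*S2^2 - 0.25*S4, with S2 = sum(p^2), S4 = sum(p^4)."""
    s2 = sum(p ** 2 for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def combined(per_locus_values):
    """Multiply per-locus probabilities, assuming the loci are independent."""
    result = 1.0
    for value in per_locus_values:
        result *= value
    return result


# Hypothetical locus with two equally frequent alleles (illustration only).
example = [0.5, 0.5]
print(expected_heterozygosity(example))
print(probability_of_identity(example))
print(probability_of_identity_sibs(example))
```

Multiplying the per-locus P(ID)sib values with `combined` is how a multi-locus figure such as the one quoted for all 10 loci would typically be obtained, provided the loci can be treated as independent.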
Sequences characterization of microsatellite DNA sequences in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

The microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and the microsatellite DNA sequences were analyzed in Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats including a dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and a hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing the repeat motifs for microsatellite isolation in other abalone species.
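Screening sequenced clones for perfect repeats with a minimum of five copies can be illustrated with a regular expression. The sketch below is an assumption-laden example, not the authors' procedure: the motif length range of 2 to 6 bases, the function name and the toy sequence are all hypothetical, and imperfect or compound repeats are not handled.

```python
import re

# A motif of 2-6 bases followed by at least four more copies of itself,
# i.e. five or more perfect tandem repeats in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_perfect_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect repeat array found."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy clone insert (invented for illustration).
toy = "GGT" + "CA" * 12 + "TTG" + "GATA" * 5 + "C"
print(find_perfect_ssrs(toy))
```

The lazy quantifier makes the expression prefer the shortest motif at a given start position, so a (CA)12 array is reported as a dinucleotide repeat rather than as (CACA)6.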
Transcription factor IID in the Archaea: sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes

NASA Technical Reports Server (NTRS)

Marsh, T. L.; Reich, C. I.; Whitelock, R. B.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

1994-01-01

The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer) it appears that major features of the eukaryotic transcription apparatus were well-established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.
Application of Tandem Two-Dimensional Mass Spectrometry for Top-Down Deep Sequencing of Calmodulin.

PubMed

Floris, Federico; Chiron, Lionel; Lynch, Alice M; Barrow, Mark P; Delsuc, Marc-André; O'Connor, Peter B

2018-06-04

Two-dimensional mass spectrometry (2DMS) involves simultaneous acquisition of the fragmentation patterns of all the analytes in a mixture by correlating their precursor and fragment ions by modulating precursor ions systematically through a fragmentation zone. Tandem two-dimensional mass spectrometry (MS/2DMS) unites the ultra-high accuracy of Fourier transform ion cyclotron resonance (FT-ICR) MS/MS and the simultaneous data-independent fragmentation of 2DMS to achieve extensive inter-residue fragmentation of entire proteins. 2DMS was recently developed for top-down proteomics (TDP), and applied to the analysis of calmodulin (CaM), reporting a cleavage coverage of ~23% using infrared multiphoton dissociation (IRMPD) as fragmentation technique. The goal of this work is to expand the utility of top-down protein analysis using MS/2DMS in order to extend the cleavage coverage in top-down proteomics further into the interior regions of the protein. In this case, using MS/2DMS, the cleavage coverage of CaM increased from ~23% to ~42%. Graphical abstract: Two-dimensional mass spectrometry, when applied to primary fragment ions from the source, allows deep sequencing of the protein calmodulin.
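Cleavage coverage in top-down proteomics is commonly reported as the share of inter-residue bonds for which at least one fragment ion was observed. The following Python sketch assumes that definition; the function name, the bond numbering (bond k lies between residues k and k+1) and the example values are hypothetical and are not data from the study.

```python
def cleavage_coverage(sequence_length, cleaved_bonds):
    """Fraction of the N-1 inter-residue bonds with at least one observed cleavage.

    cleaved_bonds holds 1-based bond positions; duplicates and out-of-range
    positions are ignored.
    """
    total_bonds = sequence_length - 1
    if total_bonds < 1:
        raise ValueError("a protein needs at least two residues")
    observed = {b for b in cleaved_bonds if 1 <= b <= total_bonds}
    return len(observed) / total_bonds


# Toy 11-residue peptide with cleavages seen at bonds 1, 2, 5 and 10.
print(cleavage_coverage(11, [1, 2, 2, 5, 10]))
```

Counting each bond once, however many fragment ions support it, keeps the measure comparable between fragmentation methods that differ in ion redundancy.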
Discordant expression and variable numbers of neighboring GGA- and GAA-rich triplet repeats in the 3' untranslated regions of two groups of messenger RNAs encoded by the rat polymeric immunoglobulin receptor gene.

PubMed Central

Koch, K S; Gleiberman, A S; Aoki, T; Leffert, H L; Feren, A; Jones, A L; Fodor, E J

1995-01-01

An unusual S1-nuclease sensitive microsatellite (STMS) has been found in the single copy, rat polymeric immunoglobulin receptor gene (PIGR) terminal exon. In Fisher rats, elements within or beyond the STMS are expressed variably in the 3' untranslated regions (3'UTRs) of two 'Groups' of PIGR-encoded hepatic mRNAs (pIg-R) during liver regeneration. STMS elements include neighboring constant regions (a 60-bp d[GA]-rich tract with a chi-like octamer, followed by 15 tandem d[GGA] repeats) that merge directly with 36 or 39 tandem d[GAA] repeats (Fisher or Wistar strains, respectively) interrupted by d[AA] between their 5th-6th repeat units. The Wistar STMS is flanked upstream by two regions of nearly contiguous d[CA] or d[CT] repeats in the 3' end of intron 8; and downstream, by a 283 bp 'unit' containing several inversions at its 5' end, and two polyadenylation signals at its 3' end. The 283 nt unit is expressed in Group 1 pIg-R mRNAs; but it is absent in the Group 2 family so that their GAA repeats merge with their poly A tails. In contrast to genomic sequence, GGA triplet repeats are amplified (n ≥ 24-26), whereas GAA triplet repeats are truncated variably (n ≤ 9-37) and expressed uninterruptedly in both mRNA Groups. These results suggest that 3' end processing of the rat PIGR gene may involve misalignment, slippage and premature termination of RNA polymerase II. The function of this unusual processing and possible roles of chi-like octamers in quiescent or extrahepatic tissues are discussed. PMID:7739889
Molecular evolution of pentatricopeptide repeat genes reveals truncation in species lacking an editing target and structural domains under distinct selective pressures.

PubMed

Hayes, Michael L; Giang, Karolyn; Mulligan, R Michael

2012-05-14

Pentatricopeptide repeat (PPR) proteins are required for numerous RNA processing events in plant organelles including C-to-U editing, splicing, stabilization, and cleavage. Fifteen PPR proteins are known to be required for RNA editing at 21 sites in Arabidopsis chloroplasts, and belong to the PLS class of PPR proteins. In this study, we investigate the co-evolution of four PPR genes (CRR4, CRR21, CLB19, and OTP82) and their six editing targets in Brassicaceae species. PPR genes are composed of approximately 10 to 20 tandem repeats and each repeat has two α-helical regions, helix A and helix B, that are separated by short coil regions. Each repeat and structural feature was examined to determine the selective pressures on these regions. All of the PPR genes examined are under strong negative selection. Multiple independent losses of editing site targets are observed for both CRR21 and OTP82. In several species lacking the known editing target for CRR21, PPR genes are truncated near the 17th PPR repeat. The coding sequences of the truncated CRR21 genes are maintained under strong negative selection; however, the 3' UTR sequences beyond the truncation site have substantially diverged. Phylogenetic analyses of four PPR genes show that sequences corresponding to helix A are high compared to helix B sequences. Differential evolutionary selection of helix A versus helix B is observed in both plant and mammalian PPR genes. PPR genes and their cognate editing sites are mutually constrained in evolution. Editing sites are frequently lost by replacement of an edited C with a genomic T. After the loss of an editing site, the PPR genes are observed with three outcomes: first, few changes are detected in some cases; second, the PPR gene is present as a pseudogene; and third, the PPR gene is present but truncated in the C-terminal region. The retention of truncated forms of CRR21 that are maintained under strong negative selection even in the absence of an editing site target
Catalytic stereoselective synthesis of highly substituted indanones via tandem Nazarov cyclization and electrophilic fluorination trapping.

PubMed

Nie, Jing; Zhu, Hong-Wei; Cui, Han-Feng; Hua, Ming-Qing; Ma, Jun-An

2007-08-02

A new catalytic stereoselective tandem transformation via Nazarov cyclization/electrophilic fluorination has been accomplished. This sequence is efficiently catalyzed by a Cu(II) complex to afford fluorine-containing 1-indanone derivatives with two new stereocenters with high diastereoselectivity (trans/cis up to 49/1). Three examples of catalytic enantioselective tandem transformation are presented.

Phosphate Control of Oxytetracycline Production by Streptomyces rimosus Is at the Level of Transcription from Promoters Overlapped by Tandem Repeats Similar to Those of the DNA-Binding Sites of the OmpR Family

PubMed Central

McDowall, Kenneth J.; Thamchaipenet, Arinthip; Hunter, Iain S.

1999-01-01

Physiological studies have shown that Streptomyces rimosus produces the polyketide antibiotic oxytetracycline abundantly when its mycelial growth is limited by phosphate starvation. We show here that transcripts originating from the promoter for one of the biosynthetic genes, otcC (encoding anhydrotetracycline oxygenase), and from a promoter for the divergent otcX genes peak in abundance at the onset of antibiotic production induced by phosphate starvation, indicating that the synthesis of oxytetracycline is controlled, at least in part, at the level of transcription. Furthermore, analysis of the sequences of the promoters for otcC, otcX, and the polyketide synthase (otcY) genes revealed tandem repeats having significant similarity to the DNA-binding sites of ActII-Orf4 and DnrI, which are Streptomyces antibiotic regulatory proteins (SARPs) related to the OmpR family of transcription activators. Together, the above results suggest that oxytetracycline production by S. rimosus requires a SARP-like transcription factor that is either produced or activated or both under conditions of low phosphate concentrations. We also provide evidence consistent with the otrA resistance gene being cotranscribed with otcC as part of a polycistronic message, suggesting a simple mechanism of coordinate regulation which ensures that resistance to the antibiotic increases in proportion to production. PMID:10322002
Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

PubMed Central

Perumal, Ramasamy; Nimmakayala, Padmavathi; Erattaimuthu, Saradha R; No, Eun-Gyu; Reddy, Umesh K; Prom, Louis K; Odvody, Gary N; Luster, Douglas G; Magill, Clint W

2008-01-01

Background A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%) and GA/AG or CT/TC types contributed 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34 Peronosclerospora
Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1.

PubMed

Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis

2003-11-01

The functionality of direct and inverted repeat sequences inside the cis acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9 bp long direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
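The consensus quoted in this abstract, CCGCACA(C/G)(C/G), maps directly onto a regular expression, so locating such 9 bp direct repeats and checking their spacing can be sketched in a few lines of Python. The function names and the toy sequence below are hypothetical; the real pJV1 clt sequence is not reproduced here, and only the given strand is scanned.

```python
import re

# Consensus from the abstract: CCGCACA followed by two positions that are C or G.
CONSENSUS = re.compile(r"CCGCACA[CG][CG]")


def find_direct_repeats(seq):
    """Return (start, matched_sequence) for every consensus match on the given strand."""
    return [(m.start(), m.group()) for m in CONSENSUS.finditer(seq.upper())]


def spacings(hits):
    """Distances between consecutive match starts; equal values mean even spacing."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


# Invented toy sequence with three repeats separated by 6 bp spacers.
toy = "TTCCGCACACGAAATTTCCGCACAGCAAATTTCCGCACACCAA"
hits = find_direct_repeats(toy)
print(hits)
print(spacings(hits))
```

A fuller search would also scan the reverse complement and tolerate mismatches, which this sketch deliberately leaves out.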
A phylogenetic framework facilitates Y-STR variant discovery and classification via massively parallel sequencing.

PubMed

Huszar, Tunde I; Jobling, Mark A; Wetton, Jon H

2018-04-12

Short tandem repeats on the male-specific region of the Y chromosome (Y-STRs) are permanently linked as haplotypes, and therefore Y-STR sequence diversity can be considered within the robust framework of a phylogeny of haplogroups defined by single nucleotide polymorphisms (SNPs). Here we use massively parallel sequencing (MPS) to analyse the 23 Y-STRs in Promega's prototype PowerSeq™ Auto/Mito/Y System kit (containing the markers of the PowerPlex® Y23 [PPY23] System) in a set of 100 diverse Y chromosomes whose phylogenetic relationships are known from previous megabase-scale resequencing. Including allele duplications and alleles resulting from likely somatic mutation, we characterised 2311 alleles, demonstrating 99.83% concordance with capillary electrophoresis (CE) data on the same sample set. The set contains 267 distinct sequence-based alleles (an increase of 58% compared to the 169 detectable by CE), including 60 novel Y-STR variants phased with their flanking sequences which have not been reported previously to our knowledge. Variation includes 46 distinct alleles containing non-reference variants of SNPs/indels in both repeat and flanking regions, and 145 distinct alleles containing repeat pattern variants (RPV). For DYS385a,b, DYS481 and DYS390 we observed repeat count variation in short flanking segments previously considered invariable, and suggest new MPS-based structural designations based on these. We considered the observed variation in the context of the Y phylogeny: several specific haplogroup associations were observed for SNPs and indels, reflecting the low mutation rates of such variant types; however, RPVs showed less phylogenetic coherence and more recurrence, reflecting their relatively high mutation rates. In conclusion, our study reveals considerable additional diversity at the Y-STRs of the PPY23 set via MPS analysis, demonstrates high concordance with CE data, facilitates nomenclature standardisation, and places Y-STR sequence variants
The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng.

PubMed

Chen, Jinhui; Hao, Zhaodong; Xu, Haibin; Yang, Liming; Liu, Guangxin; Sheng, Yu; Zheng, Chen; Zheng, Weiwei; Cheng, Tielong; Shi, Jisen

2015-01-01

Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around 10 species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprised of 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats, and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be greatly helpful for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution.
The complete chloroplast genome sequence of the relict woody plant Metasequoia glyptostroboides Hu et Cheng

PubMed Central

Chen, Jinhui; Hao, Zhaodong; Xu, Haibin; Yang, Liming; Liu, Guangxin; Sheng, Yu; Zheng, Chen; Zheng, Weiwei; Cheng, Tielong; Shi, Jisen

2015-01-01

Metasequoia glyptostroboides Hu et Cheng is the only species in the genus Metasequoia Miki ex Hu et Cheng, which belongs to the Cupressaceae family. There were around 10 species in the Metasequoia genus, which were widely spread across the Northern Hemisphere during the Cretaceous of the Mesozoic and in the Cenozoic. M. glyptostroboides is the only remaining representative of this genus. Here, we report the complete chloroplast (cp) genome sequence and the cp genomic features of M. glyptostroboides. The M. glyptostroboides cp genome is 131,887 bp in length, with a total of 117 genes comprised of 82 protein-coding genes, 31 tRNA genes and four rRNA genes. In this genome, 11 forward repeats, nine palindromic repeats, and 15 tandem repeats were detected. A total of 188 perfect microsatellites were detected through simple sequence repeat (SSR) analysis and these were distributed unevenly within the cp genome. Comparison of the cp genome structure and gene order to those of several other land plants indicated that a copy of the inverted repeat (IR) region, which was found to be IR region A (IRA), was lost in the M. glyptostroboides cp genome. The five most divergent and five most conserved genes were determined and further phylogenetic analysis was performed among plant species, especially for related species in conifers. Finally, phylogenetic analysis demonstrated that M. glyptostroboides is a sister species to Cryptomeria japonica (L. F.) D. Don and to Taiwania cryptomerioides Hayata. The complete cp genome sequence information of M. glyptostroboides will be greatly helpful for further investigations of this endemic relict woody plant and for in-depth understanding of the evolutionary history of the coniferous cp genomes, especially for the position of M. glyptostroboides in plant systematics and evolution. PMID:26136762
PopAffiliator: online calculator for individual affiliation to a major population group based on 17 autosomal short tandem repeat genotype profile.

PubMed

Pereira, Luísa; Alshamali, Farida; Andreassen, Rune; Ballard, Ruth; Chantratita, Wasun; Cho, Nam Soo; Coudray, Clotilde; Dugoujon, Jean-Michel; Espinoza, Marta; González-Andrade, Fabricio; Hadi, Sibte; Immel, Uta-Dorothee; Marian, Catalin; Gonzalez-Martin, Antonio; Mertens, Gerhard; Parson, Walther; Perone, Carlos; Prieto, Lourdes; Takeshita, Haruo; Rangel Villalobos, Héctor; Zeng, Zhaoshu; Zhivotovsky, Lev; Camacho, Rui; Fonseca, Nuno A

2011-09-01

Because of their sensitivity and high level of discrimination, short tandem repeat (STR) marker systems are currently the method of choice in routine forensic casework and data banking, usually in multiplexes of up to 15-17 loci. Constraints related to sample amount and quality, frequently encountered in forensic casework, mean that this picture is unlikely to change in the near future, notwithstanding technological developments. In this study, we present a free online calculator named PopAffiliator (http://cracs.fc.up.pt/popaffiliator) for individual population affiliation in the three main population groups, Eurasian, East Asian and sub-Saharan African, based on genotype profiles for the common set of STRs used in forensics. This calculator performs affiliation based on a model constructed using machine learning techniques. The model was constructed using a data set of approximately fifteen thousand individuals collected for this work. The accuracy of individual population affiliation is approximately 86%, showing that the common set of STRs routinely used in forensics provides a considerable amount of information for population assignment, in addition to being excellent for individual identification.
[Discriminatory power of variable number on tandem repeats loci for genotyping Mycobacterium tuberculosis strains in China].

PubMed

Chen, H X; Cai, C; Liu, J Y; Zhang, Z G; Yuan, M; Jia, J N; Sun, Z G; Huang, H R; Gao, J M; Li, W M

2017-06-10

Objective: Using the standard genotyping method, variable number of tandem repeats (VNTR) analysis, we constructed a VNTR database covering all provinces and proposed a set of optimized VNTR loci combinations for each province, in order to improve tuberculosis prevention and control programs in China. Methods: A total of 15 VNTR loci were used to analyze 4 116 Mycobacterium tuberculosis strains isolated during the national survey of drug-resistant tuberculosis in 2007. The Hunter-Gaston Index (HGI) was used to assess the discriminatory power of each VNTR locus. Combinations of 12, 10, 8 and 5 VNTR loci were constructed for each province, based on the epidemic characteristics of M. tuberculosis lineages in China, high discriminatory power and genetic stability. Results: From the complete 15-locus VNTR patterns of 3 966 strains (96.36%, 3 966/4 116), we found seven loci with high HGI (including QUB11b and MIRU26) as well as loci with low stability (including QUB26, MIRU16, Mtub21 and QUB11b) in several areas. Across the 31 provinces, the optimized combination was a 10-VNTR set in Inner Mongolia, Chongqing and Heilongjiang, while an 8-VNTR combination was shared by the other provinces. Conclusions: The VNTR database should be used to trace sources of infection and clusters of M. tuberculosis nationally, and the optimized VNTR combinations should be used to monitor local epidemics and the genetics of local M. tuberculosis populations.
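The Hunter-Gaston Index used here to rank loci has a simple closed form: HGI = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of strains and n_j is the number of strains sharing the j-th type. A minimal Python sketch follows; the function name and the toy allele calls are hypothetical and do not come from the study's database.

```python
from collections import Counter


def hunter_gaston_index(types):
    """Probability that two strains drawn without replacement have different types."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two strains")
    counts = Counter(types).values()
    return 1.0 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Toy repeat counts at a single VNTR locus for four strains.
print(hunter_gaston_index([2, 2, 3, 4]))
```

To score a combination of loci rather than a single locus, each strain's type can be passed as a tuple of its repeat counts across the chosen loci.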
Identification of a highly sulfated fucoidan from sea cucumber Pearsonothuria graeffei with well-repeated tetrasaccharides units.

PubMed

Hu, Yaqin; Li, Shan; Li, Junhui; Ye, Xingqian; Ding, Tian; Liu, Donghong; Chen, Jianchu; Ge, Zhiwei; Chen, Shiguo

2015-12-10

Sea cucumber fucoidan is a major bioactive component of sea cucumber. The structures of fucoidans have significant influences on their biological activities. The present study clarified the fine structure of a fucoidan from Pearsonothuria graeffei. Fucoidan was obtained after papain digestion and purified by ion chromatography. The carbohydrate sequence of fucoidan was first determined by negative-ion electrospray tandem mass spectrometry (ES-MS) with collision-induced dissociation of the oligosaccharide fragments, which were obtained by mild acid hydrolysis, and completed by NMR for assignment of the anomeric conformation. It was unambiguously identified as a tetrasaccharide repeating unit with a backbone of [→3Fuc(2S,4S)α1→3Fucα1→3Fuc(4S)α1→3Fucα1→]n. The glycosidic bonds between the non-sulfated and 2,4-O-disulfated fucose residues were selectively cleaved, and highly ordered oligosaccharide fragments with a tetrasaccharide repeating unit were obtained. The highly 4-O- and 2,4-di-O-sulfated polysaccharide deserves further development for pharmaceutical use.
FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.

PubMed

Hoogenboom, Jerry; van der Gaag, Kristiaan J; de Leeuw, Rick H; Sijen, Titia; de Knijff, Peter; Laros, Jeroen F J

2017-03-01

Massively parallel sequencing (MPS) is on the verge of broad-scale application in forensic research and casework. The improved capability to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high-throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega Powerseq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
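As a rough illustration of the "stutter ratio" mentioned in the abstract above, the sketch below divides the reads observed one repeat unit below a known allele by the reads of that allele. It is a simplified, length-based toy and does not reproduce the FDSTools algorithm or API, which works on sequence variants; the function name and read counts are hypothetical.

```python
def stutter_ratios(read_counts, true_alleles):
    """Return {allele: reads at (allele - 1) / reads at allele}.

    `read_counts` maps repeat number -> read count at one STR locus;
    `true_alleles` lists the repeat numbers known to be genuine alleles.
    Only the n-1 stutter position is considered in this sketch.
    """
    ratios = {}
    for allele in true_alleles:
        parent_reads = read_counts.get(allele, 0)
        if parent_reads == 0:
            continue  # no parent reads, ratio undefined
        stutter_reads = read_counts.get(allele - 1, 0)
        ratios[allele] = stutter_reads / parent_reads
    return ratios


# Hypothetical read counts for a heterozygous 12/15 sample
counts = {11: 120, 12: 1000, 14: 60, 15: 800}
print(stutter_ratios(counts, [12, 15]))
```

In a mixture, a minor contributor's allele can sit at the same position as a major contributor's stutter product, which is why the abstract stresses predicting and subtracting stutter before interpretation.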
Expressed sequence tags from the plant trypanosomatid Phytomonas serpens.

PubMed

Pappas, Georgios J; Benabdellah, Karim; Zingales, Bianca; González, Antonio

2005-08-01

We have generated 2190 expressed sequence tags (ESTs) from a cDNA library of the plant trypanosomatid Phytomonas serpens. Upon processing and clustering, the set of 1893 accepted sequences was reduced to 697 clusters consisting of 452 singletons and 245 contigs. Functional categories were assigned based on BLAST searches against a database of the eukaryotic orthologous groups of proteins (KOG). Thirty-six percent of the generated sequences showed no hits against the KOG database and 39.6% presented similarity to the KOG classes corresponding to translation, ribosomal structure and biogenesis. The most populated cluster contained 45 ESTs homologous to members of the glucose transporter family. This fact can be immediately correlated to the reported Phytomonas dependence on anaerobic glycolytic ATP production due to the lack of a cytochrome-mediated respiratory chain. In this context, a number of enzymes were identified not only from the glycolytic pathway but also from the Krebs cycle, as well as specific components of the respiratory chain. The data reported here, including a few hundred unique sequences and the description of tandemly repeated motifs and putative transcript stability motifs at untranslated mRNA ends, represent an initial approach to overcome the lack of information on the molecular biology of this organism.
The complete sequence of the mitochondrial genome of Arctic fox (Alopex lagopus).

PubMed

Yan, Shou-Qing; Guo, Peng-Cheng; Yue, Yuan; Li, Wan-Hong; Bai, Chun-Yan; Li, Yu-Mei; Sun, Jin-Hai; Zhao, Zhi-Hui

2016-11-01

In the present study, the complete mitochondrial genome sequence of Arctic fox (Alopex lagopus) was determined for the first time. It has a total length of 16,656 bp, and contains 13 protein-coding genes, 22 tRNA genes, 2 ribosomal RNA genes and 1 control region. The nucleotide composition is 31.3% A, 26.2% C, 14.8% G and 27.7% T. The D-loop region, located between tRNA-Pro and tRNA-Phe, contains an (ACACGTACACGCAT)18 tandem repeat array. The data will be useful for the investigation of the genetic structure and diversity in the natural and farmed populations of Arctic foxes.
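A tandem repeat array such as the one described in the abstract above can be located with a regular expression that matches consecutive copies of a known motif. This is a minimal sketch on a made-up toy sequence, not the real mitochondrial genome; the function name and flanking bases are hypothetical.

```python
import re


def find_tandem_arrays(sequence, motif, min_copies=2):
    """Return (start, copy_number) for each run of >= min_copies
    back-to-back exact copies of `motif` in `sequence`."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_copies))
    return [
        (m.start(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(sequence)
    ]


motif = "ACACGTACACGCAT"                 # 14-bp unit from the abstract
toy = "GGTT" + motif * 18 + "CCAA"       # hypothetical flanking bases
print(find_tandem_arrays(toy, motif))    # [(4, 18)]
```

Exact matching is the simplest case; dedicated tools such as Tandem Repeats Finder also tolerate mismatches and indels between copies, which this sketch does not attempt.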
Variant Alleles, Triallelic Patterns, and Point Mutations Observed in Nuclear Short Tandem Repeat Typing of Populations in Bosnia and Serbia

PubMed Central

Huel, René L. M.; Bašić, Lara; Madacki-Todorović, Kamelija; Smajlović, Lejla; Eminović, Izet; Berbić, Irfan; Miloš, Ana; Parsons, Thomas J.

2007-01-01

Aim To present a compendium of off-ladder alleles and other genotyping irregularities relating to rare/unexpected population genetic variation, observed in a large short tandem repeat (STR) database from Bosnia and Serbia. Methods DNA was extracted from blood stain cards relating to reference samples from a population of 32 800 individuals from Bosnia and Serbia, and typed using Promega’s PowerPlex®16 STR kit. Results Thirty-one distinct off-ladder alleles were observed in 10 of the 15 STR loci amplified with the PowerPlex®16 STR kit. Of these 31 alleles, 3 have not been previously reported. Furthermore, 16 instances of triallelic patterns were observed in 9 of the 15 loci. Primer binding site mismatches that affected amplification were observed in two loci, D5S818 and D8S1179. Conclusion Instances of deviations from the manufacturer’s allelic ladders should be expected, and caution should be taken to designate the correct alleles in large DNA databases. Particular care should be taken in kinship matching or paternity cases, as incorrect designation of any of these deviations from allelic ladders could lead to false exclusions. PMID:17696304
Multiple-Locus Variable-Number Tandem-Repeats Analysis of Escherichia coli O157 using PCR multiplexing and multi-colored capillary electrophoresis.

PubMed

Lindstedt, Bjørn-Arne; Vardund, Traute; Kapperud, Georg

2004-08-01

The Multiple-Locus Variable-Number Tandem-Repeats Analysis (MLVA) method is currently being used as the primary typing tool for Shiga-toxin-producing Escherichia coli (STEC) O157 isolates in our laboratory. The initial assay was performed using a single fluorescent dye and the different patterns were assigned using a gel image. Here, we present a significantly improved assay using multiple dye colors and enhanced PCR multiplexing to increase speed, and ease the interpretation of the results. The different MLVA patterns are now based on allele sizes entered as character values, thus removing the uncertainties introduced when analyzing band patterns from the gel image. We additionally propose an easy numbering scheme for the identification of separate isolates that will facilitate exchange of typing data. Seventy-two human and animal strains of Shiga-toxin-producing E. coli O157 were used for the development of the improved MLVA assay. The method is based on capillary separation of multiplexed PCR products of VNTR loci in the E. coli O157 genome labeled with multiple fluorescent dyes. The different alleles at each locus were then assigned to allele numbers, which were used for strain comparison.
Application of Short Tandem Repeat markers in diagnosis of chromosomal aneuploidies and forensic DNA investigation in Pakistan.

PubMed

Chishti, Hafsah Muhammad; Ansar, Muhammad; Ajmal, Muhammad; Hameed, Abdul

2014-09-15

Short Tandem Repeat (STR) genetic markers hold great potential in forensic investigations, molecular diagnostics and molecular genetics research. The AmpFlSTR® Identifiler™ PCR amplification kit is a multiplex system for co-amplification of 15 STR markers used worldwide in forensic investigations. This study attempts to assess the forensic validity of these STRs in the Pakistani population and to investigate their applicability in quick and simultaneous diagnosis of common chromosomal aneuploidies and tracing of their parental source. Samples from 554 healthy Pakistani individuals from 5 different ethnicities were analyzed for forensic parameters using Identifiler STRs, and samples from 74 patients with different aneuploidies were evaluated for the diagnostic strength of these markers. All STRs hold sufficient forensic applicability in the Pakistani population, with paternity index between 1.5 and 3.5, polymorphic information content from 0.63 to 0.87 and discrimination power ≥0.9 (except the TPOX locus). Deviation from Hardy-Weinberg equilibrium was observed at some loci, reflecting the trend of selective breeding and intermarriage in Pakistan. Among aneuploid samples, all trisomies were precisely detectable, while aneuploidies involving sex chromosomes or missing chromosomes were not clearly detectable using Identifiler STRs. Parental origin of aneuploidy was traceable in 92.54% of patients. The studied STR markers are valuable tools for forensic application in Pakistan and are usable for quick and simultaneous identification of some common trisomic conditions. Adding more sex chromosome specific STR markers can immensely increase the diagnostic and forensic potential of this system. Copyright © 2014 Elsevier B.V. All rights reserved.
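The polymorphic information content (PIC) reported in the abstract above is usually computed from allele frequencies with the formula of Botstein et al.: PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The sketch below assumes that standard formula applies here (the abstract does not state which software or formula was used); the function name and frequencies are hypothetical.

```python
from itertools import combinations


def polymorphic_information_content(freqs):
    """PIC of one locus from its allele frequencies (must sum to 1)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over every unordered pair of distinct alleles
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical allele frequencies at one STR locus
print(polymorphic_information_content([0.5, 0.5]))            # 0.375
print(round(polymorphic_information_content([0.4, 0.3, 0.2, 0.1]), 3))
```

Loci with many alleles at similar frequencies give PIC values approaching 1, which is why multi-allelic STRs are favoured over biallelic markers for identification.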
Simple Sequence Repeats Provide a Substrate for Phenotypic Variation in the Neurospora crassa Circadian Clock

PubMed Central

Michael, Todd P.; Park, Sohyun; Kim, Tae-Sung; Booth, Jim; Byer, Amanda; Sun, Qi; Chory, Joanne; Lee, Kwangwon

2007-01-01

Background WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. Methodology/Principal Findings Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. Conclusions/Significance Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments. PMID:17726525
A direct repeat of E-box-like elements is required for cell-autonomous circadian rhythm of clock genes

PubMed Central

Nakahata, Yasukazu; Yoshida, Mayumi; Takano, Atsuko; Soma, Haruhiko; Yamamoto, Takuro; Yasuda, Akio; Nakatsu, Toru; Takumi, Toru

2008-01-01

Background The circadian expression of the mammalian clock genes is based on transcriptional feedback loops. Two basic helix-loop-helix (bHLH) PAS (for Period-Arnt-Sim) domain-containing transcriptional activators, CLOCK and BMAL1, are known to regulate gene expression by interacting with a promoter element termed the E-box (CACGTG). Non-canonical E-boxes or E-box-like sequences have also been reported to be necessary for circadian oscillation. Results We report a new cis-element required for cell-autonomous circadian transcription of clock genes. This new element consists of a canonical or non-canonical E-box and an E-box-like sequence arranged in tandem, with a short interval of 6 base pairs between them. We demonstrate that both the E-box and the E-box-like sequence are needed to generate cell-autonomous oscillation. We also verify that a spacer of constant length between these 2 E-elements is crucial for robust oscillation. Furthermore, by in silico analysis we conclude that several clock and clock-controlled genes possess a direct repeat of the E-box-like elements in their promoter region. Conclusion We propose a novel possible mechanism regulated by double E-box-like elements, rather than by a single E-box, for circadian transcriptional oscillation. The direct repeat of the E-box-like elements identified in this study is the minimal required element for the generation of cell-autonomous transcriptional oscillation of clock and clock-controlled genes. PMID:18177499
Genetic mapping of 15 human X chromosomal forensic short tandem repeat (STR) loci by means of multi-core parallelization.

PubMed

Diegoli, Toni Marie; Rohde, Heinrich; Borowski, Stefan; Krawczak, Michael; Coble, Michael D; Nothnagel, Michael

2016-11-01

Typing of X chromosomal short tandem repeat (X STR) markers has become a standard element of human forensic genetic analysis. Joint consideration of many X STR markers at a time increases their discriminatory power but, owing to physical linkage, requires inter-marker recombination rates to be accurately known. We estimated the recombination rates between 15 well established X STR markers using genotype data from 158 families (1041 individuals) and following a previously proposed likelihood-based approach that allows for single-step mutations. To meet the computational requirements of this family-based type of analysis, we modified a previous implementation so as to allow multi-core parallelization on a high-performance computing system. While we obtained recombination rate estimates larger than zero for all but one pair of adjacent markers within the four previously proposed linkage groups, none of the three X STR pairs defining the junctions of these groups yielded a recombination rate estimate of 0.50. Corroborating previous studies, our results therefore argue against a simple model of independent X chromosomal linkage groups. Moreover, the refined recombination fraction estimates obtained in our study will facilitate the appropriate joint consideration of all 15 investigated markers in forensic analysis. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Towards Development of Clustering Applications for Large-Scale Comparative Genotyping and Kinship Analysis Using Y-Short Tandem Repeats.

PubMed

Seman, Ali; Sapawi, Azizian Mohd; Salleh, Mohd Zaki

2015-06-01

Y-chromosome short tandem repeats (Y-STRs) are genetic markers with practical applications in human identification. However, where mass identification is required (e.g., in the aftermath of disasters with significant fatalities), the efficiency of the process could be improved with new statistical approaches. Clustering applications are relatively new tools for large-scale comparative genotyping, and the k-Approximate Modal Haplotype (k-AMH), an efficient algorithm for clustering large-scale Y-STR data, represents a promising method for developing these tools. In this study we improved the k-AMH and produced three new algorithms: the Nk-AMH I (including a new initial cluster center selection), the Nk-AMH II (including a new dominant weighting value), and the Nk-AMH III (combining I and II). The Nk-AMH III was the superior algorithm, with mean clustering accuracy that increased in four out of six datasets and remained at 100% in the other two. Additionally, the Nk-AMH III achieved a 2% higher overall mean clustering accuracy score than the k-AMH, as well as optimal accuracy for all datasets (0.84-1.00). With inclusion of the two new methods, the Nk-AMH III produced an optimal solution for clustering Y-STR data; thus, the algorithm has potential for further development towards fully automatic clustering of any large-scale genotypic data.
Investigation of Salmonella Enteritidis outbreaks in South Africa using multi-locus variable-number tandem-repeats analysis, 2013-2015.

PubMed

Muvhali, Munyadziwa; Smith, Anthony Marius; Rakgantso, Andronica Moipone; Keddy, Karen Helena

2017-10-02

Salmonella enterica serovar Enteritidis (Salmonella Enteritidis) has become a significant pathogen in South Africa, and the need for improved molecular surveillance of this pathogen has become important. Over the years, multi-locus variable-number tandem-repeats analysis (MLVA) has become a valuable molecular subtyping technique for Salmonella, particularly for highly homogenic serotypes such as Salmonella Enteritidis. This study describes the use of MLVA in the molecular epidemiological investigation of outbreak isolates in South Africa. Between the years 2013 and 2015, the Centre for Enteric Diseases (CED) received 39 Salmonella Enteritidis isolates from seven foodborne illness outbreaks, which occurred in six provinces. MLVA was performed on all isolates. Three MLVA profiles (MLVA profiles 21, 22 and 28) were identified among the 39 isolates. MLVA profile 28 accounted for 77% (30/39) of the isolates. Isolates from a single outbreak were grouped into a single MLVA profile. A minimum spanning tree (MST) created from the MLVA data showed a close relationship between MLVA profiles 21, 22 and 28, with a single VNTR locus difference between them. MLVA has proven to be a reliable method for the molecular epidemiological investigation of Salmonella Enteritidis outbreaks in South Africa. These foodborne outbreaks emphasize the importance of the One Health approach as an essential component for combating the spread of zoonotic pathogens such as Salmonella Enteritidis.
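The "single VNTR locus difference" between MLVA profiles described in the abstract above corresponds to counting the loci at which two profiles carry different allele numbers. The sketch below illustrates that count; the profile contents are invented for illustration and are not the actual allele strings of MLVA profiles 21, 22 and 28.

```python
from itertools import combinations


def locus_differences(profile_a, profile_b):
    """Number of loci at which two equal-length MLVA profiles differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical five-locus profiles (allele numbers are made up)
profiles = {
    "A": (3, 10, 7, 3, 2),
    "B": (3, 11, 7, 3, 2),
    "C": (3, 10, 7, 4, 2),
}

for (name_a, prof_a), (name_b, prof_b) in combinations(profiles.items(), 2):
    print(name_a, name_b, locus_differences(prof_a, prof_b))
```

A minimum spanning tree is then typically built on these pairwise distances, so that profiles differing at a single locus end up directly connected.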
