intersimple sequence repeat: Topics by Science.gov

Sample records for intersimple sequence repeat

Plant genotyping using fluorescently tagged inter-simple sequence repeats (ISSRs): basic principles and methodology.

PubMed

Prince, Linda M

2015-01-01

Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.
Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Haider, Nadia

2017-01-01

Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.
Molecular Analysis of Date Palm Genetic Diversity Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeats (ISSRs).

PubMed

El Sharabasy, Sherif F; Soliman, Khaled A

2017-01-01

The date palm is an ancient domesticated plant with great diversity and has been cultivated in the Middle East and North Africa for at least 5000 years. Date palm cultivars are classified based on the fruit moisture content, as dry, semidry, and soft dates. There are a number of biochemical and molecular techniques available for characterization of the date palm variation. This chapter focuses on the DNA-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats (ISSR) techniques, in addition to biochemical markers based on isozyme analysis. These techniques, coupled with appropriate statistical tools, proved useful for determining phylogenetic relationships among date palm cultivars and provide information resources for date palm gene banks.
Genetic Diversity of Pinus nigra Arn. Populations in Southern Spain and Northern Morocco Revealed By Inter-Simple Sequence Repeat Profiles

PubMed Central

Rubio-Moraga, Angela; Candel-Perez, David; Lucas-Borja, Manuel E.; Tiscar, Pedro A.; Viñegla, Benjamin; Linares, Juan C.; Gómez-Gómez, Lourdes; Ahrazem, Oussama

2012-01-01

Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA from Sierra Mágina. Analysis of molecular variance (AMOVA) and Nei’s genetic diversity analyses revealed higher genetic variation within populations than among them. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei’s genetic diversity, followed by the Moroccan region, Sierra Mágina, and the Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups (Group 1 contained all populations from Cuenca, while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco), whereas Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. The data indicate that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra. PMID:22754321
Genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) evaluated using ISSR markers.

PubMed

Vidal, Á M; Vieira, L J; Ferreira, C F; Souza, F V D; Souza, A S; Ledo, C A S

2015-07-14

Molecular markers are efficient for assessing the genetic fidelity of various species of plants after in vitro culture. In this study, we evaluated the genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) using inter-simple sequence repeat markers. Twenty-two cassava accessions from the Embrapa Cassava & Fruits Germplasm Bank were used. For each accession, DNA was extracted from a plant maintained in the field and from 3 plants grown in vitro. For DNA amplification, 27 inter-simple sequence repeat primers were used, of which 24 generated 175 bands; 100 of those bands were polymorphic and were used to study genetic variability among accessions of cassava plants maintained in the field. Based on the genetic distance matrix calculated using the arithmetic complement of the Jaccard's index, genotypes were clustered using the unweighted pair group method using arithmetic averages. The number of bands per primer was 2-13, with an average of 7.3. For most micropropagated accessions, the fidelity study showed no genetic variation between plants of the same accessions maintained in the field and those maintained in vitro, confirming the high genetic fidelity of the micropropagated plants. However, genetic variability was observed among different accessions grown in the field, and clustering based on the dissimilarity matrix revealed 7 groups. Inter-simple sequence repeat markers were efficient for detecting the genetic homogeneity of cassava plants derived from meristem culture, demonstrating the reliability of this propagation system.
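The dissimilarity measure named in the abstract above, the arithmetic complement of Jaccard's index, can be sketched as follows. This is a minimal illustration only: the function names and the 0/1 band profiles are hypothetical and are not data or code from the study.

```python
def jaccard_dissimilarity(a, b):
    """Return 1 - Jaccard index for two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # no bands in either profile; treated as identical here
    return 1.0 - shared / either


def distance_matrix(profiles):
    """Pairwise dissimilarities keyed by (name, name)."""
    names = list(profiles)
    return {
        (m, n): jaccard_dissimilarity(profiles[m], profiles[n])
        for m in names
        for n in names
    }


# Hypothetical band scores (1 = band present, 0 = band absent).
profiles = {
    "field": [1, 1, 0, 1, 0, 1],
    "in_vitro_1": [1, 1, 0, 1, 0, 1],
    "other_accession": [1, 0, 1, 1, 0, 0],
}
matrix = distance_matrix(profiles)
```

A dissimilarity of zero between a field plant and its in vitro counterpart corresponds to the "no genetic variation" outcome described for most accessions; shared absences are ignored, which is the usual reason for choosing Jaccard with dominant markers.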
Genotyping and Molecular Identification of Date Palm Cultivars Using Inter-Simple Sequence Repeat (ISSR) Markers.

PubMed

Ayesh, Basim M

2017-01-01

Molecular markers are credible for the discrimination of genotypes and estimation of the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal highly polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA of reliable quality and quantity from the target cultivars. The extracted DNA then serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by counting the monomorphic and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and perform cluster analysis, to determine the molecular relatedness of cultivars. In the protocol described, 3 of 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, 28 of which were polymorphic.
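The step from a distance matrix to a cluster tree can be sketched with UPGMA (average linkage). The abstract above does not name its clustering algorithm, so UPGMA is an assumption here, chosen because other records in this listing use it; the cultivar labels and distances are hypothetical.

```python
from itertools import combinations


def average_distance(c1, c2, dist):
    """Mean of all pairwise distances between members of two clusters."""
    total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def upgma(names, dist):
    """Return the merge history as (cluster, cluster, merge distance) tuples."""
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1], dist),
        )
        merges.append((c1, c2, average_distance(c1, c2, dist)))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
    return merges


# Hypothetical pairwise genetic distances among four cultivars.
raw = {
    ("A", "B"): 0.2, ("A", "C"): 0.6, ("A", "D"): 0.7,
    ("B", "C"): 0.5, ("B", "D"): 0.8, ("C", "D"): 0.3,
}
dist = {frozenset(pair): value for pair, value in raw.items()}
merges = upgma(["A", "B", "C", "D"], dist)
```

Recomputing averages from the original pairwise distances at every step keeps the sketch short and easy to verify; for larger data sets a library routine would be the more practical choice.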
Application of Inter-Simple Sequence Repeat Markers in the Analysis of Populations of the Chagas Disease Vector Triatoma infestans (Hemiptera, Reduviidae)

PubMed Central

Pérez de Rosas, Alicia R.; Restelli, María F.; Fernández, Cintia J.; Blariza, María J.; García, Beatriz A.

2017-01-01

Here we apply inter-simple sequence repeat (ISSR) markers to explore the fine-scale genetic structure and dispersal in populations of Triatoma infestans. Five selected primers from 30 primers were used to amplify ISSRs by polymerase chain reaction. A total of 90 polymorphic bands were detected across 134 individuals captured from 11 peridomestic sites from the locality of San Martín (Capayán Department, Catamarca Province, Argentina). Significant levels of genetic differentiation suggest limited gene flow among sampling sites. Spatial autocorrelation analysis confirms that dispersal occurs on the scale of ∼469 m, suggesting that insecticide spraying should be extended at least within a radius of ∼500 m around the infested area. Moreover, Bayesian clustering algorithms indicated genetic exchange among different sites analyzed, supporting the hypothesis of an important role of peridomestic structures in the process of reinfestation. PMID:28115670
Genetic Variation and Population Differentiation in a Medical Herb Houttuynia cordata in China Revealed by Inter-Simple Sequence Repeats (ISSRs)

PubMed Central

Wei, Lin; Wu, Xian-Jin

2012-01-01

Houttuynia cordata is an important traditional Chinese herb with unresolved genetics and taxonomy, which leads to potential problems in the conservation and utilization of the resource. Inter-simple sequence repeat (ISSR) markers were used to assess the level and distribution of genetic diversity in 226 individuals from 15 populations of H. cordata in China. ISSR analysis revealed low genetic variation within populations but high genetic differentiation among populations. This genetic structure probably mainly reflects the historical association among populations. Genetic cluster analysis showed that the basal clade is composed of populations from Southwest China, and the other populations have continuous and eastward distributions. The structure of genetic diversity in H. cordata suggested that this species might have survived in Southwest China during the glacial age, and subsequently experienced an eastward postglacial expansion. Based on the results of genetic analysis, it was proposed that as many populations as possible be targeted for conservation. PMID:22942696
Genetic Variation and Geographic Differentiation Among Populations of the Nonmigratory Agricultural Pest Oedaleus infernalis (Orthoptera: Acridoidea) in China

PubMed Central

Sun, Wei; Dong, Hui; Gao, Yue-Bo; Su, Qian-Fu; Qian, Hai-Tao; Bai, Hong-Yan; Zhang, Zhu-Ting; Cong, Bin

2015-01-01

The nonmigratory grasshopper Oedaleus infernalis Saussure (Orthoptera: Acridoidea) is an agricultural pest of crops and forage grasses over a wide natural geographical distribution in China. The genetic diversity and genetic variation among 10 geographically separated populations of O. infernalis were assessed using polymerase chain reaction-based molecular markers, including the intersimple sequence repeat and mitochondrial cytochrome oxidase sequences. A high level of genetic diversity was detected among these populations from the intersimple sequence repeat (H: 0.2628, I: 0.4129, Hs: 0.2130) and cytochrome oxidase analyses (Hd: 0.653). There was no obvious geographical structure based on an unweighted pair group method analysis and median-joining network. The values of FST, θII, and Gst estimated in this study are low, and the gene flow is high (Nm > 4). Analysis of the molecular variance suggested that most of the genetic variation occurs within populations, whereas only a small variation takes place between populations. No significant correlation was found between the genetic distance and geographical distance. Overall, our results suggest that geographical distance does not impede gene flow among O. infernalis populations. PMID:26496789
Evaluation of fire recurrence effect on genetic diversity in maritime pine (Pinus pinaster Ait.) stands using Inter-Simple Sequence Repeat profiles.

PubMed

Lucas-Borja, M E; Ahrazem, O; Candel-Pérez, D; Moya, D; Fonseca, T; Hernández Tecles, E; De Las Heras, J; Gómez-Gómez, L

2016-12-01

The management of maritime pine in fire-prone habitats is a challenging task and fine-scale population genetic analyses are necessary to check if different fire recurrences affect genetic variability. The objective of this study was to assess the effect of fire recurrence on maritime pine genetic diversity using inter-simple sequence repeat (ISSR) markers. Three maritime pine (Pinus pinaster Ait.) populations from Northern Portugal were chosen to characterize the genetic variability among populations. In relation to fire recurrence, the Seirós population was affected by fire in both 1990 and 2005, whereas the Vila Seca-2 population was affected by fire only in 2005. The Vila Seca-1 population has never been affected by fire. Our results showed the highest Nei's genetic diversity (He=0.320), Shannon information index (I=0.474) and percentage of polymorphic loci (PPL=87.79%) among samples from the twice-burned population (Seirós site). Thus, fire regime plays an important role in affecting genetic diversity in the short term, although without generating genetic erosion in maritime pine. Copyright © 2016 Elsevier B.V. All rights reserved.
Inter-Simple Sequence Repeat Data Reveals High Genetic Diversity in Wild Populations of the Narrowly Distributed Endemic Lilium regale in the Minjiang River Valley of China

PubMed Central

Wu, Zhu-hua; Shi, Jisen; Xi, Meng-li; Jiang, Fu-xing; Deng, Ming-wen; Dayanandan, Selvadurai

2015-01-01

Lilium regale E.H. Wilson is endemic to a narrow geographic area in the Minjiang River valley in southwestern China, and is considered an important germplasm for breeding commercially valuable lily varieties, due to its vigorous growth, resistance to diseases and tolerance for low moisture. We analyzed the genetic diversity of eight populations of L. regale sampled across the entire natural distribution range of the species using Inter-Simple Sequence Repeat markers. The genetic diversity (expected heterozygosity = 0.3356) was higher than that reported for other narrowly distributed endemic plants. The levels of inbreeding (Fst = 0.1897) were low, and most of the genetic variability was found within (80.91%) rather than among populations (19.09%). An indirect estimate of historical levels of gene flow (Nm = 1.0678) indicated high levels of gene flow among populations. The eight analyzed populations clustered into three genetically distinct groups. Based on these results, we recommend conservation of large populations representing these three genetically distinct groups. PMID:25799495
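The indirect gene-flow estimate in the abstract above can be illustrated with a short worked calculation. The abstract does not state which formula was used; the sketch below assumes Wright's island-model relation Nm = (1 - Fst) / (4 Fst), which gives a value close to the one reported. The function name is hypothetical.

```python
def gene_flow_nm(fst):
    """Indirect gene-flow estimate, assuming Nm = (1 - Fst) / (4 * Fst)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


# Fst value as reported in the abstract.
nm = gene_flow_nm(0.1897)
print(round(nm, 3))  # about 1.068, close to the reported Nm of 1.0678
```

Some software reports Nm = 0.5 (1 - Gst) / Gst instead, which is twice this value for the same differentiation statistic, so Nm figures from different records in this listing are only comparable when the estimator is known.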
Genetic diversity of an Azorean endemic and endangered plant species inferred from inter-simple sequence repeat markers.

PubMed

Lopes, Maria S; Mendonça, Duarte; Bettencourt, Sílvia X; Borba, Ana R; Melo, Catarina; Baptista, Cláudio; da Câmara Machado, Artur

2014-06-26

Knowledge of the levels and distribution of genetic diversity is important for designing conservation strategies for threatened and endangered species so as to guarantee sustainable survival of populations and to preserve their evolutionary potential. Picconia azorica is a valuable Azorean endemic species recently classified as endangered. To contribute information useful for the establishment of conservation programmes, the genetic variability and differentiation among 230 samples from 11 populations collected on three Azorean islands were assessed with eight inter-simple sequence repeat markers. A total of 64 polymorphic loci were detected. The majority of genetic variability was found within populations and no genetic structure was detected between populations or between islands. In addition, the coefficient of genetic differentiation and the level of gene flow indicate that geographical distances do not act as barriers to gene flow. To ensure the survival of populations, in situ and ex situ management practices should be considered, including artificial propagation through the use of plant tissue culture techniques, not only for the restoration of habitat but also for the sustainable use of its valuable wood. Published by Oxford University Press on behalf of the Annals of Botany Company.
Evaluation of genetic diversity amongst Descurainia sophia L. genotypes by inter-simple sequence repeat (ISSR) marker.

PubMed

Saki, Sahar; Bagheri, Hedayat; Deljou, Ali; Zeinalabedini, Mehrshad

2016-01-01

Descurainia sophia is a valuable medicinal plant in the family Brassicaceae. To determine the range of diversity amongst D. sophia in Iran, 32 naturally distributed plants belonging to six natural populations of the Iranian plateau were investigated by inter-simple sequence repeat (ISSR) markers. The average percentage of polymorphism produced by 12 ISSR primers was 86%. The PIC values for primers ranged from 0.22 to 0.40 and Rp values ranged between 6.5 and 19.9. The relative genetic diversity of the populations was not high (Gst = 0.32). However, the value of gene flow revealed by the ISSR marker was high (Nm = 1.03). The UPGMA clustering method based on the Jaccard similarity coefficient grouped the genotypes into two major clusters. A Neighbor-Net network generated after a 1000-replicate bootstrap test using the Jaccard coefficient, together with STRUCTURE analysis, confirmed the UPGMA clustering. The first three principal components accounted for 57.31% of the total variation. High levels of genetic diversity were observed within populations, which is useful in breeding and conservation programs. ISSR was found to be a suitable marker for studying the genetic diversity of D. sophia.
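The per-primer PIC and Rp statistics quoted above can be sketched as follows. The abstract does not give its formulas; this sketch assumes the definitions commonly used for dominant markers, PIC = 2f(1 - f) per band and resolving power Rp as the sum of band informativeness Ib = 1 - 2|0.5 - p|, where f (or p) is the fraction of genotypes showing the band. All names and the band matrix are hypothetical.

```python
def pic_dominant(freq_present):
    """PIC for one dominant (present/absent) band: 2f(1 - f)."""
    return 2.0 * freq_present * (1.0 - freq_present)


def band_informativeness(freq_present):
    """Ib = 1 - 2 * |0.5 - p| for one band."""
    return 1.0 - 2.0 * abs(0.5 - freq_present)


def primer_summary(band_matrix):
    """Rows are genotypes, columns are bands from one primer (0/1 scores)."""
    n_genotypes = len(band_matrix)
    freqs = [sum(column) / n_genotypes for column in zip(*band_matrix)]
    mean_pic = sum(pic_dominant(f) for f in freqs) / len(freqs)
    rp = sum(band_informativeness(f) for f in freqs)
    return mean_pic, rp


# Hypothetical scores for four genotypes and three bands from one primer.
bands = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
]
mean_pic, rp = primer_summary(bands)
```

Under these definitions PIC cannot exceed 0.5 per band, while Rp grows with the number of bands a primer produces, which is consistent with the very different ranges reported for the two statistics.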
Use of inter-simple sequence repeats and amplified fragment length polymorphisms to analyze genetic relationships among small grain-infecting species of Ustilago.

PubMed

Menzies, J G; Bakkeren, G; Matheson, F; Procunier, J D; Woods, S

2003-02-01

In the smut fungi, few features are available for use as taxonomic criteria (spore size, shape, morphology, germination type, and host range). DNA-based molecular techniques are useful in expanding the traits considered in determining relationships among these fungi. We examined the phylogenetic relationships among seven species of Ustilago (U. avenae, U. bullata, U. hordei, U. kolleri, U. nigra, U. nuda, and U. tritici) using inter-simple sequence repeats (ISSRs) and amplified fragment length polymorphisms (AFLPs) to compare their DNA profiles. Fifty-four isolates of different Ustilago spp. were analyzed using ISSR primers, and 16 isolates of Ustilago were studied using AFLP primers. The variability among isolates within species was low for all species except U. bullata. The isolates of U. bullata, U. nuda, and U. tritici were well separated and our data support their speciation. U. avenae and U. kolleri isolates did not separate from each other and there was little variability between these species. U. hordei and U. nigra isolates also showed little variability between species, but the isolates from each species grouped together. Our data suggest that U. avenae and U. kolleri are monophyletic and should be considered one species, as should U. hordei and U. nigra.
Analysis of genetic relationships and identification of lily cultivars based on inter-simple sequence repeat markers.

PubMed

Cui, G F; Wu, L F; Wang, X N; Jia, W J; Duan, Q; Ma, L L; Jiang, Y L; Wang, J H

2014-07-29

Inter-simple sequence repeat (ISSR) markers were used to discriminate 62 lily cultivars of 5 hybrid series. Eight ISSR primers generated 104 bands in total, which all showed 100% polymorphism, and an average of 13 bands were amplified by each primer. Two software packages, POPGENE 1.32 and NTSYSpc 2.1, were used to analyze the data matrix. Our results showed that the observed number of alleles (NA), effective number of alleles (NE), Nei's genetic diversity (H), and Shannon's information index (I) were 1.9630, 1.4179, 0.2606, and 0.4080, respectively. The highest genetic similarity (0.9601) was observed between the Oriental x Trumpet and Oriental lilies, which indicated that the two hybrids had a close genetic relationship. An unweighted pair-group method with arithmetic means dendrogram showed that the 62 lily cultivars clustered into two discrete groups. The first group included the Oriental and OT cultivars, while the Asiatic, LA, and Longiflorum lilies were placed in the second cluster. The distribution of individuals in the principal component analysis was consistent with the clustering of the dendrogram. Fingerprints of all lily cultivars built from 8 primers could be separated completely. This study confirmed the effect and efficiency of ISSR identification in lily cultivars.
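Nei's genetic diversity (H) and Shannon's information index (I), as reported above, can be sketched for dominant markers. The abstract does not describe how the software derived these values; the sketch assumes Hardy-Weinberg equilibrium, estimates the null-allele frequency q as the square root of the fraction of individuals lacking a band, and then uses H = 1 - p² - q² and I = -(p ln p + q ln q). Function names and counts are hypothetical.

```python
import math


def locus_stats(n_absent, n_total):
    """Return (H, I) for one dominant locus under the stated assumptions."""
    q = math.sqrt(n_absent / n_total)  # null-allele frequency, assuming HWE
    p = 1.0 - q
    h = 1.0 - p * p - q * q
    # Terms with zero frequency contribute nothing to the Shannon index.
    i = -sum(x * math.log(x) for x in (p, q) if x > 0.0)
    return h, i


def mean_stats(absent_counts, n_total):
    """Average H and I over loci, given band-absent counts per locus."""
    stats = [locus_stats(n, n_total) for n in absent_counts]
    mean_h = sum(h for h, _ in stats) / len(stats)
    mean_i = sum(i for _, i in stats) / len(stats)
    return mean_h, mean_i


# Hypothetical data: 100 individuals, two loci (one polymorphic, one fixed).
h, i = locus_stats(25, 100)
mean_h, mean_i = mean_stats([25, 0], 100)
```

With two alleles per locus, H is bounded by 0.5 and I by ln 2 (about 0.693), which gives a scale for reading the reported values.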
Genetic variation of Sargassum horneri populations detected by inter-simple sequence repeats.

PubMed

Ren, J R; Yang, R; He, Y Y; Sun, Q H

2015-01-30

The seaweed Sargassum horneri is an important brown alga in the marine environment, and it is an important raw material in the alginate industry. Unfortunately, the fixed resource that was originally reported has now declined or disappeared, and increased floating populations have been reported in recent years. We sampled a floating population and 4 fixed cultivated populations of S. horneri along the coast of Zhejiang, China. Inter-simple sequence repeat (ISSR) markers were applied in this research to analyze the genetic variation between floating populations and fixed cultivated populations of S. horneri. In total, 220 loci were amplified with 23 ISSR primers. The percentage of polymorphic loci within each population ranged from 53.64 to 95.45%. The highest diversity was observed in population 3, which was the local species that was suspension cultured in the lab and then fixed cultivated in the Nanji Islands before sampling. The lowest diversity was obtained in the floating population 4. The genetic distances among the 5 S. horneri populations ranged from 0.0819 to 0.2889, and the trend in distances was consistent with the genetic diversity results. The results suggest that the floating population had the lowest genetic diversity and could not be joined into the cluster branch of the fixed cultivated populations.
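The percentage of polymorphic loci (PPL) quoted above is a simple ratio, sketched below. The per-population counts of 118 and 210 polymorphic loci are not given in the abstract; they are back-calculated here as values consistent with the reported 53.64% and 95.45% of 220 loci, and the small band matrix is hypothetical.

```python
def percent_polymorphic(band_matrix):
    """PPL from 0/1 scores: rows are individuals, columns are loci."""
    columns = list(zip(*band_matrix))
    # A locus counts as polymorphic if the band is present in some
    # individuals but not in all of them.
    polymorphic = sum(1 for col in columns if 0 < sum(col) < len(col))
    return 100.0 * polymorphic / len(columns)


# Hypothetical scores for three individuals at four loci.
scores = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
]
ppl = percent_polymorphic(scores)

# Assumed counts consistent with the reported range over 220 loci.
lowest = round(100.0 * 118 / 220, 2)
highest = round(100.0 * 210 / 220, 2)
```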
Molecular diversity analysis of Tetradium ruticarpum (WuZhuYu) in China based on inter-primer binding site (iPBS) markers and inter-simple sequence repeat (ISSR) markers.

PubMed

Xu, Jing-Yuan; Zhu, Yan; Yi, Ze; Wu, Gang; Xie, Guo-Yong; Qin, Min-Jian

2018-01-01

"Wu zhu yu", which is obtained from the dried unripe fruits of Tetradium ruticarpum (A. Jussieu) T. G. Hartley, has been used as a traditional Chinese medicine for treatment of headaches, abdominal colic, and hypertension for thousands of years. The present study was designed to assess the molecular genetic diversity among 25 collected accessions of T. ruticarpum (Wu zhu yu in Chinese) from different areas of China, based on inter-primer binding site (iPBS) markers and inter-simple sequence repeat (ISSR) markers. Thirteen ISSR primers generated 151 amplification bands, of which 130 were polymorphic. Out of 165 bands that were amplified using 10 iPBS primers, 152 were polymorphic. The iPBS markers displayed a higher proportion of polymorphic loci (PPL = 92.5%) than the ISSR markers (PPL = 84.9%). The results showed that T. ruticarpum possessed high loci polymorphism and genetic differentiation occurred in this plant. The combined data of iPBS and ISSR markers scored on 25 accessions produced five clusters that approximately matched the geographic distribution of the species. The results indicated that both iPBS and ISSR markers were reliable and effective tools for analyzing the genetic diversity in T. ruticarpum. Copyright © 2018 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
DNA methylation polymorphism in a set of elite rice cultivars and its possible contribution to inter-cultivar differential gene expression.

PubMed

Wang, Yongming; Lin, Xiuyun; Dong, Bo; Wang, Yingdian; Liu, Bao

2004-01-01

RAPD (randomly amplified polymorphic DNA) and ISSR (inter-simple sequence repeat) fingerprinting on HpaII/MspI-digested genomic DNA of nine elite japonica rice cultivars implies inter-cultivar DNA methylation polymorphism. Using both DNA fragments isolated from RAPD or ISSR gels and selected low-copy sequences as probes, methylation-sensitive Southern blot analysis confirms the existence of extensive DNA methylation polymorphism in both genes and DNA repeats among the rice cultivars. The cultivar-specific methylation patterns are stably maintained, and can be used as reliable molecular markers. Transcriptional analysis of four selected sequences (RdRP, AC9, HSP90 and MMR) on leaves and roots from normal and 5-azacytidine-treated seedlings of three representative cultivars shows an association between the transcriptional activity of one of the genes, the mismatch repair (MMR) gene, and its CG methylation patterns.

Genetic diversity of the Andean tuber-bearing species, oca (Oxalis tuberosa Mol.), investigated by inter-simple sequence repeats.

PubMed

Pissard, A; Ghislain, M; Bertin, P

2006-01-01

The Andean tuber-bearing species, Oxalis tuberosa Mol., is a vegetatively propagated crop cultivated in the uplands of the Andes. Its genetic diversity was investigated in the present study using the inter-simple sequence repeat (ISSR) technique. Thirty-two accessions originating from South America (Argentina, Bolivia, Chile, and Peru) and maintained in vitro were chosen to represent the ecogeographic diversity of its cultivation area. Twenty-two primers were tested and 9 were selected according to fingerprinting quality and reproducibility. Genetic diversity analysis was performed with 90 markers. Jaccard's genetic distance between accessions ranged from 0 to 0.49 with an average of 0.28 ± 0.08 (mean ± SD). A UPGMA (unweighted pair-group method with arithmetic averaging) dendrogram and factorial correspondence analysis (FCA) showed that the genetic structure was influenced by the collection site. The two most distant clusters contained all of the Peruvian accessions, one from Bolivia, and none from Argentina or Chile. Analysis by country revealed that Peru presented the greatest genetic distances from the other countries and possessed the highest intra-country genetic distance (0.30 ± 0.08). This suggests that the Peruvian oca accessions form a distinct genetic group. The relatively low level of genetic diversity in the oca species may be related to its predominant reproduction strategy, i.e., vegetative propagation. The extent and structure of the genetic diversity of the species detailed here should help the establishment of conservation strategies.
Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

PubMed

Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

2016-01-01

Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of southern Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) are of hybrid origin and are genetically distant from nucellar Mexican lime (9 genotypes). As nucellar Mexican limes are susceptible to phytoplasma, these acid lime genotypes can be evaluated for their tolerance to biotic constraints such as lime "witches' broom disease".
Molecular Identification of Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) Markers.

PubMed

Al-Khalifah, Nasser S; Shanavaskhan, A E

2017-01-01

Ambiguity in the total number of date palm cultivars across the world is pointing toward the necessity for an enumerative study using standard morphological and molecular markers. Among molecular markers, DNA markers are more suitable and ubiquitous to most applications. They are highly polymorphic in nature, frequently occurring in genomes, easy to access, and highly reproducible. Various molecular markers such as restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), simple sequence repeats (SSR), inter-simple sequence repeats (ISSR), and random amplified polymorphic DNA (RAPD) markers have been successfully used as efficient tools for analysis of genetic variation in date palm. This chapter explains a stepwise protocol for extracting total genomic DNA from date palm leaves. A user-friendly protocol for RAPD analysis and a table showing the primers used in different molecular techniques that produce polymorphisms in date palm are also provided.
Use of molecular markers to compare Fusarium verticillioides pathogenic strains isolated from plants and humans.

PubMed

Chang, S C; Macêdo, D P C; Souza-Motta, C M; Oliveira, N T

2013-08-12

Fusarium verticillioides is a pathogen of agriculturally important crops, especially maize. It is considered one of the most important pathogens responsible for fumonisin contamination of food products, which causes severe, chronic, and acute intoxication in humans and animals. Moreover, it is recognized as a cause of localized infections in immunocompetent patients and disseminated infections among severely immunosuppressed patients. Several molecular tools have been used to analyze the intraspecific variability of fungi. The objective of this study was to use molecular markers to compare pathogenic isolates of F. verticillioides and isolates of the same species obtained from clinical samples of patients with Fusarium mycoses. The molecular markers that we used were inter-simple sequence repeat markers (primers GTG5 and GACA4), intron splice site primer (primer EI1), random amplified polymorphic DNA marker (primer OPW-6), and restriction fragment length polymorphism-internal transcribed spacer (ITS) from rDNA. From the data obtained, clusters were generated based on the UPGMA clustering method. The amplification products obtained using primers ITS4 and ITS5 and loci ITS1-5.8-ITS2 of the rDNA yielded fragments of approximately 600 bp for all the isolates. Digestion of the ITS region fragment using restriction enzymes such as EcoRI, DraI, BshI, AluI, HaeIII, HinfI, MspI, and PstI did not permit differentiation among pathogenic and clinical isolates. The inter-simple sequence repeat, intron splice site primer, and random amplified polymorphic DNA markers presented high genetic homogeneity among clinical isolates in contrast to the high variability found among the phytopathogenic isolates of F. verticillioides.
MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

PubMed Central

Suyama, Yoshihisa; Matsuki, Yu

2015-01-01

Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239
Identification of apple cultivars on the basis of simple sequence repeat markers.

PubMed

Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y

2014-09-12

DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers have not been used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers to separate plant individuals. The CID was designed so that a polymorphic marker generated from each PCR directly allows cultivar samples to be separated at each step. Therefore, it can be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon), were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributes to studies on germplasm resources and the seedling industry in fruit trees.
A genetic linkage map of grape, utilizing Vitis rupestris and Vitis arizonica.

PubMed

Doucleff, M; Jin, Y; Gao, F; Riaz, S; Krivanek, A F; Walker, M A

2004-10-01

A genetic linkage map of grape was constructed, utilizing 116 progeny derived from a cross of two Vitis rupestris x V. arizonica interspecific hybrids, using the pseudo-testcross strategy. A total of 475 DNA markers (410 amplified fragment length polymorphism, 24 inter-simple sequence repeat, 32 random amplified polymorphic DNA, and nine simple sequence repeat markers) were used to construct the parental maps. Markers segregating 1:1 were used to construct parental framework maps with confidence levels >90% with the Plant Genome Research Initiative mapping program. In the maternal (D8909-15) map, 105 framework markers and 55 accessory markers were ordered in 17 linkage groups (756 cM). The paternal (F8909-17) map had 111 framework markers and 33 accessory markers ordered in 19 linkage groups (1,082 cM). One hundred eighty-one markers segregating 3:1 were used to connect the two parental maps. This moderately dense map will be useful for the initial mapping of genes and/or QTL for resistance to the dagger nematode, Xiphinema index, and Xylella fastidiosa, the bacterial causal agent of Pierce's disease.
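The marker tally and the 1:1 segregation criterion mentioned in this abstract can be illustrated with a short Python sketch. It is not the mapping software used in the study: the function names, the example progeny counts, and the 5% significance threshold (chi-square critical value 3.841 for one degree of freedom) are assumptions chosen for illustration.

```python
# Marker counts as reported in the abstract.
marker_counts = {"AFLP": 410, "ISSR": 24, "RAPD": 32, "SSR": 9}
total_markers = sum(marker_counts.values())  # matches the reported total of 475

# Assumed threshold: chi-square critical value for 1 df at alpha = 0.05.
CHI2_CRIT_DF1 = 3.841


def chi_square_1to1(present: int, absent: int) -> float:
    """Goodness-of-fit statistic for a 1:1 band-present/band-absent ratio."""
    n = present + absent
    if n == 0:
        raise ValueError("no progeny scored for this marker")
    expected = n / 2
    return ((present - expected) ** 2 + (absent - expected) ** 2) / expected


def fits_1to1(present: int, absent: int) -> bool:
    """True if the observed split does not deviate significantly from 1:1."""
    return chi_square_1to1(present, absent) <= CHI2_CRIT_DF1


# Hypothetical scores for two markers in 116 progeny.
print(total_markers, fits_1to1(60, 56), fits_1to1(70, 46))
```

A marker that fails such a test would normally be set aside or flagged as distorted before map construction; the abstract does not say how distorted markers were handled.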
Microsatellite loci for the stingless bee Melipona rufiventris (Hymenoptera: Apidae).

PubMed

Lopes, Denilce Meneses; D Silva, Filipe Oliveira; Fernandes Salomão, Tânia Maria; Campos, Lúcio Antônio D Oliveira; Tavares, Mara Garcia

2009-05-01

Eight microsatellite primers were developed from ISSR (intersimple sequence repeats) markers for the stingless bee Melipona rufiventris. These primers were tested in 20 M. rufiventris workers, representing a single population from Minas Gerais state. The number of alleles per locus ranged from 2 to 5 (mean = 2.63) and the observed and expected heterozygosity values ranged from 0.00 to 0.44 (mean = 0.20) and from 0.05 to 0.68 (mean = 0.31), respectively. Several loci were also polymorphic in M. quadrifasciata, M. bicolor, M. mandacaia and Partamona helleri and should prove useful in population studies of other stingless bees. © 2009 The Authors. Journal compilation © 2009 Blackwell Publishing Ltd.
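Observed and expected heterozygosity, as reported per locus above, can be computed from genotype data along the following lines. This is a minimal sketch: the abstract does not state which estimator was used, so the plain (uncorrected) expected heterozygosity, the function names, and the example genotypes are all assumptions.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at a locus."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)


def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies (no small-sample correction)."""
    alleles = Counter(allele for pair in genotypes for allele in pair)
    total = sum(alleles.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return 1.0 - sum((count / total) ** 2 for count in alleles.values())


# Hypothetical diploid genotypes (allele sizes in bp) for four workers.
example = [(152, 152), (152, 156), (156, 156), (152, 156)]
print(observed_heterozygosity(example), expected_heterozygosity(example))
```

An observed value well below the expected one, as in the reported means (0.20 versus 0.31), is commonly read as a heterozygote deficit, although the abstract itself draws no such conclusion.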
Genetic linkage map and QTL identification for adventitious rooting traits in red gum eucalypts.

PubMed

Sumathi, Murugan; Bachpai, Vijaya Kumar Waman; Mayavel, A; Dasgupta, Modhumita Ghosh; Nagarajan, Binai; Rajasugunasekar, D; Sivakumar, Veerasamy; Yasodha, Ramasamy

2018-05-01

The eucalypt species Eucalyptus tereticornis and Eucalyptus camaldulensis show tolerance to drought and salinity conditions, respectively, and are widely cultivated in arid and semiarid regions of tropical countries. In this study, a genetic linkage map was developed for the interspecific cross E. tereticornis × E. camaldulensis using the pseudo-testcross strategy with simple sequence repeat (SSR), intersimple sequence repeat (ISSR), and sequence-related amplified polymorphism (SRAP) markers. The consensus genetic map comprised a total of 283 markers (84 SSRs, 94 ISSRs, and 105 SRAP markers) on 11 linkage groups spanning a genetic distance of 1163.4 cM. Blasting the SSR sequences against E. grandis sequences allowed an alignment of 64%, and the average ratio of genetic-to-physical distance was 1.7 Mbp/cM, which strengthens the evidence that a high degree of synteny and colinearity exists among eucalypt genomes. Blast searches also revealed that 37% of SSRs had homologies with genes, which could potentially be used in a variety of downstream applications, including candidate gene polymorphism. Quantitative trait loci (QTL) analysis for adventitious rooting traits revealed six QTL for rooting percent and root length on five chromosomes with interval and composite interval mapping. All the QTL explained 12.0-14.7% of the phenotypic variance, showing the involvement of major-effect QTL in adventitious rooting traits. Increasing the density of markers would facilitate the detection of a larger number of small-effect QTL and help identify the genes involved in the rooting process.
Molecular characterizations of somatic hybrids developed between Pleurotus florida and Lentinus squarrosulus through inter-simple sequence repeat markers and sequencing of ribosomal RNA-ITS gene.

PubMed

Mallick, Pijush; Chattaraj, Shruti; Sikdar, Samir Ranjan

2017-10-01

The 12 pfls somatic hybrids and 2 parents, Pleurotus florida and Lentinus squarrosulus, were characterized by ISSR and sequencing of rRNA-ITS genes. Five ISSR primers amplified a total of 54 reproducible fragments with 98.14% polymorphism among all the pfls hybrid populations and parental strains. UPGMA-based clustering produced a dendrogram with three major groups among the parents and pfls hybrids. Parents P. florida and L. squarrosulus showed different degrees of genetic distance from all the hybrid lines and were closest to hybrids pfls 1m and pfls 1h, respectively. ITS1(F) and ITS4(R) amplified the rRNA-ITS gene with a sequence length of 611-867 bp. Nucleotide polymorphisms were found in the ITS1, ITS2 and 5.8S rRNA regions with different numbers of bases. Based on the rRNA-ITS sequences, the UPGMA cluster exhibited three distinct groups between L. squarrosulus and pfls 1p, pfls 1m and pfls 1s, and pfls 1e and P. florida.
Relative profile analysis of molecular markers for identification and genetic discrimination of loaches (Pisces, Nemacheilidae).

PubMed

Patil, Tejas Suresh; Tamboli, Asif Shabodin; Patil, Swapnil Mahadeo; Bhosale, Amrut Ravindra; Govindwar, Sanjay Prabhu; Muley, Dipak Vishwanathrao

2016-01-01

Genus Nemacheilus, Nemachilichthys and Schistura belong to the family Nemacheilidae of the order Cypriniformes. The present investigation was undertaken to observe genetic diversity, phylogenetic relationship and to develop a molecular-based tool for taxonomic identification. For this purpose, four different types of molecular markers were utilized in which 29 random amplified polymorphic DNA (RAPD), 25 inter-simple sequence repeat (ISSR) markers, and 10 amplified fragment length polymorphism (AFLP) marker sets were screened and mitochondrial COI gene was sequenced. This study added COI barcodes for the identification of Nemacheilus anguilla, Nemachilichthys rueppelli and Schistura denisoni. RAPD showed higher polymorphism (100%) than the ISSR (93.75-100%) and AFLP (93.86-98.96%). The polymorphic information content (PIC), heterozygosity, multiplex ratio, and gene diversity was observed highest for AFLP primers, whereas the major allele frequency was observed higher for RAPD (0.5556) and lowest for AFLP (0.1667). The COI region of all individuals was successfully amplified and sequenced, which gave a 100% species resolution. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.
Characterization of the Genetic Diversity of Acid Lime (Citrus aurantifolia (Christm.) Swingle) Cultivars of Eastern Nepal Using Inter-Simple Sequence Repeat Markers.

PubMed

Munankarmi, Nabin Narayan; Rana, Neesha; Bhattarai, Tribikram; Shrestha, Ram Lal; Joshi, Bal Krishna; Baral, Bikash; Shrestha, Sangita

2018-06-12

Acid lime (Citrus aurantifolia (Christm.) Swingle) is an important fruit crop, which has high commercial value and is cultivated in 60 out of the 77 districts representing all geographical landscapes of Nepal. A lack of improved high-yielding varieties, infestation with various diseases, and pests, as well as poor management practices might have contributed to its extremely reduced productivity, which necessitates a reliable understanding of genetic diversity in existing cultivars. Hereby, we aim to characterize the genetic diversity of acid lime cultivars cultivated at three different agro-ecological gradients of eastern Nepal, employing PCR-based inter-simple sequence repeat (ISSR) markers. Altogether, 21 polymorphic ISSR markers were used to assess the genetic diversity in 60 acid lime cultivars sampled from different geographical locations. Analysis of binary data matrix was performed on the basis of bands obtained, and principal coordinate analysis and phenogram construction were performed using different computer algorithms. ISSR profiling yielded 234 amplicons, of which 87.18% were polymorphic. The number of amplified fragments ranged from 7-18, with amplicon size ranging from ca. 250-3200 bp. The Numerical Taxonomy and Multivariate System (NTSYS)-based cluster analysis using the unweighted pair group method of arithmetic averages (UPGMA) algorithm and Dice similarity coefficient separated 60 cultivars into two major and three minor clusters. Genetic diversity analysis using Popgene ver. 1.32 revealed the highest percentage of polymorphic bands (PPB), Nei’s genetic diversity (H), and Shannon’s information index (I) for the Terai zone (PPB = 69.66%; H = 0.215; I = 0.325), and the lowest of all three for the high hill zone (PPB = 55.13%; H = 0.173; I = 0.262). Thus, our data indicate that the ISSR marker has been successfully employed for evaluating the genetic diversity of Nepalese acid lime cultivars and has furnished valuable information on intrinsic genetic diversity and the relationship between cultivars that might be useful in acid lime breeding and conservation programs in Nepal.
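The binary band matrix, the Dice similarity coefficient, and the percentage of polymorphic bands referred to in this abstract can be sketched in plain Python as follows. The study itself used NTSYS and Popgene; the function names and the toy band profiles here are hypothetical, and the count of 204 polymorphic bands is inferred from the reported 234 amplicons and 87.18%, not stated in the abstract.

```python
def dice_similarity(profile_a, profile_b):
    """Dice coefficient 2a / (2a + b + c) for two 0/1 band profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same bands")
    shared = sum(1 for x, y in zip(profile_a, profile_b) if x == 1 and y == 1)
    total = sum(profile_a) + sum(profile_b)
    return 2 * shared / total if total else 0.0


def percent_polymorphic(band_matrix):
    """Share of bands (columns) that are not identical across all samples."""
    n_bands = len(band_matrix[0])
    polymorphic = sum(
        1 for j in range(n_bands) if len({row[j] for row in band_matrix}) > 1
    )
    return 100.0 * polymorphic / n_bands


# Toy example: three cultivars (rows) scored for four bands (columns).
bands = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]
print(dice_similarity(bands[0], bands[1]), percent_polymorphic(bands))

# Inferred check of the reported figure: 204 of 234 amplicons.
print(round(100 * 204 / 234, 2))
```

Dice ignores shared absences, which is one reason it is often preferred for dominant markers such as ISSR, where a missing band can have several causes.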
Markers and mapping revisited: finding your gene.

PubMed

Jones, Neil; Ougham, Helen; Thomas, Howard; Pasakinskiene, Izolda

2009-01-01

This paper is an update of our earlier review (Jones et al., 1997, Markers and mapping: we are all geneticists now. New Phytologist 137: 165-177), which dealt with the genetics of mapping, in terms of recombination as the basis of the procedure, and covered some of the first generation of markers, including restriction fragment length polymorphisms (RFLPs), random amplified polymorphic DNA (RAPDs), simple sequence repeats (SSRs) and quantitative trait loci (QTLs). In the intervening decade there have been numerous developments in marker science with many new systems becoming available, which are herein described: cleavage amplification polymorphism (CAP), sequence-specific amplification polymorphism (S-SAP), inter-simple sequence repeat (ISSR), sequence tagged site (STS), sequence characterized amplification region (SCAR), selective amplification of microsatellite polymorphic loci (SAMPL), single nucleotide polymorphism (SNP), expressed sequence tag (EST), sequence-related amplified polymorphism (SRAP), target region amplification polymorphism (TRAP), microarrays, diversity arrays technology (DArT), single-strand conformation polymorphism (SSCP), denaturing gradient gel electrophoresis (DGGE), temperature gradient gel electrophoresis (TGGE) and methylation-sensitive PCR. In addition there has been an explosion of knowledge and databases in the area of genomics and bioinformatics. The number of flowering plant ESTs is c. 19 million and counting, with all the opportunity that this provides for gene-hunting, while the survey of bioinformatics and computer resources points to a rapid growth point for future activities in unravelling and applying the burst of new information on plant genomes. A case study is presented on tracking down a specific gene (stay-green (SGR), a post-transcriptional senescence regulator) using the full suite of mapping tools and comparative mapping resources. 
We end with a brief speculation on how genome analysis may progress into the future of this highly dynamic arena of plant science.
Varietal Discrimination and Genetic Variability Analysis of Cymbopogon Using RAPD and ISSR Markers Analysis.

PubMed

Bishoyi, Ashok Kumar; Sharma, Anjali; Kavane, Aarti; Geetha, K A

2016-06-01

Cymbopogon is an important genus of the family Poaceae, cultivated mainly for its essential oils, which possess high medicinal and economic value. Several cultivars of Cymbopogon species are available for commercial cultivation in India, and identification of these cultivars has been carried out by means of morphological markers and essential oil constitution. Since these parameters are highly influenced by environmental factors, in most cases it is difficult to identify Cymbopogon cultivars. In the present study, random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers were employed to discriminate nine leading varieties of Cymbopogon, since prior genomic information is lacking or very limited in the genus. Ninety RAPD and 70 ISSR primers were used, which generated 63% and 69% polymorphic amplicons, respectively. Similarity in the pattern of the UPGMA-derived dendrograms of the RAPD and ISSR analyses revealed the reliability of the markers chosen for the study. Varietal/cultivar-specific markers generated from the study could be utilised for varietal/cultivar authentication, thus monitoring the quality of the essential oil production in Cymbopogon. These markers can also be utilised for the IPR protection of the cultivars. Moreover, the study provides a molecular marker tool kit in both random and simple sequence repeats for diverse molecular research in the same or related genera.
Genetic diversity and structure of Atta robusta (Hymenoptera, Formicidae, Attini), an endangered species endemic to the restinga ecoregion

PubMed Central

dos Reis, Evelyze Pinheiro; Fernandes Salomão, Tânia Maria; de Oliveira Campos, Lucio Antonio; Tavares, Mara Garcia

2014-01-01

The genetic diversity and structure of the ant Atta robusta were assessed by ISSR (inter-simple sequence repeats) in 72 colonies collected from 10 localities in the Brazilian states of Espírito Santo (48 colonies) and Rio de Janeiro (24 colonies). The ISSR pattern included 67 bands, 51 of them (76.1%) polymorphic. Analysis of molecular variance (AMOVA) revealed a high level (57.4%) of inter-population variation, which suggested a high degree of genetic structure that was confirmed by UPGMA (unweighted pair-group method using an arithmetic average) cluster analysis. The significant correlation between genetic and geographic distances (r = 0.64, p < 0.05) indicated isolation that reflected the distance between locations. Overall, the populations were found to be genetically divergent. This finding indicates the need for management plans to preserve and reduce the risk of extinction of A. robusta. PMID:25249782
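UPGMA clustering, used here and in several other records in this listing, repeatedly merges the two closest clusters and averages distances weighted by cluster size. The sketch below is a minimal pure-Python version under stated assumptions: the abstract does not say which distance measure or software was used, and the function name, the nested-tuple tree format, and the example distances are hypothetical.

```python
def upgma(labels, dist):
    """Cluster items from a symmetric distance matrix.

    Returns a nested tuple (left, right, height), where height is half the
    distance at which the two sub-clusters were joined.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest.
        merged = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            merged[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for c, value in merged.items():
            d[(c, next_id)] = value  # new ids are always the largest
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical pairwise distances among three localities.
print(upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]]))
```

For real band data, a library routine such as average-linkage hierarchical clustering would normally be used instead; this version only shows the averaging step explicitly.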
Microsatellite markers for Senna spectabilis var. excelsa (Caesalpinioideae, Fabaceae)

PubMed Central

López-Roberts, M. Cristina; Barbosa, Ariane R.; Paganucci de Queiroz, Luciano; van den Berg, Cássio

2016-01-01

Premise of the study: Senna spectabilis var. excelsa (Fabaceae) is a South and Central American tree of great ecological importance and one of the most common species in several sites of seasonally dry forests. Our goal was to develop microsatellite markers to assess the genetic diversity and structure of this species. Methods and Results: We designed and assessed 53 loci obtained from a microsatellite-enriched library and an intersimple sequence repeat library. Fourteen loci were polymorphic, and they presented a total of 39 alleles in a sample of 61 individuals from six populations. The mean values of observed and expected heterozygosities were 0.355 and 0.479, respectively. Polymorphism information content was 0.390 and the Shannon index was 0.778. Conclusions: Polymorphism information content and Shannon index indicate that at least nine of the 14 microsatellite loci developed are moderate to highly informative, and potentially useful for population genetic studies in this species. PMID:26819856
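Polymorphism information content (PIC) and the Shannon index summarise how informative a locus is, given its allele frequencies. The following sketch assumes the standard PIC formula of Botstein et al. and a natural-log Shannon index; the abstract does not state which formulas or log base were used, and the function names and example frequencies are hypothetical.

```python
import math


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


def shannon_index(freqs):
    """Shannon information index, -sum(p_i * ln p_i), skipping zero counts."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical locus with three alleles.
allele_freqs = [0.6, 0.3, 0.1]
print(round(pic(allele_freqs), 3), round(shannon_index(allele_freqs), 3))
```

Loci with PIC above 0.5 are conventionally called highly informative and those between 0.25 and 0.5 moderately informative, which is the kind of classification the abstract's conclusion refers to.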
Genetic diversity and relationship of Hedychium from Northeast India as dissected using PCA analysis and hierarchical clustering.

PubMed

Basak, Supriyo; Ramesh, Aadi Moolam; Kesari, Vigya; Parida, Ajay; Mitra, Sudip; Rangan, Latha

2014-12-01

Molecular genetic fingerprints of eleven Hedychium species from Northeast India were developed using PCR-based markers. Fifteen inter-simple sequence repeat (ISSR) and five amplified fragment length polymorphism (AFLP) primers produced 547 polymorphic fragments. Positive correlation (r = 0.46) was observed between the mean genetic similarity and genetic diversity parameters at the inter-species level. AFLP and ISSR markers were able to group the species according to their altitude and intensity of flower aroma. Cophenetic correlation coefficients between the dendrogram and the original similarity matrix were significant for ISSR (r = 0.89) compared to AFLP (r = 0.83) markers. This genetic characterization of Hedychium from Northeast India contributes to the knowledge of genetic structure of the species and can be used to define strategies for their conservation and management.
Genetic diversity and structure of Atta robusta (Hymenoptera, Formicidae, Attini), an endangered species endemic to the restinga ecoregion.

PubMed

Dos Reis, Evelyze Pinheiro; Fernandes Salomão, Tânia Maria; de Oliveira Campos, Lucio Antonio; Tavares, Mara Garcia

2014-09-01

The genetic diversity and structure of the ant Atta robusta were assessed by ISSR (inter-simple sequence repeats) in 72 colonies collected from 10 localities in the Brazilian states of Espírito Santo (48 colonies) and Rio de Janeiro (24 colonies). The ISSR pattern included 67 bands, 51 of them (76.1%) polymorphic. Analysis of molecular variance (AMOVA) revealed a high level (57.4%) of inter-population variation, which suggested a high degree of genetic structure that was confirmed by UPGMA (unweighted pair-group method using an arithmetic average) cluster analysis. The significant correlation between genetic and geographic distances (r = 0.64, p < 0.05) indicated isolation that reflected the distance between locations. Overall, the populations were found to be genetically divergent. This finding indicates the need for management plans to preserve and reduce the risk of extinction of A. robusta.
Diversity and genetic stability in banana genotypes in a breeding program using inter simple sequence repeats (ISSR) markers.

PubMed

Silva, A V C; Nascimento, A L S; Vitória, M F; Rabbani, A R C; Soares, A N R; Lédo, A S

2017-02-23

Banana (Musa spp.) is a fruit species frequently cultivated and consumed worldwide. Molecular markers are important for estimating genetic diversity in germplasm and between genotypes in breeding programs. The objective of this study was to analyze the genetic diversity of 21 banana genotypes (FHIA 23, PA42-44, Maçã, Pacovan Ken, Bucaneiro, YB42-47, Grand Naine, Tropical, FHIA 18, PA94-01, YB42-17, Enxerto, Japira, Pacovã, Prata-Anã, Maravilha, PV79-34, Caipira, Princesa, Garantida, and Thap Maeo), by using inter-simple sequence repeat (ISSR) markers. Material was generated from the banana breeding program of Embrapa Cassava & Fruits and evaluated at Embrapa Coastal Tablelands. The 12 primers used in this study generated 97.5% polymorphism. Four clusters were identified among the different genotypes studied, and the sum of the first two principal components was 48.91%. From the Unweighted Pair Group Method using Arithmetic averages (UPGMA) dendrogram, it was possible to identify two main clusters and subclusters. Two genotypes (Garantida and Thap Maeo) remained isolated from the others, both in the UPGMA clustering and in the principal coordinate analysis (PCoA). Using ISSR markers, we could analyze the genetic diversity of the studied material and state that these markers were efficient at detecting sufficient polymorphism to estimate the genetic variability in banana genotypes.
Molecular diversity and hypoglycemic polypeptide-P content of Momordica charantia in different accessions and different seasons.

PubMed

Tian, Miao; Zeng, Xiang-Qing; Song, Huan-Lei; Hu, Shan-Xin; Wang, Fu-Jun; Zhao, Jian; Hu, Zhi-Bi

2015-04-01

Momordica charantia (MC) has been used for treating diabetes mellitus from ancient times in Asia, Africa and South America. There are many MC accessions in local markets. Polypeptide-P as a main hypoglycemic component in MC was first studied in this experiment to illustrate the different contents in MC of different accessions and different harvesting times. Nineteen MC accessions collected from different regions were clustered into three groups using random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) molecular markers. Content of polypeptide-P in the tested MC accessions was detected by western blot (WB) method. The WB results revealed that polypeptide-P was detected in MC accessions harvested in June and July but not in September and October. Furthermore, Polypeptide-P content corresponded well with the MC accessions. Our results suggest that the MC accessions and the harvesting times or the weather during harvest play significant roles in high content of polypeptide-P. © 2014 Society of Chemical Industry.

Comparative analysis of genetic diversity among Indian populations of Scirpophaga incertulas by ISSR-PCR and RAPD-PCR.

PubMed

Kumar, L S; Sawant, A S; Gupta, V S; Ranjekar, P K

2001-10-01

Genetic variation between 28 Indian populations of the rice pest, Scirpophaga incertulas, was evaluated using the inter-simple sequence repeat (ISSR)-PCR assay. Nine SSR primers gave rise to 79 amplification products, of which 67 were polymorphic. A dendrogram constructed from these data indicates that there is no geographical bias to the clustering and that gene flow between populations appears to be relatively unrestricted, substantiating our earlier conclusion based on the RAPD (random amplified polymorphic DNA) data. The dendrograms obtained using each of these marker systems were poorly correlated with each other, as determined by Mantel's test for matrix correlation. Estimates of expected heterozygosity and marker index for each of these marker systems suggest that both marker systems are equally efficient in determining polymorphisms. Matrix correlation analyses suggest that reliable estimates of genetic variation among the S. incertulas pest populations can be obtained by using RAPDs alone or in combination with ISSRs, but ISSRs alone cannot be used for this purpose.
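Mantel's test, used above to compare the ISSR and RAPD results, correlates the off-diagonal entries of two distance matrices and judges significance by permuting the rows and columns of one of them. The sketch below is a simple one-tailed version in pure Python; the function names, the permutation count, the fixed random seed, and the example matrices are assumptions, and the study's own software and settings are not given in the abstract.

```python
import random
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a symmetric matrix after reordering."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (r, p): matrix correlation and one-tailed permutation p-value."""
    identity = list(range(len(m1)))
    x = _upper(m1, identity)
    r_obs = _pearson(x, _upper(m2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of m2 together
        if _pearson(x, _upper(m2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical distance matrices for four populations from two marker systems.
issr = [[0, 1, 4, 5], [1, 0, 3, 6], [4, 3, 0, 2], [5, 6, 2, 0]]
rapd = [[0, 2, 5, 6], [2, 0, 4, 7], [5, 4, 0, 3], [6, 7, 3, 0]]
print(mantel(issr, rapd))
```

A low correlation from such a test, as reported in the abstract, means the two marker systems rank population pairs differently, even if each detects a similar overall level of polymorphism.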
Molecular genetic variation and structure of Southeast Asian crocodile (Tomistoma schlegelii): Comparative potentials of SSRs versus ISSRs.

PubMed

Shafiei-Astani, Behnam; Ong, Alan Han Kiat; Valdiani, Alireza; Tan, Soon Guan; Yien, Christina Yong Seok; Ahmady, Fatemeh; Alitheen, Noorjahan Banu; Ng, Wei Lun; Kuar, Taranjeet

2015-10-15

Tomistoma schlegelii, also referred to as the "false gharial", is one of the most exclusive and least known of the world's freshwater crocodilians, limited to Southeast Asia. Indeed, the lack of economic value of its skin has led to neglect of the biodiversity of the species. The current study aimed to investigate this issue using 40 simple sequence repeat (SSR) primer pairs and 45 inter-simple sequence repeat (ISSR) primers. DNA analysis of 17 T. schlegelii samples using the SSR and ISSR markers produced a total of 49 and 108 polymorphic bands, respectively. Furthermore, the SSR- and ISSR-based cluster analyses both generated two main clusters. However, the SSR-based results were found to be more in line with the geographical distributions of the crocodile samples collected across the country than the ISSR-based results. The observed heterozygosity (HO) and expected heterozygosity (HE) of the polymorphic SSRs ranged between 0.588-1 and 0.470-0.891, respectively. The present results suggest that the Malaysian T. schlegelii populations had originated from a core population of crocodiles. In combination with the SSR markers, the ISSRs showed high potential for studying the genetic variation of T. schlegelii, and these markers are suitable to be employed in conservation genetic programs of this endangered species. Both SSR- and ISSR-based STRUCTURE analyses suggested that all the individuals of T. schlegelii are genetically similar to each other. Copyright © 2015 Elsevier B.V. All rights reserved.
Construction of the first genetic linkage map of Japanese gentian (Gentianaceae)

PubMed Central

2012-01-01

Background Japanese gentians (Gentiana triflora and Gentiana scabra) are amongst the most popular floricultural plants in Japan. However, genomic resources for Japanese gentians have not yet been developed, mainly because of the heterozygous genome structure conserved by outcrossing, the long juvenile period, and limited knowledge about the inheritance of important traits. In this study, we developed a genetic linkage map to improve breeding programs of Japanese gentians. Results Enriched simple sequence repeat (SSR) libraries from a G. triflora double haploid line yielded almost 20,000 clones using 454 pyrosequencing technology, 6.7% of which could be used to design SSR markers. To increase the number of molecular markers, we identified three putative long terminal repeat (LTR) sequences using the recently developed inter-primer binding site (iPBS) method. We also developed retrotransposon microsatellite amplified polymorphism (REMAP) markers combining retrotransposon and inter-simple sequence repeat (ISSR) markers. In addition to SSR and REMAP markers, modified amplified fragment length polymorphism (AFLP) and random amplification polymorphic DNA (RAPD) markers were developed. Using 93 BC1 progeny from G. scabra backcrossed with a G. triflora double haploid line, 19 linkage groups were constructed with a total of 263 markers (97 SSR, 97 AFLP, 39 RAPD, and 30 REMAP markers). One phenotypic trait (stem color) and 10 functional markers related to genes controlling flower color, flowering time and cold tolerance were assigned to the linkage map, confirming its utility. Conclusions This is the first reported genetic linkage map for Japanese gentians and for any species belonging to the family Gentianaceae. As demonstrated by mapping of functional markers and the stem color trait, our results will help to explain the genetic basis of agronomic important traits, and will be useful for marker-assisted selection in gentian breeding programs. 
Our map will also be an important resource for further genetic analyses such as mapping of quantitative trait loci and map-based cloning of genes in this species. PMID:23186361
RAPD and ISSR based evaluation of genetic stability of micropropagated plantlets of Morus alba L. variety S-1

PubMed Central

Saha, Soumen; Adhikari, Sinchan; Dey, Tulsi; Ghosh, Parthadeb

2015-01-01

Plant regeneration through rapid in vitro clonal propagation of nodal explants of Morus alba L. variety S-1 was established along with genetic stability analysis of regenerates. Axillary shoot bud proliferation was achieved on Murashige and Skoog (MS) medium in various culture regimes. The highest number of shoots (5.62 ± 0.01), with an average length of 4.19 ± 0.01 cm, was initially achieved with medium containing 0.5 mg/l N6-benzyladenine (BA) and 3% sucrose. With repeated subculturing of newly formed nodal parts after each harvest up to the sixth passage, the highest number of shoots per explant (about 32.27) was obtained after the fourth passage. Rooting of shoots occurred on 1/2 MS medium supplemented with 1.0 mg/l indole-3-butyric acid (IBA). About 90% (89.16) of the plantlets transferred to a mixture of sand:soil:organic manure (2:2:1) in small plastic pots acclimatized successfully. Genetic stability of the discussed protocol was confirmed by two DNA-based fingerprinting techniques, i.e. RAPD (random amplified polymorphic DNA) and ISSR (inter-simple sequence repeat). This protocol can be used for commercial propagation and for future genetic improvement studies. PMID:26693403
Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology(1.).

PubMed

Robarts, Daniel W H; Wolfe, Andrea D

2014-07-01

In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance.
Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology

PubMed Central

Robarts, Daniel W. H.; Wolfe, Andrea D.

2014-01-01

In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance. PMID:25202637
Assessment of genetic diversity of Bermudagrass (Cynodon dactylon) using ISSR markers.

PubMed

Farsani, Tayebeh Mohammadi; Etemadi, Nematollah; Sayed-Tabatabaei, Badraldin Ebrahim; Talebi, Majid

2012-01-01

Bermudagrass (Cynodon spp.) is a major turfgrass for home lawns, public parks, golf courses and sport fields and is known to have originated in the Middle East. Morphological and physiological characteristics are not sufficient to differentiate some bermudagrass genotypes because the differences between them are often subtle and subject to environmental influences. In this study, twenty-seven bermudagrass accessions and introductions, mostly from different parts of Iran, were assayed by inter-simple sequence repeat (ISSR) markers to differentiate and explore their genetic relationships. Fourteen ISSR primers amplified 389 fragments of which 313 (80.5%) were polymorphic. The average polymorphism information content (PIC) was 0.328, which shows that the majority of primers are informative. Cluster analysis using the unweighted pair group method with arithmetic average (UPGMA) and Jaccard's similarity coefficient (r = 0.828) grouped the accessions into six main clusters that corresponded, to some degree, to geographical origin, chromosome number and some morphological characteristics. It can be concluded that there exists a wide genetic base of bermudagrass in Iran and that ISSR markers are effective in determining genetic diversity and relationships among them.
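The summary statistics quoted in this abstract are typically derived from a binary band matrix (1 = band present, 0 = band absent). The sketch below is illustrative only: the small matrix is invented, the helper names are hypothetical, and the PIC formula for dominant markers, 2f(1 - f) with f the band frequency, is one commonly used definition that the abstract does not confirm was the one applied.

```python
# Illustrative sketch; the band matrix below is invented, not the study's data.

def percent_polymorphic(n_polymorphic, n_total):
    """Percentage of scored fragments that are polymorphic."""
    return 100.0 * n_polymorphic / n_total

def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles (shared absences are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0

def pic_dominant(column):
    """PIC for one dominant-marker band, ASSUMED here to be 2f(1 - f)."""
    f = sum(column) / len(column)
    return 2.0 * f * (1.0 - f)

# Rows = accessions, columns = bands (hypothetical scores).
bands = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]

# 313 polymorphic fragments out of 389 are the figures quoted in the abstract.
print(round(percent_polymorphic(313, 389), 1))
print(jaccard(bands[0], bands[1]))
print([round(pic_dominant(col), 3) for col in zip(*bands)])
```

A monomorphic band (present in every accession) gets a PIC of 0 under this definition, which is why only polymorphic bands contribute information.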
Assessment of Genetic Diversity of Bermudagrass (Cynodon dactylon) Using ISSR Markers

PubMed Central

Farsani, Tayebeh Mohammadi; Etemadi, Nematollah; Sayed-Tabatabaei, Badraldin Ebrahim; Talebi, Majid

2012-01-01

Bermudagrass (Cynodon spp.) is a major turfgrass for home lawns, public parks, golf courses and sport fields and is known to have originated in the Middle East. Morphological and physiological characteristics are not sufficient to differentiate some bermudagrass genotypes because the differences between them are often subtle and subject to environmental influences. In this study, twenty-seven bermudagrass accessions and introductions, mostly from different parts of Iran, were assayed by inter-simple sequence repeat (ISSR) markers to differentiate and explore their genetic relationships. Fourteen ISSR primers amplified 389 fragments of which 313 (80.5%) were polymorphic. The average polymorphism information content (PIC) was 0.328, which shows that the majority of primers are informative. Cluster analysis using the unweighted pair group method with arithmetic average (UPGMA) and Jaccard’s similarity coefficient (r = 0.828) grouped the accessions into six main clusters that corresponded, to some degree, to geographical origin, chromosome number and some morphological characteristics. It can be concluded that there exists a wide genetic base of bermudagrass in Iran and that ISSR markers are effective in determining genetic diversity and relationships among them. PMID:22312259
Population genetic structure of Monimopetalum chinense (Celastraceae), an endangered endemic species of eastern China.

PubMed

Xie, Guo-Wen; Wang, De-Lian; Yuan, Yong-Ming; Ge, Xue-Jun

2005-04-01

Monimopetalum chinense (Celastraceae), the sole representative of a monotypic genus, is endemic to eastern China. Its conservation status is vulnerable as most populations are small and isolated. Monimopetalum chinense is capable of reproducing both sexually and asexually. The aim of this study was to understand the genetic structure of M. chinense and to suggest conservation strategies. One hundred and ninety individuals from ten populations sampled from the entire distribution area of M. chinense were investigated by using inter-simple sequence repeats (ISSR). A total of 110 different ISSR bands were generated using ten primers. Low levels of genetic variation were revealed both at the species level (Isp=0.183) and at the population level (Ipop=0.083). High clonal diversity (D = 0.997) was found, and strong genetic differentiation among populations was detected (49.06 %). Small population size, possible inbreeding, limited gene flow due to short distances of seed dispersal, fragmentation of the once continuous range and subsequent genetic drift may have contributed to shaping the population genetic structure of the species.
Detection of Variation in Long-Term Micropropagated Mature Pistachio via DNA-Based Molecular Markers.

PubMed

Akdemir, Hülya; Suzerer, Veysel; Tilkat, Engin; Onay, Ahmet; Çiftçi, Yelda Ozden

2016-12-01

Determination of genetic stability of in vitro-grown plantlets is needed for safe and large-scale production of mature trees. In this study, genetic variation of long-term micropropagated mature pistachio developed through direct shoot bud regeneration using apical buds (protocol A) and in vitro-derived leaves (protocol B) was assessed via DNA-based molecular markers. Randomly amplified polymorphic DNA (RAPD), inter-simple sequence repeat (ISSR), and amplified fragment length polymorphism (AFLP) were employed, and the obtained PIC values from RAPD (0.226), ISSR (0.220), and AFLP (0.241) showed that micropropagation of pistachio for different periods of time resulted in "reasonable polymorphism" among the donor plant and its 18 clones. Mantel's test showed a consistent level of polymorphism between marker systems based on similarity matrices. In conclusion, this is the first study on the occurrence of genetic variability in long-term micropropagated mature pistachio plantlets. The obtained results clearly indicated that the different marker approaches used in this study are reliable for assessing tissue culture-induced variations in long-term cultured pistachio plantlets.
Analysis of the genetic diversity of Chinese native Cannabis sativa cultivars by using ISSR and chromosome markers.

PubMed

Zhang, L G; Chang, Y; Zhang, X F; Guan, F Z; Yuan, H M; Yu, Y; Zhao, L J

2014-12-12

Hemp (Cannabis sativa) is an important fiber crop, and native cultivars exist widely throughout China. In the present study, we analyzed the genetic diversity of 27 important Chinese native hemp cultivars, by using inter-simple sequence repeats (ISSR) and chromosome markers. We determined the following chromosome formulas: 2n = 20 = 14m + 6sm; 2n = 20 = 20m; 2n = 20 = 18m + 2sm; 2n = 20 = 16m + 4sm; and 2n = 20 = 12m + 8sm. The results of our ISSR analysis revealed the genetic relationships among the 27 cultivars; these relationships were analyzed by using the unweighted pair-group method based on DNA polymorphism. Our results revealed that all of the native cultivars showed considerable genetic diversity. At a genetic distance of 0.324, the 27 varieties could be classified into five categories; this grouping corresponded well with the chromosome formulas. All of the investigated hemp cultivars represent relatively primitive types; moreover, the genetic distances show a geographical distribution, with a small amount of regional hybridity.
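The unweighted pair-group method (UPGMA) mentioned in this abstract repeatedly merges the two closest clusters and averages distances to the merged pair. The following is a minimal pure-Python sketch under stated assumptions: the four labels and their distances are made up, the function names are hypothetical, and cutting the tree at a threshold such as 0.324 is shown only as one plausible reading of how the groups were delimited.

```python
# Minimal UPGMA sketch on an invented distance matrix (not the study's data).

def upgma(labels, dist):
    """Return the merge history as (cluster_a, cluster_b, merge_distance) tuples.

    dist maps (label_i, label_j) tuples to a pairwise distance.
    """
    clusters = {lab: 1 for lab in labels}          # cluster name -> size
    d = {frozenset(k): v for k, v in dist.items()}
    merges = []
    while len(clusters) > 1:
        pair = min(d, key=d.get)                   # closest pair of clusters
        a, b = sorted(pair)
        new = "(" + a + "," + b + ")"
        na, nb = clusters[a], clusters[b]
        # Average linkage: size-weighted mean of the distances to a and b.
        for c in clusters:
            if c not in (a, b):
                d[frozenset((new, c))] = (
                    na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
                ) / (na + nb)
        merges.append((a, b, d[pair]))
        # Drop every distance that still involves the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        del clusters[a], clusters[b]
        clusters[new] = na + nb
    return merges

def groups_at(merges, n_items, cutoff):
    """Number of clusters left when only merges at or below cutoff are applied."""
    return n_items - sum(1 for m in merges if m[2] <= cutoff)

example = {
    ("A", "B"): 0.2, ("A", "C"): 0.6, ("A", "D"): 0.7,
    ("B", "C"): 0.5, ("B", "D"): 0.7, ("C", "D"): 0.3,
}
history = upgma(["A", "B", "C", "D"], example)
print(history)
print(groups_at(history, 4, 0.324))  # 0.324 is the cut-off quoted in the abstract
```

Recording the merge distance, rather than a branch height, keeps the cut-off comparison direct; dendrogram software often plots half of this value as the node height.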
Genetic diversity and variation of Chinese fir from Fujian province and Taiwan, China, based on ISSR markers

PubMed Central

Chen, Yu; Peng, Zhuqing; Wu, Chao; Ma, Zhihui; Ding, Guochang; Cao, Guangqiu; Ruan, Shaoning; Lin, Sizu

2017-01-01

Genetic diversity and variation among 11 populations of Chinese fir from Fujian province and Taiwan were assessed using inter-simple sequence repeat (ISSR) markers to reveal the evolutionary relationship in their distribution range. Analysis of genetic parameters of the different populations showed that populations in Fujian province exhibited a greater level of genetic diversity than did the populations in Taiwan. Compared to Taiwan populations, significantly limited gene flow was observed among Fujian populations. A UPGMA cluster analysis showed that most individuals of the Taiwan populations formed a single cluster, whereas 6 discrete clusters were formed by the populations from Fujian. All populations were divided into 3 main groups; the 5 populations from Taiwan were gathered into a subgroup that, combined with 2 populations, Dehua and Liancheng, formed one of the 3 main groups, which indicated relatively stronger relatedness. This grouping is supported by a genetic structure analysis. Together, these results suggest different levels of genetic diversity and variation of Chinese fir between Fujian and Taiwan, and indicate different patterns of evolutionary process and local environmental adaptation. PMID:28406956
Genetic diversity and variation of Chinese fir from Fujian province and Taiwan, China, based on ISSR markers.

PubMed

Chen, Yu; Peng, Zhuqing; Wu, Chao; Ma, Zhihui; Ding, Guochang; Cao, Guangqiu; Ruan, Shaoning; Lin, Sizu

2017-01-01

Genetic diversity and variation among 11 populations of Chinese fir from Fujian province and Taiwan were assessed using inter-simple sequence repeat (ISSR) markers to reveal the evolutionary relationship in their distribution range. Analysis of genetic parameters of the different populations showed that populations in Fujian province exhibited a greater level of genetic diversity than did the populations in Taiwan. Compared to Taiwan populations, significantly limited gene flow was observed among Fujian populations. A UPGMA cluster analysis showed that most individuals of the Taiwan populations formed a single cluster, whereas 6 discrete clusters were formed by the populations from Fujian. All populations were divided into 3 main groups; the 5 populations from Taiwan were gathered into a subgroup that, combined with 2 populations, Dehua and Liancheng, formed one of the 3 main groups, which indicated relatively stronger relatedness. This grouping is supported by a genetic structure analysis. Together, these results suggest different levels of genetic diversity and variation of Chinese fir between Fujian and Taiwan, and indicate different patterns of evolutionary process and local environmental adaptation.
Genotypic analysis of Mucor from the platypus in Australia.

PubMed

Connolly, J H; Stodart, B J; Ash, G J

2010-01-01

Mucor amphibiorum is the only pathogen known to cause significant morbidity and mortality in the free-living platypus (Ornithorhynchus anatinus) in Tasmania. Infection has also been reported in free-ranging cane toads (Bufo marinus) and green tree frogs (Litoria caerulea) from mainland Australia but has not been confirmed in platypuses from the mainland. To date, there has been little genotyping specifically conducted on M. amphibiorum. A collection of 21 Mucor isolates representing isolates from the platypus, frogs and toads, and environmental samples were obtained for genotypic analysis. Internal transcribed spacer (ITS) region sequencing and GenBank comparison confirmed the identity of most of the isolates. Representative isolates from infected platypuses formed a clade containing the reference isolates of M. amphibiorum from the Centraal Bureau voor Schimmelcultures repository. The M. amphibiorum isolates showed a close sequence identity with Mucor indicus and consisted of two haplotypes, differentiated by single nucleotide polymorphisms within the ITS1 and ITS2 regions. With the exception of isolate 96-4049, all isolates from platypuses were in one haplotype. Multilocus fingerprinting via the use of intersimple sequence repeats polymerase chain reaction identified 19 genotypes. Two major clusters were evident: 1) M. amphibiorum and Mucor racemosus; and 2) Mucor circinelloides, Mucor ramosissimus, and Mucor fragilis. Seven M. amphibiorum isolates from platypuses were present in two subclusters, with isolate 96-4053 appearing genetically distinct from all other isolates. Isolates classified as M. circinelloides by sequence analysis formed a separate subcluster, distinct from other Mucor spp. The combination of sequencing and multilocus fingerprinting has the potential to provide the tools for rapid identification of M. amphibiorum. Data presented on the diversity of the pathogen and further work in linking genetic diversity to functional diversity will provide critical information for its management in Tasmanian river systems.
Molecular Linkage Mapping and Marker-Trait Associations with NlRPT, a Downy Mildew Resistance Gene in Nicotiana langsdorffii

PubMed Central

Zhang, Shouan; Gao, Muqiang; Zaitlin, David

2012-01-01

Nicotiana langsdorffii is one of two species of Nicotiana known to express an incompatible interaction with the oomycete Peronospora tabacina, the causal agent of tobacco blue mold disease. We previously showed that incompatibility is due to the hypersensitive response (HR), and plants expressing the HR are resistant to P. tabacina at all stages of growth. Resistance is due to a single dominant gene in N. langsdorffii accession S-4-4 that we have named NlRPT. In further characterizing this unique host-pathogen interaction, NlRPT has been placed on a preliminary genetic map of the N. langsdorffii genome. Allelic scores for five classes of DNA markers were determined for 90 progeny of a “modified backcross” involving two N. langsdorffii inbred lines and the related species N. forgetiana. All markers had an expected segregation ratio of 1:1, and were scored in a common format. The map was constructed with JoinMap 3.0, and loci showing excessive transmission distortion were removed. The linkage map consists of 266 molecular marker loci defined by 217 amplified fragment length polymorphisms (AFLPs), 26 simple-sequence repeats (SSRs), 10 conserved orthologous sequence markers, nine inter-simple sequence repeat markers, and four target region amplification polymorphism markers arranged in 12 linkage groups with a combined length of 1062 cM. NlRPT is located on linkage group three, flanked by four AFLP markers and one SSR. Regions of skewed segregation were detected on LGs 1, 5, and 9. Markers developed for N. langsdorffii are potentially useful genetic tools for other species in Nicotiana section Alatae, as well as in N. benthamiana. We also investigated whether AFLPs could be used to infer genetic relationships within N. langsdorffii and related species from section Alatae. A phenetic analysis of the AFLP data showed that there are two main lineages within N. langsdorffii, and that both contain populations expressing dominant resistance to P. tabacina. PMID:22936937
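The abstract states that all markers had an expected 1:1 segregation ratio and that loci with excessive transmission distortion were removed. A common way to screen for such distortion is a chi-square goodness-of-fit test with one degree of freedom; the sketch below assumes that approach (the abstract does not say which test or threshold was used), and the 55:35 split is a hypothetical example, not a reported result.

```python
import math

def chi_square_1to1(n_a, n_b):
    """Chi-square statistic (1 d.f.) for an expected 1:1 segregation ratio."""
    expected = (n_a + n_b) / 2.0
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected

def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical marker scored in 90 progeny (the population size in the abstract).
chi2 = chi_square_1to1(55, 35)
print(round(chi2, 3), round(p_value_1df(chi2), 4))

# Consistency check on the marker counts quoted in the abstract.
print(217 + 26 + 10 + 9 + 4 == 266)
```

Under these assumptions a marker would be flagged as distorted when its p-value falls below a chosen threshold; the threshold itself is a design choice that the abstract does not report.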
Diversity of black Aspergilli isolated from raisins in Argentina: Polyphasic approach to species identification and development of SCAR markers for Aspergillus ibericus.

PubMed

Giaj Merlera, G; Muñoz, S; Coelho, I; Cavaglieri, L R; Torres, A M; Reynoso, M M

2015-10-01

Aspergillus section Nigri is a heterogeneous fungal group including some ochratoxin A-producing species that usually contaminate raisins. The section contains the Series Carbonaria, which includes the toxigenic species Aspergillus carbonarius and the nontoxigenic Aspergillus ibericus, which are phenotypically indistinguishable. The aim of this study was to examine the diversity of black aspergilli isolated from raisins and to develop a specific genetic marker to distinguish A. ibericus from A. carbonarius. The species most frequently found in raisins in this study were Aspergillus tubingensis (35.4%) and A. carbonarius (32.3%), followed by Aspergillus luchuensis (10.7%), Aspergillus japonicus (7.7%), Aspergillus niger (6.2%), Aspergillus welwitschiae (4.6%) and A. ibericus (3.1%). Based on inter-simple sequence repeat (ISSR) fingerprinting profiles of major Aspergillus section Nigri members, a sequence-characterized amplified region (SCAR) marker was identified. Primers were designed based on the conserved regions of the SCAR marker and were utilized in a PCR for simultaneous identification of A. carbonarius and A. ibericus. The detection level of the SCAR-PCR was found to be 0.01 ng of purified DNA. The present SCAR-PCR is rapid and less cumbersome than conventional identification techniques and could be a supplementary strategy and a reliable tool for high-throughput sample analysis.
High degree of genetic diversity among genotypes of the forage grass Brachiaria ruziziensis (Poaceae) detected with ISSR markers.

PubMed

Azevedo, A L S; Costa, P P; Machado, M A; de Paula, C M P; Sobrinho, F S

2011-11-17

The grasses of the genus Brachiaria account for 80% of the cultivated pastures in Brazil. Despite its importance for livestock production, little information is available for breeding purposes. Embrapa has a population of B. ruziziensis from different regions of Brazil, representing most of existing variability. This population was used to initiate an improvement program based on recurrent selection. In order to assist the genetic improvement program, we estimated the molecular variability among 93 genotypes of Embrapa's collection using ISSR (inter-simple sequence repeat) markers. DNA was extracted from the leaves. Twelve ISSR primers generated 89 polymorphic bands in the 93 genotypes. The number of bands identified by each primer ranged from two to 13, with a mean of 7.41. Cluster analysis revealed a clearly distinct group, containing most of the B. ruziziensis genotypes apart from the outgroup genotypes. Genetic similarity coefficients ranged from 0.0 to 0.95, with a mean of 0.50 and analysis of molecular variance indicated higher variation within (73.43%) than among species (26.57%). We conclude that there is a high genetic diversity among these B. ruziziensis genotypes, which could be explored by breeding programs.
Flower induction, microscope-aided cross-pollination, and seed production in the duckweed Lemna gibba with discovery of a male-sterile clone.

PubMed

Fu, Lili; Huang, Meng; Han, Bingying; Sun, Xuepiao; Sree, K Sowjanya; Appenroth, Klaus-J; Zhang, Jiaming

2017-06-08

Duckweed species have a great potential to develop into fast-growing crops for water remediation and bioenergy production. Seed production and utilization of hybrid vigour are essential steps in this process. However, even in the extensively-studied duckweed species, Lemna gibba, flower primordia were often aborted prior to maturation. Salicylic acid (SA) and agar solidification of the medium promoted flower maturation and resulted in high flowering rates in L. gibba 7741 and 5504. Artificial cross-pollination between individuals of L. gibba 7741 yielded seeds at high frequencies, unlike in L. gibba 5504. In contrast to clone 7741, the anthers of 5504 did not dehisce upon maturation, and its artificially released pollen grains had a pineapple-like exine with tilted spines. These pollen grains were not stained by 2,5-diphenylmonotetrazoliumbromide (MTT) and failed to germinate. Therefore, clone 5504 is male sterile and has potential application with respect to hybrid vigour. Moreover, pollination of flowers of 5504 with 7741 pollen grains resulted in intraspecific hybrid seeds, which was confirmed by inter-simple sequence repeat (ISSR) markers. These hybrid seeds germinated at a high frequency, forming new clones.
Genetic variation of the endangered Gentiana lutea L. var. aurantiaca (Gentianaceae) in populations from the Northwest Iberian Peninsula.

PubMed

González-López, Oscar; Polanco, Carlos; György, Zsuzsanna; Pedryc, Andrzej; Casquero, Pedro A

2014-06-05

Gentiana lutea L. (G. lutea L.) is an endangered plant, patchily distributed along the mountains of Central and Southern Europe. In this study, inter-simple sequence repeat (ISSR) markers were used to investigate the genetic variation in this species within and among populations of G. lutea L. var. aurantiaca of the Cantabrian Mountains (Northwest Iberian Peninsula). Samples of G. lutea L. collected at different locations of the Pyrenees and samples of G. lutea L. subsp. vardjanii of the Dolomites Alps were also analyzed for comparison. Using nine ISSR primers, 106 bands were generated, and 89.6% of those were polymorphic. The populations from the Northwest Iberian Peninsula were clustered in three different groups, with a significant correlation between genetic and geographic distances. Gentiana lutea L. var. aurantiaca showed 19.8% private loci and demonstrated a remarkable level of genetic variation, both among populations and within populations; those populations with the highest level of isolation show the lowest genetic variation within populations. The low number of individuals, as well as the observed genetic structure of the analyzed populations makes it necessary to protect them to ensure their survival before they are too small to persist naturally.
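A correlation between genetic and geographic distance matrices, as reported in this abstract, is commonly assessed with a Mantel permutation test. The sketch below assumes that method (the abstract does not name the test used); the five sampling positions and the derived distances are invented, and the function names are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel(gen, geo, permutations=999, seed=0):
    """Return (observed r, one-sided permutation p-value) for two distance matrices."""
    ident = list(range(len(gen)))
    geo_vec = upper(geo, ident)
    r_obs = pearson(upper(gen, ident), geo_vec)
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(permutations):
        perm = ident[:]
        rng.shuffle(perm)              # relabel populations in one matrix only
        if pearson(upper(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Invented example: five populations along a transect, genetic distance
# rising with geographic distance plus a small perturbation.
pos = [0, 1, 2, 5, 9]
geo = [[abs(a - b) for b in pos] for a in pos]
gen = [[0.05 * abs(a - b) + 0.01 * ((i + j) % 2) * (i != j)
        for j, b in enumerate(pos)] for i, a in enumerate(pos)]
r, p = mantel(gen, geo)
print(round(r, 3), p)
```

Permuting whole rows and columns together, rather than shuffling individual matrix cells, preserves the dependence among distances that share a population, which is the reason a Mantel test is used instead of an ordinary correlation test.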
Genetic Variation of the Endangered Gentiana lutea L. var. aurantiaca (Gentianaceae) in Populations from the Northwest Iberian Peninsula

PubMed Central

González-López, Oscar; Polanco, Carlos; György, Zsuzsanna; Pedryc, Andrzej; Casquero, Pedro A.

2014-01-01

Gentiana lutea L. (G. lutea L.) is an endangered plant, patchily distributed along the mountains of Central and Southern Europe. In this study, inter-simple sequence repeat (ISSR) markers were used to investigate the genetic variation in this species within and among populations of G. lutea L. var. aurantiaca of the Cantabrian Mountains (Northwest Iberian Peninsula). Samples of G. lutea L. collected at different locations of the Pyrenees and samples of G. lutea L. subsp. vardjanii of the Dolomites Alps were also analyzed for comparison. Using nine ISSR primers, 106 bands were generated, and 89.6% of those were polymorphic. The populations from the Northwest Iberian Peninsula were clustered in three different groups, with a significant correlation between genetic and geographic distances. Gentiana lutea L. var. aurantiaca showed 19.8% private loci and demonstrated a remarkable level of genetic variation, both among populations and within populations; those populations with the highest level of isolation show the lowest genetic variation within populations. The low number of individuals, as well as the observed genetic structure of the analyzed populations makes it necessary to protect them to ensure their survival before they are too small to persist naturally. PMID:24905405

Genetic diversity and geographic differentiation in the threatened species Dysosma pleiantha in China as revealed by ISSR analysis.

PubMed

Zong, Min; Liu, Hai-Long; Qiu, Ying-Xiong; Yang, Shu-Zhen; Zhao, Ming-Shui; Fu, Cheng-Xin

2008-04-01

Dysosma pleiantha, an important threatened medicinal plant species, is restricted in distribution to southeastern China. The species is capable of reproducing both sexually and asexually. In this study, inter-simple sequence repeat marker data were obtained and analyzed with respect to genetic variation and genetic structure. The extent of clonality, together with the clonal and sexual reproductive strategies, varied among sites, and the populations under harsh ecological conditions tended to have large clones with relatively low clonal diversity caused by vegetative reproduction. The ramets sharing the same genotype show a clumped distribution. Across all populations surveyed, average within-population diversity was remarkably low (e.g., 0.111 for Nei's gene diversity), with populations from the nature reserves maintaining relatively high amounts of genetic diversity. Among all populations, high genetic differentiation (AMOVA: Phi(ST) = 0.500; Nei's genetic diversity: G(ST) = 0.465; Bayesian analysis: Phi(B) = 0.436) was detected, together with an isolation-by-distance pattern. Low seedling recruitment due to inbreeding, restricted gene flow, and genetic drift are proposed as determinant factors responsible for the low genetic diversity and high genetic differentiation observed.
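The G(ST) value quoted in this abstract measures the share of total gene diversity that lies among populations, G_ST = (H_T - H_S) / H_T. The sketch below shows that calculation for biallelic loci under simplifying assumptions: the allele frequencies are invented, populations are weighted equally, and the helper names are hypothetical. With dominant markers such as ISSR, allele frequencies are not observed directly and have to be estimated first (for example under Hardy-Weinberg assumptions), a step that is not shown here.

```python
# Sketch only: allele frequencies are invented; helper names are hypothetical.

def gene_diversity(p):
    """Nei's gene diversity at a biallelic locus with allele frequency p."""
    return 1.0 - p ** 2 - (1.0 - p) ** 2

def g_st(pop_freqs):
    """G_ST = (H_T - H_S) / H_T from per-population allele frequencies.

    pop_freqs is a list of populations, each a list with one frequency per locus.
    """
    n_pops = len(pop_freqs)
    n_loci = len(pop_freqs[0])
    # H_S: mean within-population diversity over populations and loci.
    h_s = sum(gene_diversity(p) for pop in pop_freqs for p in pop) / (n_pops * n_loci)
    # H_T: diversity computed from the mean allele frequency at each locus.
    mean_p = [sum(pop[k] for pop in pop_freqs) / n_pops for k in range(n_loci)]
    h_t = sum(gene_diversity(p) for p in mean_p) / n_loci
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Two hypothetical populations fixed near opposite alleles at one locus.
print(round(g_st([[0.9], [0.1]]), 2))
```

Values close to 1 indicate that most of the diversity is partitioned among populations; identical allele frequencies in every population give a value of 0.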
Genetic differentiation and karyotype variation in Hedysarum chaiyrakanicum, an endemic species of Tuva Republic, Russia.

PubMed

Zvyagina, Natalia S; Dorogina, Olga V; Krasnikov, Alexander A

2016-05-01

Overgrazing and mining affect vegetation, particularly in mountains. At times the damage is severe enough that plant species become vulnerable and slowly disappear from their habitat. Such endemic species need to be protected. One such endemic species, Hedysarum chaiyrakanicum Kurbatsky, a vulnerable steppe plant of Tuva Republic, Russia, was evaluated for its genetic diversity and taxonomic definition using a molecular technique and chromosome number adjustment. The genetic differentiation among H. chaiyrakanicum, H. setigerum Turcz. and H. gmelinii Ledeb. genotypes was determined using five inter-simple sequence repeat (ISSR) markers and then examined with Nei's genetic distance coefficient (D) and Shannon's information index (H). A total of 134 reproducible bands were detected with a polymorphism percentage of 98%. The genetic diversity of H. chaiyrakanicum was found to be 0.343 while the Shannon index H(sp) was determined as 8 06. The chromosome number 2n = 16 is newly observed within H. chaiyrakanicum. The genetic relationship based on ISSR data supported the taxonomic distinction of H. chaiyrakanicum from H. setigerum and H. gmelinii. We recommend both in situ and ex situ conservation strategies, especially germplasm sampling, to save this endemic species.
Population Genetic Structure of Monimopetalum chinense (Celastraceae), an Endangered Endemic Species of Eastern China

PubMed Central

XIE, GUO-WEN; WANG, DE-LIAN; YUAN, YONG-MING; GE, XUE-JUN

2005-01-01

• Background and Aims Monimopetalum chinense (Celastraceae), the sole representative of a monotypic genus, is endemic to eastern China. Its conservation status is vulnerable as most populations are small and isolated. Monimopetalum chinense is capable of reproducing both sexually and asexually. The aim of this study was to understand the genetic structure of M. chinense and to suggest conservation strategies. • Methods One hundred and ninety individuals from ten populations sampled from the entire distribution area of M. chinense were investigated by using inter-simple sequence repeats (ISSR). • Key Results A total of 110 different ISSR bands were generated using ten primers. Low levels of genetic variation were revealed both at the species level (Isp = 0·183) and at the population level (Ipop = 0·083). High clonal diversity (D = 0·997) was found, and strong genetic differentiation among populations was detected (49·06 %). • Conclusions Small population size, possible inbreeding, limited gene flow due to short distances of seed dispersal, fragmentation of the once continuous range and subsequent genetic drift may have contributed to shaping the population genetic structure of the species. PMID:15710646
Construction of an integrated genetic map for Capsicum baccatum L.

PubMed

Moulin, M M; Rodrigues, R; Ramos, H C C; Bento, C S; Sudré, C P; Gonçalves, L S A; Viana, A P

2015-06-18

Capsicum baccatum L. is one of the five Capsicum domesticated species and has multiple uses in the food, pharmaceutical and cosmetic industries. This species is also a valuable source of genes for chili pepper breeding, especially genes for disease resistance and fruit quality. However, knowledge of the genetic structure of C. baccatum is limited. A reference map for C. baccatum (2n = 2x = 24) based on 42 microsatellite, 85 inter-simple sequence repeat, and 56 random amplified polymorphic DNA markers was constructed using an F2 population consisting of 203 individuals. The map was generated using the JoinMap software (version 4.0) and the linkage groups were formed and ordered using a LOD score of 3.0 and maximum of 40% recombination. The genetic map consisted of 12 major and four minor linkage groups covering a total genome distance of 2547.5 cM with an average distance of 14.25 cM between markers. Of the 152 pairs of microsatellite markers available for Capsicum annuum, 62 were successfully transferred to C. baccatum, generating polymorphism. Forty-two of these markers were mapped, allowing the introduction of C. baccatum in synteny studies with other species of the genus Capsicum.
Genetic Diversity and Genetic Relationships of Purple Willow (Salix purpurea L.) from Natural Locations

PubMed Central

Prinz, Kathleen; Przyborowski, Jerzy A.

2017-01-01

In this study, the genetic diversity and structure of 13 natural locations of Salix purpurea were determined with the use of AFLP (amplified fragment length polymorphism), RAPD (randomly amplified polymorphic DNA) and ISSR (inter-simple sequence repeats). The genetic relationships between 91 examined S. purpurea genotypes were evaluated by analyses of molecular variance (AMOVA), principal coordinates analyses (PCoA) and UPGMA (unweighted pair group method with arithmetic mean) dendrograms for both single marker types and a combination of all marker systems. The locations were assigned to distinct regions and AMOVA revealed a high genetic diversity within locations. The genetic diversity between both regions and locations was relatively low, but typical for many woody plant species. The results noted for the analyzed marker types were generally comparable, with few differences in the genetic relationships among S. purpurea locations. A combination of several marker systems could thus be ideally suited to understand genetic diversity patterns of the species. This study makes the first attempt to broaden our knowledge of the genetic parameters of the purple willow (S. purpurea) from natural locations for research and several applications, inter alia breeding purposes. PMID:29301207
Genetic diversity in natural populations of mangaba in Sergipe, the largest producer State in Brazil.

PubMed

Soares, A N R; Vitória, M F; Nascimento, A L S; Ledo, A S; Rabbani, A R C; Silva, A V C

2016-08-19

Mangaba (Hancornia speciosa Gomes) is found in areas of coastal tablelands in the Brazilian Northeast and Cerrado regions. This species has been subjected to habitat fragmentation that is mainly due to human activity, and requires conservation strategies. The aim of this study was to analyze the structure and inter- and intrapopulation genetic diversity of natural populations of H. speciosa Gomes using inter-simple sequence repeat (ISSR) molecular markers. A total of 155 individuals were sampled in 10 natural populations (ITA, PAC, IND, EST, BC, PIR, JAP, BG, NEO, and SANT) in the State of Sergipe, Brazil. Fifteen primers were used to generate 162 fragments with 100% polymorphism. Genetic analysis showed that the variability between populations (77%) was higher than within populations (23%). It was possible to identify five different groups by the unweighted pair group method with arithmetic mean and principal coordinate analysis, and only one individual (E10) remained isolated. Using ISSR markers it was possible to obtain a molecular profile of the populations evaluated, showing that these markers were effective and exhibited sufficient polymorphism to estimate the genetic variability of natural populations of H. speciosa Gomes.
Genetic fidelity of long-term micropropagated shoot cultures of vanilla (Vanilla planifolia Andrews) as assessed by molecular markers.

PubMed

Sreedhar, Reddampalli V; Venkatachalam, Lakshmanan; Bhagyalakshmi, Neelwarne

2007-08-01

Occurrence of genetic variants during micropropagation is occasionally encountered when the cultures are maintained in vitro for long periods. Therefore, the micropropagated multiple shoots of Vanilla planifolia Andrews developed from axillary bud explants established 10 years ago were used to determine somaclonal variation using random amplified polymorphic DNA (RAPD) and intersimple sequence repeat (ISSR) markers. One thousand micro-plants were established in soil, of which 95 plantlets (consisting of four phenotypes) along with the mother plant were subjected to genetic analyses using RAPD and ISSR markers. Out of the 45 RAPD and 20 ISSR primers screened, 30 RAPD and 7 ISSR primers showed 317 clear, distinct and reproducible band classes resulting in a total of 30,115 bands. However, no difference was observed in banding patterns of any of the samples for a particular primer, indicating the absence of variation among the micropropagated plants. Our results allow us to conclude that the micropropagation protocol that we have used for in vitro proliferation of vanilla plantlets for the last 10 years might be applicable for the production of clonal plants over a considerable period of time.
Effect of Gamma Rays on Sophora davidii and Detection of DNA Polymorphism through ISSR Marker

PubMed Central

Wang, Puchang; Mo, Bentian; Luo, Tianqiong

2017-01-01

Sophora davidii (Franch.) Kom. ex Pavol is an important medicinal plant and a feeding scrub with ecological value. The effects of different gamma irradiation doses (20–140 Kr) on seed germination and seedling morphology were investigated in S. davidii, and intersimple sequence repeat (ISSR) markers were used to identify the DNA polymorphism among mutants. Significant variations were observed for seed germination, stem diameter, and number of branches per plant. The improved agronomic traits, such as stem diameter and number of branches per plant, were recorded at 80 Kr dose and 20 Kr dose for seed germination. ISSR analysis generated in total 183 scorable fragments, of which 94 (51.37%) were polymorphic. The percentage of polymorphism ranged from 14.29 to 93.33 with an average of 45.69%. Jaccard's coefficients of dissimilarity varied from 0.6885 to 1.000, indicative of the level of genetic variation among the mutants. The constructed dendrogram grouped the entities into five clusters. Consequently, it was concluded that gamma rays irradiation of seeds generates a sufficient number of induced mutations and that ISSR analysis offered a useful molecular marker for the identification of mutants. PMID:28612030
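The band statistics quoted in this abstract are typically derived from a binary presence/absence matrix of scored fragments. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names and the toy matrix are hypothetical, and only the 94-of-183 figure comes from the abstract.

```python
from typing import Sequence


def percent_polymorphic(matrix: Sequence[Sequence[int]]) -> float:
    """Percentage of band positions (columns) that differ among samples (rows)."""
    n_bands = len(matrix[0])
    polymorphic = sum(
        1 for j in range(n_bands) if len({row[j] for row in matrix}) > 1
    )
    return 100.0 * polymorphic / n_bands


def jaccard_dissimilarity(a: Sequence[int], b: Sequence[int]) -> float:
    """1 - (bands shared by both profiles / bands present in at least one)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - shared / either


# Hypothetical toy data: 3 samples scored for 5 band positions (1 = present).
profiles = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
]

toy_ppb = percent_polymorphic(profiles)
toy_dissimilarity = jaccard_dissimilarity(profiles[0], profiles[1])

# Figures from the abstract: 94 polymorphic fragments out of 183 scorable ones.
reported_ppb = round(100.0 * 94 / 183, 2)
```

With the abstract's counts, `reported_ppb` reproduces the stated 51.37%.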
Genetic variation within and among populations of Rhodiola alsia (Crassulaceae) native to the Tibetan Plateau as detected by ISSR markers.

PubMed

Xia, Tao; Chen, Shilong; Chen, Shengyun; Ge, Xuejun

2005-04-01

Genetic variation of 10 Rhodiola alsia (Crassulaceae) populations from the Qinghai-Tibet Plateau of China was investigated using intersimple sequence repeat (ISSR) markers. R. alsia is an endemic species of the Qinghai-Tibet Plateau. Of the 100 primers screened, 13 were highly polymorphic. Using these primers, 140 discernible DNA fragments were generated with 112 (80%) being polymorphic, indicating pronounced genetic variation at the species level. Also there were high levels of polymorphism at the population level with the percentage of polymorphic bands (PPB) ranging from 63.4 to 88.6%. Analysis of molecular variance (AMOVA) showed that the genetic variation was mainly found among populations (70.3%) and variance within populations was 29.7%. The main factors responsible for the high level of differentiation among populations are probably the isolation from other populations and clonal propagation of this species. Occasional sexual reproduction might occur in order to maintain high levels of variation within populations. Environmental conditions could also influence population genetic structure as they occur in severe habitats. The strong genetic differentiation among populations in our study indicates that the conservation of genetic variability in R. alsia requires maintenance of as many populations as possible.
Assessment of genetic diversity in Vigna unguiculata L. (Walp) accessions using inter-simple sequence repeat (ISSR) and start codon targeted (SCoT) polymorphic markers.

PubMed

Igwe, David Okeh; Afiukwa, Celestine Azubike; Ubi, Benjamin Ewa; Ogbu, Kenneth Idika; Ojuederie, Omena Bernard; Ude, George Nkem

2017-11-17

Assessment of genetic diversity of Vigna unguiculata (L.) Walp (cowpea) accessions using informative molecular markers is imperative for their genetic improvement and conservation. Use of efficacious molecular markers to obtain the required knowledge of the genetic diversity within the local and regional germplasm collections can enhance the overall effectiveness of cowpea improvement programs, hence the comparative assessment of inter-simple sequence repeat (ISSR) and start codon targeted (SCoT) markers in genetic diversity of V. unguiculata accessions from different regions in Nigeria. The genetic diversity of eighteen accessions from different locations in Nigeria was compared using ISSR and SCoT markers. DNA extraction was done using Zymogen Kit according to its manufacturer's instructions followed by amplifications with ISSR and SCoT and agarose gel electrophoresis. The reproducible bands were scored for analyses of dendrograms, principal component analysis, genetic diversity, allele frequency, polymorphic information content, and population structure. Both ISSR and SCoT markers resolved the accessions into five major clusters based on dendrogram and principal component analyses. A total of 32 and 52 alleles were obtained with ISSR and SCoT, respectively. Numbers of alleles, gene diversity and polymorphic information content detected with ISSR were 9.4000, 0.7358 and 0.7192, while SCoT yielded 11.1667, 0.8158 and 0.8009, respectively. Polymorphic loci were 70 and 80 in ISSR and SCoT, respectively. Both markers produced high polymorphism (94.44-100%). The effective number of alleles (Ne) ranged from 1.2887 ± 0.1797 to 1.7831 ± 0.2944 in ISSR and from 1.7416 ± 0.0776 to 1.9181 ± 0.2426 in SCoT. Nei's genetic diversity (H) ranged from 0.2112 ± 0.0600 to 0.4335 ± 0.1371 in ISSR and from 0.4111 ± 0.0226 to 0.4778 ± 0.1168 in SCoT. Shannon's information index (I) ranged from 0.3583 ± 0.0639 to 0.6237 ± 0.1759 in ISSR and from 0.5911 ± 0.0233 to 0.6706 ± 0.1604 in SCoT. Total gene diversity (Ht), gene diversity within population (Hs), coefficient of gene differentiation (Gst) and level of gene flow (Nm) revealed by ISSR were 0.4498, 0.3203, 0.2878 and 1.2371, respectively, while SCoT had 0.4808, 0.4522, 0.0594 and 7.9245. Both markers showed highest genetic diversity in accessions from Ebonyi. Our study demonstrated that SCoT markers were more efficient than ISSR for genetic diversity studies in V. unguiculata and can be integrated in the exploration of their genetic diversity for improvement and germplasm utilization.
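The differentiation and gene flow figures in this abstract are linked by simple formulas. The sketch below assumes Nei's definition Gst = (Ht - Hs) / Ht and the widely used estimate Nm = 0.5 (1 - Gst) / Gst; the abstract does not state which formulas the authors applied, so both are assumptions, and the function names are hypothetical.

```python
def coefficient_of_differentiation(ht: float, hs: float) -> float:
    """Gst: share of total gene diversity (Ht) found among populations."""
    return (ht - hs) / ht


def gene_flow(gst: float) -> float:
    """Nm estimated from Gst (assumed formula: 0.5 * (1 - Gst) / Gst)."""
    return 0.5 * (1.0 - gst) / gst


# Ht and Hs values as reported in the abstract for each marker system.
reported = {
    "ISSR": {"ht": 0.4498, "hs": 0.3203},
    "SCoT": {"ht": 0.4808, "hs": 0.4522},
}

results = {}
for marker, values in reported.items():
    gst = coefficient_of_differentiation(values["ht"], values["hs"])
    results[marker] = {"gst": gst, "nm": gene_flow(gst)}
    print(f"{marker}: Gst = {gst:.4f}, Nm = {results[marker]['nm']:.4f}")
```

Under these assumptions the computed values land close to the reported Gst and Nm for both marker systems; small differences are expected because the published Ht and Hs are rounded.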
[Isolation and identification of specific sequences correlated to cytoplasmic male sterility and fertile maintenance in cauliflower (Brassica oleracea var. botrytis)].

PubMed

Wang, Chun Guo; Chen, Xiao Qiang; Li, Hui; Zhao, Qian Cheng; Sun, De Ling; Song, Wen Qin

2008-02-01

ISSR (Inter-Simple Sequence Repeat) and DDRT-PCR (Differential Display Reverse Transcriptase Polymerase Chain Reaction) analyses were performed on the cytoplasmic male sterile cauliflower line ogura-A and its corresponding maintainer line ogura-B. In total, 306 detectable bands were obtained by ISSR using thirty oligonucleotide primers. Commonly, six to twelve bands were produced per primer. Among all these primers, only primer ISSR3 gave a polymorphic amplification: a specific 1100 bp band, named ISSR3(1100), was detected only in the maintainer line. Analysis of this sequence indicated that ISSR3(1100) was highly homologous with the corresponding sequences of the mitochondrial genome in Brassica napus and Arabidopsis thaliana, which suggested that ISSR3(1100) may derive from the mitochondrial genome in cauliflower. To carry out DDRT-PCR analysis, three anchor primers and fifteen random primers were combined. In total, 1122 bands from 1000 bp to 50 bp were detected. However, only four bands, named ogura-A205, ogura-A383, ogura-B307 and ogura-B352, were confirmed to be differentially displayed between the two lines. This result was further confirmed by reverse Northern dot blotting analysis. Among these four bands, ogura-A205 and ogura-A383 were expressed only in the cytoplasmic male sterile line, while ogura-B307 and ogura-B352 were detected only in the maintainer line. Analysis of these sequences indicated that this was the first time that these four sequences were reported in cauliflower. Interestingly, ogura-A205 and ogura-B307 did not exhibit any similarities to reported sequences in other species; more investigations are required to obtain further information. ogura-A383 and ogura-B352 were also new sequences; they showed high similarities to corresponding chloroplast sequences of Arabidopsis thaliana and Brassica rapa subsp. pekinensis, so we speculated that these two sequences may derive from the chloroplast genome. All these results offer new and significant information for investigating the molecular mechanism of cytoplasmic male sterility and fertility maintenance in cauliflower.
Molecular evidence of hybridization in sympatric populations of the Enantia jethys complex (Lepidoptera: Pieridae).

PubMed

Jasso-Martínez, Jovana M; Machkour-M'Rabet, Salima; Vila, Roger; Rodríguez-Arnaiz, Rosario; Castañeda-Sortibrán, América Nitxin

2018-01-01

Hybridization events are frequently demonstrated in natural butterfly populations. One interesting butterfly species complex is the Enantia jethys complex, which has been studied for over a century; many debates exist regarding the species composition of this complex. Currently, three species that live sympatrically in the Gulf slope of Mexico (Enantia jethys, E. mazai, and E. albania) are recognized in this complex (based on morphological and molecular studies). Where these species live in sympatry, some cases of interspecific mating have been observed, suggesting hybridization events. Considering this, we employed a multilocus approach (analyses of mitochondrial and nuclear sequences: COI, RpS5, and Wg; and nuclear dominant markers: inter-simple sequence repeats (ISSRs)) to study hybridization in sympatric populations from Veracruz, Mexico. Genetic diversity parameters were determined for all molecular markers, and species identification was assessed by different methods such as analyses of molecular variance (AMOVA), clustering, principal coordinate analysis (PCoA), gene flow, and PhiPT parameters. ISSR molecular markers were used for a more profound study of the hybridization process. Although species of the Enantia jethys complex have a low dispersal capacity, we observed high genetic diversity, probably reflecting a high density of individuals locally. ISSR markers provided evidence of a contemporary hybridization process, detecting a high number of hybrids (from 17% to 53%) with significant differences in genetic diversity. Furthermore, a directional pattern of hybridization was observed from E. albania to other species. Phylogenetic study through DNA sequencing confirmed the existence of three clades corresponding to the three species previously recognized by morphological and molecular studies. This study underlines the importance of assessing hybridization in evolutionary studies, by tracing the lineage separation process that leads to the origin of new species. Our research demonstrates that hybridization processes have a high occurrence in natural populations.
Micropropagation and assessment of genetic fidelity of Henckelia incana: an endemic and medicinal Gesneriad of South India.

PubMed

Prameela, J; Ramakrishnaiah, H; Krishna, V; Deepalakshmi, A P; Naveen Kumar, N; Radhika, R N

2015-07-01

Henckelia incana is an endemic medicinal plant used for the treatment of fever and skin allergy. In the present study shoot regeneration was evaluated on Murashige and Skoog's (MS) medium supplemented with auxins, Indole-3-acetic acid (IAA), Indole-3-butyric acid (IBA), 1-Naphthaleneacetic acid (NAA), 2,4-Dichlorophenoxyacetic acid (2,4-D) and cytokinins, 6-Benzylaminopurine (BAP) and Kinetin (Kn) at concentrations of 0.5, 1.0, 2.0, 3.0, 4.0 and 5.0 mgl(-1). MS medium with IBA (18.08), NAA (17.83) and IAA (17.58) at 0.5 mgl(-1) concentrations showed efficient regeneration. Regenerated shoots were rooted on half-strength MS medium with and without 0.5 mgl(-1) IBA or NAA. The plantlets were successfully hardened in rooting trays (peat, vermiculite and sand) and transferred to field milieu. The genetic fidelity of in vitro raised plants was assessed by using three different single primer amplification reaction (SPAR) markers namely random amplified polymorphic DNA (RAPD), inter-simple sequence repeat (ISSR) and direct amplification of mini-satellite DNA region (DAMD). The results consistently demonstrated true-to-true type propagation. This is the first report of in vitro propagation and establishment of true-to-true type genetic fidelity in H. incana.
Chloroplast and nuclear DNA studies in a few members of the Brassica oleracea L. group using PCR-RFLP and ISSR-PCR markers: a population genetic analysis.

PubMed

Panda, S; Martín, J P; Aguinagalde, I

2003-04-01

A population genetic analysis of chloroplast and nuclear DNA was performed covering nine wild populations of Brassica oleracea. Three members of the n = 9 group, all close to B. oleracea, Brassica alboglabra Bailey, Brassica bourgeaui (Webb) O. Kuntze and Brassica montana Pourret, were also studied to better understand their relationship with B. oleracea. Chloroplast DNA was analysed using the PCR-RFLP (polymerase chain reaction - restriction fragment length polymorphism) method. The ISSR-PCR (inter-simple sequence repeat - polymerase chain reaction) technique was adopted to study nuclear DNA. Twelve primer pairs of chloroplast DNA showed very good amplification. The amplified product of each primer pair, digested by three restriction enzymes, revealed no variation of cpDNA among the taxa studied. This indicates they may have the same chloroplast genotype. Seven selected ISSR primers have detected genetic variation, both within and among the populations/taxa surveyed. The information obtained on the intra- and inter-populational genetic diversity of wild populations of B. oleracea neatly defined the individual plants. It could provide important guidelines for backing management and conservation strategies in this species. The study confirms a close relationship between B. alboglabra, B. bourgeaui and B. montana, which is parallel to their morphological similitude.
Diversity and structure of landraces of Agave grown for spirits under traditional agriculture: A comparison with wild populations of A. angustifolia (Agavaceae) and commercial plantations of A. tequilana.

PubMed

Vargas-Ponce, Ofelia; Zizumbo-Villarreal, Daniel; Martínez-Castillo, Jaime; Coello-Coello, Julián; Colunga-Garcíamarín, Patricia

2009-02-01

Traditional farming communities frequently maintain high levels of agrobiodiversity, so understanding their agricultural practices is a priority for biodiversity conservation. The cultural origin of agave spirits (mezcals) from west-central Mexico is in the southern part of the state of Jalisco where traditional farmers cultivate more than 20 landraces of Agave angustifolia Haw. in agroecosystems that include in situ management of wild populations. These systems, rooted in a 9000-year-old tradition of using agaves as food in Mesoamerica, are endangered by the expansion of commercial monoculture plantations of the blue agave variety (A. tequilana Weber var. Azul), the only agave certified for sale as tequila, the best-known mezcal. Using intersimple sequence repeats and Bayesian estimators of diversity and structure, we found that A. angustifolia traditional landraces had a genetic diversity (H(BT) = 0.442) similar to its wild populations (H(BT) = 0.428) and a higher genetic structure (θ(B) = 0.405; θ(B) = 0.212). In contrast, the genetic diversity in the blue agave commercial system (H(B) = 0.118) was 73% lower. Changes to agave spirits certification laws to allow the conservation of current genetic, ecological and cultural diversity can play a key role in the preservation of the traditional agroecosystems.
Genetic diversity analysis of Varronia curassavica Jacq. accessions using ISSR markers.

PubMed

Brito, F A; Nizio, D A C; Silva, A V C; Diniz, L E C; Rabbani, A R C; Arrigoni-Blank, M F; Alvares-Carvalho, S V; Figueira, G M; Montanari Júnior, I; Blank, A F

2016-09-02

Varronia curassavica Jacq. is a medicinal and aromatic plant from Brazil with significant economic importance. Studies on genetic diversity in active germplasm banks (AGB) are essential for conservation and breeding programs. The aim of this study was to analyze the genetic diversity of V. curassavica accessions of the AGB of Medicinal and Aromatic Plants of the Federal University of Sergipe (UFS), using inter-simple sequence repeat molecular markers. Twenty-four primers were tested, and 14 were polymorphic and informative, resulting in 149 bands with 97.98% polymorphism. The UPGMA dendrogram divided the accessions into Clusters I and II. Jaccard similarity coefficients for pair-wise comparisons of accessions ranged between 0.24 and 0.78. The pairs of accessions VCUR-001/VCUR-503, VCUR-001/VCUR-504, and VCUR-104/VCUR-501 showed relatively low similarity (0.24), and the pair of accessions VCUR-402/VCUR-403 showed medium similarity (0.78). Twenty-eight accessions were divided into three distinct clusters, according to the STRUCTURE analysis. The genetic diversity of V. curassavica in the AGB of UFS is low to medium, and it requires expansion. Accession VCUR-802 is the most suitable for selection in breeding program of this species, since it clearly represents all of the diversity present in the AGB.
Application of ISSR markers for verification of F₁ hybrids in mungbean (Vigna radiata).

PubMed

Khajudparn, P; Prajongjai, T; Poolsawat, O; Tantasawat, P A

2012-09-17

Mungbean improvement via hybridization requires the identification of true F(1) hybrids from controlled crosses before further generations of selfing/crossing and selection. We utilized inter-simple sequence repeat (ISSR) markers for identifying putative F(1) hybrids from six cross combinations whose morphological characteristics were very similar to those of their respective female parents and could not be visually discriminated from the self-pollinated progeny. Based on 10 ISSR primers, polymorphisms were found between female and male parents of all six cross combinations. The highest value of genetic differentiation (21.4%) was found between male and female parents of the SUT3 x M5-1 cross. These 10 ISSR primers gave 2.8-25.0% polymorphism between male and female parents, with a mean of 12.1%, and 0-13.0% polymorphism between F(1) hybrids and female parents, with a mean of 4.8%. F(1) hybrids of all six cross combinations could be differentiated from the self-pollinated progeny of their female parents by using either the ISSR 841 or the ISSR 857 primer, together with the ISSR 835 primer. We conclude that ISSR markers are useful and efficient for identifying mungbean F(1) hybrids in controlled crosses from different genetic backgrounds.
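Because ISSR markers are dominant, a common way to separate true hybrids from selfed progeny is to look for bands inherited from the male parent that the female parent lacks. The sketch below illustrates that logic only; the abstract does not describe the authors' scoring rule, and the band sizes and function names are hypothetical.

```python
from typing import Set


def male_specific_bands(female: Set[int], male: Set[int]) -> Set[int]:
    """Bands (fragment sizes in bp) present in the male parent only."""
    return male - female


def is_putative_hybrid(progeny: Set[int], female: Set[int], male: Set[int]) -> bool:
    """A progeny plant carrying any male-specific band cannot be a selfed plant."""
    return bool(progeny & male_specific_bands(female, male))


# Hypothetical band profiles for one primer (fragment sizes in bp).
female_parent = {300, 450, 700, 900}
male_parent = {300, 520, 700, 1100}
candidate_hybrid = {300, 450, 520, 700, 900}
selfed_plant = {300, 450, 900}

diagnostic = male_specific_bands(female_parent, male_parent)
hybrid_call = is_putative_hybrid(candidate_hybrid, female_parent, male_parent)
selfed_call = is_putative_hybrid(selfed_plant, female_parent, male_parent)
```

A primer is informative for a given cross only when `diagnostic` is non-empty, which is consistent with the abstract's need to combine primers to cover all six crosses.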
Distribution of mating-type alleles and M13 PCR markers in the black leaf spot fungus Mycosphaerella fijiensis of bananas in Brazil.

PubMed

Queiroz, C B; Miranda, E C; Hanada, R E; Sousa, N R; Gasparotto, L; Soares, M A; Silva, G F

2013-02-08

The fungus Mycosphaerella fijiensis is the causative agent of black sigatoka, which is one of the most destructive diseases of banana plants. Infection with this pathogen results in underdeveloped fruit, with no commercial value. We analyzed the distribution of the M. fijiensis mating-type system and its genetic variability using M13 phage DNA markers. We found a 1:1 distribution of mating-type alleles, indicating MAT1-1 and MAT1-2 idiomorphs. A polymorphism analysis using three different primers for M13 markers showed that only the M13 minisatellite primers generated polymorphic products. We then utilized this polymorphism to characterize 40 isolates from various Brazilian states. The largest genetic distances were found between isolates from the same location and between isolates from different parts of the country. Therefore, there was no correlation between the genetic similarity and the geographic origin of the isolates. The M13 marker was used to generate genetic fingerprints for five isolates; these fingerprints were compared with the band profiles obtained from inter-simple sequence repeat (UBC861) and inter-retrotransposon amplified polymorphism analyses. We found that the M13 marker was more effective than the other two markers for differentiating these isolates.
Ex situ conservation of Phyllanthus fraternus Webster and evaluation of genetic fidelity in regenerates using DNA-based molecular marker.

PubMed

Upadhyay, Richa; Kashyap, Sarvesh Pratap; Singh, Chandra Shekhar; Tiwari, Kavindra Nath; Singh, Karuna; Singh, Major

2014-11-01

Germplasm storage of Phyllanthus fraternus by using synseed technology has been optimized. Synseeds were prepared from nodal segments taken from in vitro-grown plantlets. An encapsulation matrix of 3 % sodium alginate and 100 mM calcium chloride with polymerization duration up to 15 min was found most suitable for synseed formation. Maximum plantlet conversion (92.5 ± 2.5 %) was obtained on a growth regulator-free ½-strength solid Murashige and Skoog (MS) medium. Multiple shoot proliferation was optimum on a ½ MS medium containing 0.5 mg/l 6-benzylaminopurine (BAP). Shoots were subjected to rooting on MS media containing 1 mg/l α-naphthaleneacetic acid (NAA) and acclimatized successfully. Encapsulated nodal segments can be stored for up to 90 days with a survival frequency of 47.33 %. The clonal fidelity of synseed-derived plantlets was also assessed and compared with that of the mother plant using random amplified polymorphic DNA and inter-simple sequence repeat analysis. No changes in molecular profiles were observed among the synseed-derived plantlets and mother plant, which confirms the genetic stability of regenerates. This synseed production protocol could be useful for in vitro multiplication, short-term storage, and exchange of germplasm of this important antiviral and hepatoprotective plant.
Population structure and genotypic variation of Crataegus pontica inferred by molecular markers.

PubMed

Rahmani, Mohammad-Shafie; Shabanian, Naghi; Khadivi-Khub, Abdollah; Woeste, Keith E; Badakhshan, Hedieh; Alikhani, Leila

2015-11-01

Information about the natural patterns of genetic variability and their evolutionary bases is of fundamental practical importance for sustainable forest management and conservation. In the present study, the genetic diversity of 164 individuals from fourteen natural populations of Crataegus pontica K.Koch was assessed for the first time using three genome-based molecular techniques: inter-retrotransposon amplified polymorphism (IRAP), inter-simple sequence repeats (ISSR) and start codon targeted (SCoT) polymorphism. IRAP, ISSR and SCoT analyses yielded 126, 254 and 199 scorable amplified bands, respectively, of which 90.48, 93.37 and 83.78% were polymorphic. ISSR was more efficient than IRAP and SCoT owing to its high effective multiplex ratio, marker index and resolving power. The dendrograms based on the markers used and combined data divided individuals into three major clusters. The correlation between the coefficient matrices for the IRAP, ISSR and SCoT data was significant. A higher level of genetic variation was observed within populations than among populations based on the markers used. The lower divergence levels depicted among the studied populations could be seen as evidence of gene flow. The promotion of gene exchange will be very beneficial to conserve and utilize the enormous genetic variability. Copyright © 2015 Elsevier B.V. All rights reserved.

Relatedness of Indian flax genotypes (Linum usitatissimum L.): an inter-simple sequence repeat (ISSR) primer assay.

PubMed

Rajwade, Ashwini V; Arora, Ritu S; Kadoo, Narendra Y; Harsulkar, Abhay M; Ghorpade, Prakash B; Gupta, Vidya S

2010-06-01

The objective of this study was to analyze the genetic relationships, using PCR-based ISSR markers, among 70 Indian flax (Linum usitatissimum L.) genotypes actively utilized in flax breeding programs. Twelve ISSR primers were used for the analysis yielding 136 loci, of which 87 were polymorphic. The average number of amplified loci and the average number of polymorphic loci per primer were 11.3 and 7.25, respectively, while the percent loci polymorphism ranged from 11.1 to 81.8 with an average of 63.9 across all the genotypes. The range of polymorphism information content scores was 0.03-0.49, with an average of 0.18. A dendrogram was generated based on the similarity matrix by the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), wherein the flax genotypes were grouped in five clusters. The Jaccard's similarity coefficient among the genotypes ranged from 0.60 to 0.97. When the omega-3 alpha linolenic acid (ALA) contents of the individual genotypes were correlated with the clusters in the dendrogram, the high ALA containing genotypes were grouped in two clusters. This study identified SLS 50, Ayogi, and Sheetal to be the most diverse genotypes and suggested their use in breeding programs and for developing mapping populations.
Analysis of genetic diversity of Brassica rapa var. chinensis using ISSR markers and development of SCAR marker specific for Fragrant Bok Choy, a product of geographic indication.

PubMed

Shen, X L; Zhang, Y M; Xue, J Y; Li, M M; Lin, Y B; Sun, X Q; Hang, Y Y

2016-04-25

Non-heading Chinese cabbage [Brassica rapa var. chinensis (Linnaeus) Kitamura] is a popular vegetable and is also used as a medicinal plant in traditional Chinese medicine. Fragrant Bok Choy is a unique accession of non-heading Chinese cabbage and a product of geographic indication certified by the Ministry of Agriculture of China, which is noted for its rich aromatic flavor. However, transitional and overlapping morphological traits can make it difficult to distinguish this accession from other non-heading Chinese cabbages. This study aimed to develop a molecular method for efficient identification of Fragrant Bok Choy. Genetic diversity analysis, based on inter-simple sequence repeat molecular markers, was conducted for 11 non-heading Chinese cabbage accessions grown in the Yangtze River Delta region. Genetic similarity coefficients between the 11 accessions ranged from 0.5455 to 0.8961, and the genetic distance ranged from 0.0755 to 0.4475. Cluster analysis divided the 11 accessions into two major groups. The primer ISSR-840 amplified a fragment specific for Fragrant Bok Choy. A pair of specific sequence-characterized amplified region (SCAR) primers based on this fragment amplified a target band in Fragrant Bok Choy individuals, but no band was detected in individuals of other accessions. In conclusion, this study has developed an efficient strategy for authentication of Fragrant Bok Choy. The SCAR marker described here will facilitate the conservation and utilization of this unique non-heading Chinese cabbage germplasm resource.
Genetic diversity and structure of the threatened species Sinopodophyllum hexandrum (Royle) Ying.

PubMed

Liu, W; Wang, J; Yin, D X; Yang, M; Wang, P; Han, Q S; Ma, Q Q; Liu, J J; Wang, J X

2016-06-10

Sinopodophyllum hexandrum is an important medicinal plant that has been listed as an endangered species, making the conservation of its genetic diversity a priority. Therefore, the genetic diversity and population structure of S. hexandrum was investigated through inter-simple sequence repeat analysis of eight natural populations. Eleven selected primers generated 141 discernible fragments. The percentage of polymorphic bands was 37.59% at the species level, and 7.66-24.32% at the population level. Genetic diversity of S. hexandrum was low within populations (average HE = 0.0366), but higher at the species level (HE = 0.0963). Clear structure and high genetic differentiation were detected between populations using the unweighted pair group method with arithmetic mean and principal coordinate analysis. Clustering approaches clustered the eight sampled populations into three major groups, and AMOVA confirmed there to be significant variation between populations (63.27%). Genetic differentiation may have arisen through limited gene flow (Nm = 0.3317) in this species. Isolation by distance among populations was tested by comparing genetic distance versus geographical distance using the Mantel test. The results revealed no correlation between spatial pattern and geographic location. Given the low within-population genetic diversity, high differentiation among populations, and the increasing anthropogenic pressure on this species, in situ conservation measures, in addition to sampling and ex situ preservation, are recommended to preserve S. hexandrum populations and to retain their genetic diversity.
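The Mantel test mentioned in this abstract compares two distance matrices by permutation. The sketch below is a generic illustration under stated assumptions (Pearson correlation, one-sided test, 999 permutations, fixed seed); the abstract gives none of these settings, and the four-population toy matrices are hypothetical rather than data from the study.

```python
import random
from math import sqrt
from typing import List, Tuple

Matrix = List[List[float]]


def upper_triangle(m: Matrix) -> List[float]:
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mantel(genetic: Matrix, geographic: Matrix,
           permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (observed r, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(genetic)
    geo_vec = upper_triangle(geographic)
    observed = pearson(upper_triangle(genetic), geo_vec)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns of the genetic matrix together.
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), geo_vec) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical example: four populations on a transect, with genetic
# distance exactly proportional to geographic distance.
positions = [0.0, 10.0, 30.0, 70.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]

r_value, p_value = mantel(gen, geo)
```

In this contrived case the correlation is perfect by construction; a result like the one in the abstract (no correlation) would instead give an r near zero and a large p-value.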
Increased relatedness among the neighboring plants from seedling to adult stages in carnaúba wax palm.

PubMed

Vieira, F A; Sousa, R F; Fajardo, C G; Brandão, M M

2016-12-19

The objective of this study was to assess the spatial genetic structure (SGS) at different life stages (cohorts) in a remnant population (N = 101) of Copernicia prunifera in the semiarid region of northeastern Brazil. Using seven inter-simple sequence repeat molecular markers, we were able to analyze 93 loci with 100% polymorphism. Seedlings had the highest level of genetic diversity (HE = 0.411, HO = 0.599), followed by juveniles (HE = 0.394, HO = 0.579) and adults (HE = 0.267, HO = 0.427). Based on analysis of molecular variance, the majority of genetic variations were observed to occur within the life stages (93.42%) rather than between the life stages (6.58%). We found a recent reduction in the population size (bottleneck) based on the number of loci with heterozygosity excess for the two models used (infinite allele = 92 and stepwise = 91). All the life stages showed significant SGS, with positive and significant kinship values. Sp values were 0.040 for seedlings, 0.093 for juveniles, 0.156 for adults, and 0.053 for the total population. We found an increase in SGS from the seedling to adult stages, indicating that the plants were from related adult progenitors. Data from this study can be used in designing effective management and conservation strategies for the species.
Genetic variability and resistance of cultivars of cowpea [Vigna unguiculata (L.) Walp] to cowpea weevil (Callosobruchus maculatus Fabr.).

PubMed

Vila Nova, M X; Leite, N G A; Houllou, L M; Medeiros, L V; Lira Neto, A C; Hsie, B S; Borges-Paluch, L R; Santos, B S; Araujo, C S F; Rocha, A A; Costa, A F

2014-03-31

The cowpea weevil (Callosobruchus maculatus Fabr.) is the most destructive pest of the cowpea bean; it reduces seed quality. To control this pest, resistance testing combined with genetic analysis using molecular markers has been widely applied in research. Among the markers that show reliable results, the inter-simple sequence repeats (ISSRs) (microsatellites) are noteworthy. This study was performed to evaluate the resistance of 27 cultivars of cowpea bean to cowpea weevil. We tested the resistance related to the genetic variability of these cultivars using ISSR markers. To analyze the resistance of cultivars to weevil, a completely randomized test design with 4 replicates and 27 treatments was adopted. Five pairs of the insect were placed in 30 grains per replicate. Analysis of variance showed that the number of eggs and emerged insects were significantly different in the treatments, and the means were compared by statistical tests. The analysis of the large genetic variability in all cultivars resulted in the formation of different groups. The test of resistance showed that the cultivar Inhuma was the most sensitive to both number of eggs and number of emerged adults, while the TE96-290-12-G and MNC99-537-F4 (BRS Tumucumaque) cultivars were the least sensitive to the number of eggs and the number of emerged insects, respectively.
Molecular characterization and population structure study of cambuci: strategy for conservation and genetic improvement.

PubMed

Santos, D N; Nunes, C F; Setotaw, T A; Pio, R; Pasqual, M; Cançado, G M A

2016-12-19

Cambuci (Campomanesia phaea) belongs to the Myrtaceae family and is native to the Atlantic Forest of Brazil. It has ecological and social appeal but is exposed to problems associated with environmental degradation and expansion of agricultural activities in the region. Comprehensive studies on this species are rare, making its conservation and genetic improvement difficult. Thus, it is important to develop research activities to understand the current situation of the species as well as to make recommendations for its conservation and use. This study was performed to characterize the cambuci accessions found in the germplasm bank of Coordenadoria de Assistência Técnica Integral using inter-simple sequence repeat markers, with the goal of understanding the plant's population structure. The results showed the existence of some level of genetic diversity among the cambuci accessions that could be exploited for the genetic improvement of the species. Principal coordinate analysis and discriminant analysis clustered the 80 accessions into three groups, whereas Bayesian model-based clustering analysis clustered them into two groups. The formation of two cluster groups and the high membership coefficients within the groups pointed out the importance of further collection to cover more areas and more genetic variability within the species. The study also showed the lack of conservation activities; therefore, more attention from the appropriate organizations is needed to plan and implement natural and ex situ conservation activities.
Species clarification of Isaria isolates used as biocontrol agents against Diaphorina citri (Hemiptera: Liviidae) in Mexico.

PubMed

Gallou, Adrien; Serna-Domínguez, María G; Berlanga-Padilla, Angélica M; Ayala-Zermeño, Miguel A; Mellín-Rosas, Marco A; Montesinos-Matías, Roberto; Arredondo-Bernal, Hugo C

2016-03-01

Entomopathogenic fungi belonging to the genus Isaria (Hypocreales: Cordycipitaceae) are promising candidates for microbial control of insect pests. Currently, the Mexican government is developing a biological control program based on extensive application of Isaria isolates against Diaphorina citri (Hemiptera: Liviidae), a vector of citrus huanglongbing disease. Previous research identified three promising Isaria isolates (CHE-CNRCB 303, 305, and 307; tentatively identified as Isaria fumosorosea) from Mexico. The goal of this work was to obtain a complete morphological and molecular characterization of these isolates. Comparative analysis of morphology established that the isolates showed similar characteristics to Isaria javanica. Multi-gene analysis confirmed the morphological identification by including the three isolates within the I. javanica clade. Additionally, this work demonstrated the misidentification of three other Isaria isolates (CHE-CNRCB 310 and 324: I. javanica, formerly I. fumosorosea; CHE-CNRCB 393: I. fumosorosea, formerly Isaria farinosa), underlining the need for a full and correct characterization of an isolate before developing a biological control program. Finally, the inter-simple sequence repeat (ISSR) genotyping method revealed that the CHE-CNRCB 303, 305, and 307 isolates belong to three different genotypes. This result indicates that ISSR markers could be used as a tool to monitor their presence in field conditions. Copyright © 2015 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Characterization and identification of ISSR markers associated with oil content in sea buckthorn berries.

PubMed

Ding, J; Ruan, C J; Guan, Y; Shan, J Y; Li, H; Bao, Y H

2016-08-19

Bioactive oils extracted from sea buckthorn (Hippophae rhamnoides) berries contain highly nutritional and medicinal compounds; however, the oil contents of sea buckthorn berries are very low. Thirteen inter-simple sequence repeat (ISSR) primers were used to identify markers associated with oil content of dry pulp in 51 cultivars and lines, which clustered into three major groups based on 137 polymorphic markers. Dry pulp oil contents in 45 cultivars and lines in Group I ranged from 6.6 to 33.1%; these accessions belonged to H. rhamnoides ssp mongolica and its hybrids with H. rhamnoides ssp sinensis. Three lines (H. rhamnoides ssp mongolica) in Group II had high dry pulp oil contents (33.7 to 37.5%), whereas three lines of hybrids in Group III had low dry pulp oil contents (10.9 to 17.5%). The dry pulp oil content of H. rhamnoides ssp mongolica (27.2 ± 0.9%) was higher than that of hybrids (12.0 ± 1.2%) (P < 0.01). Four ISSR markers (881_340, 825_1000, 817_380, and 807_1100) had positive association with high dry pulp oil content (P < 0.01) using stepwise multiple regression analysis. The use of these ISSR markers is a potential strategy to select genotypes with high dry pulp oil content and suitable parental combinations for improvement of sea buckthorn berries.
Conservation Genetics of an Endangered Lady’s Slipper Orchid: Cypripedium japonicum in China

PubMed Central

Qian, Xin; Li, Quan-Jian; Liu, Fen; Gong, Mao-Jiang; Wang, Cai-Xia; Tian, Min

2014-01-01

Knowledge about the population genetic variation of the endangered orchid, Cypripedium japonicum, is conducive to the development of conservation strategies. Here, we examined the levels and partitioning of inter-simple sequence repeat (ISSR) diversity (109 loci) in five populations of this orchid to gain insight into its genetic variation and population structure in Eastern and Central China. It harbored considerably lower levels of genetic diversity both at the population (percentage of polymorphic loci (PPL) = 11.19%, Nei’s gene diversity (H) = 0.0416 and Shannon’s information index (I) = 0.0613) and species level (PPL = 38.53%, H = 0.1273 and I = 0.1928) and a significantly higher degree of differentiation among populations (the proportion of the total variance among populations (Φpt) = 0.698) than those typical of ISSR-based studies in other orchid species. Furthermore, the Nei’s genetic distances between populations were independent of the corresponding geographical distances. Two main clusters are shown in an unweighted pair group method with arithmetic average (UPGMA) dendrogram, which is in agreement with the results of principal coordinate analysis (PCoA) and the STRUCTURE program. In addition, individuals within a population were more similar to each other than to those in other populations. Based on the genetic data and our field survey, the development of conservation management for this threatened orchid should include habitat protection, artificial gene flow and ex situ measures. PMID:24983476
Genetic diversity of Cosmos species revealed by RAPD and ISSR markers.

PubMed

Rodríguez-Bernal, A; Piña-Escutia, J L; Vázquez-García, L M; Arzate-Fernández, A M

2013-12-04

The genus Cosmos is native to America and comprises 34 species, 28 of which are endemic to Mexico. Cosmos species are used as nematicidal, antimalarial, and antioxidative agents. The aim of this study was to estimate the genetic diversity among 7 Cosmos species based on random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers. With RAPD markers, the obtained polymorphism was 91.7% and the genetic diversity was 0.33, whereas these values were 65.6% and 0.22 with ISSR markers, respectively, indicating the presence of high genetic diversity among the Cosmos species that were analyzed. The unweighted pair group method with arithmetic mean dendrograms that were obtained with both markers were notably similar, revealing 2 clusters and indicating a clear genetic differentiation among the Cosmos species that were assessed. The first cluster comprised the species Cosmos sulphureus, Cosmos pacificus, and Cosmos diversifolius, while the second cluster included the species Cosmos purpureus, Cosmos crithmifolius, Cosmos bipinnatus, and Cosmos parviflorus. In addition, the Cosmos species were clustered according to their collection sites. The Mantel test corroborated the correlation between the genetic distance and the geographic altitude of each Cosmos species. The results suggest that it is necessary to preserve the Cosmos species in their natural habitat in addition to the germplasm collection for ex situ conservation.
Polyploidy creates higher diversity among Cynodon accessions as assessed by molecular markers.

PubMed

Gulsen, Osman; Sever-Mutlu, Songul; Mutlu, Nedim; Tuna, Metin; Karaguzel, Osman; Shearman, Robert C; Riordan, Terrance P; Heng-Moss, Tiffany M

2009-05-01

Developing a better understanding of associations among ploidy level, geographic distribution, and genetic diversity of Cynodon accessions could be beneficial to bermudagrass breeding programs, and would enhance our understanding of the evolutionary biology of this warm season grass species. This study was initiated to: (1) determine the ploidy levels of Cynodon accessions collected from Turkey, (2) investigate associations between ploidy level and diversity, (3) determine whether geographic and ploidy distribution are related to nuclear genome variation, and (4) assess correlations among four nuclear molecular marker systems for genetic analyses of Cynodon accessions. One hundred and eighty-two Cynodon accessions collected in Turkey from an area south of the Taurus Mountains along the Mediterranean coast and ten known genotypes were genotyped using sequence related amplified polymorphism (SRAP), peroxidase gene polymorphism (POGP), inter-simple sequence repeat (ISSR), and random amplified polymorphic DNA (RAPD). The diploids, triploids, tetraploids, pentaploids, and hexaploids revealed by flow cytometry had a linear present band frequency of 0.36, 0.47, 0.49, 0.52, and 0.54, respectively. Regression analysis showed that a quadratic relationship between ploidy level and band frequency was the most explanatory (r = 0.62, P < 0.001). The AMOVA results indicated that 91 and 94% of the total variation resided within ploidy level and provinces, respectively. The UPGMA analysis suggested that commercial bermudagrass cultivars represent only one-third of the available genetic variation. SRAP, POGP, ISSR, and RAPD markers differed in detecting relationships among the bermudagrass genotypes and rare alleles, suggesting that combined analysis of molecular marker systems is more efficient. Elucidating the genetic structure of Cynodon accessions can help enhance breeding programs and broaden the genetic base of commercial cultivars.
Selection and characterization of Argentine isolates of Trichoderma harzianum for effective biocontrol of Septoria leaf blotch of wheat.

PubMed

Stocco, Marina C; Mónaco, Cecilia I; Abramoff, Cecilia; Lampugnani, Gladys; Salerno, Graciela; Kripelz, Natalia; Cordo, Cristina A; Consolo, Verónica F

2016-03-01

Species of the genus Trichoderma are economically important as biocontrol agents, serving as a potential alternative to chemical control. The applicability of Trichoderma isolates to different ecozones will depend on the behavior of the strains selected from each zone. The present study was undertaken to isolate biocontrol populations of Trichoderma spp. from the Argentine wheat regions and to select and characterize the best strains of Trichoderma harzianum by means of molecular techniques. A total of 84 out of the 240 strains of Trichoderma were able to reduce the disease severity of the leaf blotch of wheat. Thirty-seven strains were selected for the reduction equal to or greater than 50% of the severity, compared with the control. The percentage values of reduction of the pycnidial coverage ranged between 45 and 80%. The same last strains were confirmed as T. harzianum by polymerase chain reaction amplification of internal transcribed spacers, followed by sequencing. Inter-simple sequence repeat was used to examine the genetic variability among isolates. This resulted in a total of 132 bands. Further numerical analysis revealed 19 haplotypes, grouped in three clusters (I, II, III). Shared strains, with different geographical origins and isolated in different years, were observed within each cluster. The origin of the isolates and the genetic group were partially related. All isolates from Paraná were in cluster I, all isolates from Lobería were in cluster II, and all isolates from Pergamino and Santa Fe were in cluster III. Our results suggest that the 37 native strains of T. harzianum are important in biocontrol programs and could be advantageous for the preparation of biopesticides adapted to the agroecological conditions of wheat culture.
Genetic diversity and geographic differentiation analysis of duckweed using inter-simple sequence repeat markers.

PubMed

Xue, Huiling; Xiao, Yao; Jin, Yanling; Li, Xinbo; Fang, Yang; Zhao, Hai; Zhao, Yun; Guan, Jiafa

2012-01-01

Duckweed, with its rapid growth rate and high starch content, is a new alternative feedstock for bioethanol production. The genetic diversity among 27 duckweed populations of seven species in the genera Lemna and Spirodela from China and Vietnam was analyzed by ISSR-PCR. Eight ISSR primers generating a reproducible amplification banding pattern were selected. 89 polymorphic bands were scored out of the 92 banding patterns of 16 Lemna populations, accounting for 96.74% of the polymorphism. 98 polymorphic bands of 11 Spirodela populations were scored out of 99 banding patterns, and the polymorphism was 98.43%. The genetic distance of Lemna varied from 0.127 to 0.784, and from 0.138 to 0.902 for Spirodela, which indicated a high level of genetic variation among the populations studied. The unweighted pair group method with arithmetic average (UPGMA) cluster analysis corresponded well with the genetic distance. Populations from Sichuan, China grouped together and so did the populations from Vietnam, showing that populations collected from the same region clustered into one group. Notably, the single population from Tibet was placed alone in subgroup A2. Clustering analysis indicated that the geographic differentiation of collection sites correlated closely with the genetic differentiation of duckweeds. The results suggested that geographic differentiation had great influence on genetic diversity of duckweed in China and Vietnam at the regional scale. This study provided primary guidelines for the collection, conservation, and characterization of duckweed resources for bioethanol production.
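The percentage of polymorphic bands quoted in abstracts like the one above is a simple ratio computed from a presence/absence (1/0) band matrix. The following is a minimal sketch, not the authors' procedure: the function name and the toy matrix are hypothetical, and a band is assumed to count as polymorphic when it is present in some samples and absent in others.

```python
def percent_polymorphic(band_matrix):
    """Return (polymorphic, total, percent) for a 0/1 band matrix.

    band_matrix: list of rows, one row per sample, one column per band
    position. A band is treated as polymorphic when it is present in
    some samples and absent in others (an assumption of this sketch).
    """
    if not band_matrix:
        raise ValueError("band_matrix must contain at least one sample")
    n_bands = len(band_matrix[0])
    polymorphic = 0
    for j in range(n_bands):
        states = {row[j] for row in band_matrix}
        if len(states) > 1:
            polymorphic += 1
    return polymorphic, n_bands, 100.0 * polymorphic / n_bands


# Hypothetical toy data: 4 samples x 5 band positions (not from the study).
toy = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 1],
]
print(percent_polymorphic(toy))

# The Lemna figure quoted above: 89 polymorphic bands out of 92 scored.
print(round(100.0 * 89 / 92, 2))
```

The last line reproduces the 96.74% reported for the Lemna populations from the band counts given in the abstract.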
Prediction of industrial tomato hybrids from agronomic traits and ISSR molecular markers.

PubMed

Figueiredo, A S T; Resende, J T V; Faria, M V; Da-Silva, P R; Fagundes, B S; Morales, R G F

2016-05-13

Heterosis is a highly relevant phenomenon in plant breeding. This condition is usually established in hybrids derived from crosses of highly divergent parents. The success of a breeder in obtaining heterosis is directly related to the correct identification of genetically contrasting parents. Currently, the diallel cross is the most commonly used methodology to detect contrasting parents; however, it is a time- and cost-consuming procedure. Therefore, new tools capable of performing this task quickly and accurately are required. Thus, the purpose of this study was to estimate the genetic divergence in industrial tomato lines, based on agronomic traits, and to compare with estimates obtained using inter-simple sequence repeat (ISSR) molecular markers. The genetic divergence among 10 industrial tomato lines, based on nine morphological characters and 12 ISSR primers was analyzed. For data analysis, Pearson and Spearman correlation coefficients were calculated between the genetic dissimilarity measures estimated by Mahalanobis distance and Jaccard's coefficient of genetic dissimilarity from the heterosis estimates, combining ability, and means of important traits of industrial tomato. The ISSR markers efficiently detected contrasting parents for hybrid production in tomato. Parent RVTD-08 was indicated as the most divergent, both by molecular and morphological markers, that positively contributed to increased heterosis and by the specific combining ability in the crosses in which it participated. The genetic dissimilarity estimated by ISSR molecular markers aided the identification of the best hybrids of the experiment in terms of total fruit yield, pulp yield, and soluble solids content.
Genetic diversity and population structure of an important wild berry crop

PubMed Central

Zoratti, Laura; Palmieri, Luisa; Jaakola, Laura; Häggman, Hely

2015-01-01

The success of plant breeding in the coming years will be associated with access to new sources of variation, which will include landraces and wild relatives of crop species. In order to access the reservoir of favourable alleles within wild germplasm, knowledge about the genetic diversity and the population structure of wild species is needed. Bilberry (Vaccinium myrtillus) is one of the most important wild crops growing in the forests of Northern European countries, noted for its nutritional properties and its beneficial effects on human health. Assessment of the genetic diversity of wild bilberry germplasm is needed for efforts such as in situ conservation, on-farm management and development of plant breeding programmes. However, to date, only a few local (small-scale) genetic studies of this species have been performed. We therefore conducted a study of genetic variability within 32 individual samples collected from different locations in Iceland, Norway, Sweden, Finland and Germany, and analysed genetic diversity among geographic groups. Four selected inter-simple sequence repeat primers allowed the amplification of 127 polymorphic loci which, based on analysis of variance, made it possible to identify 85 % of the genetic diversity within studied bilberry populations, being in agreement with the mixed-mating system of bilberry. Significant correlations were obtained between geographic and genetic distances for the entire set of samples. The analyses also highlighted the presence of a north–south genetic gradient, which is in accordance with recent findings on phenotypic traits of bilberry. PMID:26483325
High genetic diversity of Jatropha curcas assessed by ISSR.

PubMed

Díaz, B G; Argollo, D M; Franco, M C; Nucci, S M; Siqueira, W J; de Laat, D M; Colombo, C A

2017-05-31

Jatropha curcas L. is a highly promising oilseed for sustainable production of biofuels and bio-kerosene due to its high oil content and excellent quality. However, it is a perennial and incipiently domesticated species with no stable cultivar developed to date despite breeding programs in progress in several countries. Knowledge of the genetic structure and diversity of the species is a necessary step for breeding programs. Molecular markers can be used as a tool to speed up the process. This study was carried out to assess the genetic diversity of a germplasm bank represented by J. curcas accessions from different provenances, as well as interspecific hybrids and backcrosses generated by IAC breeding programs, using inter-simple sequence repeat markers. The molecular study revealed 271 bands of which 98.9% were polymorphic with an average of 22.7 polymorphic bands per primer. Genetic diversity of the germplasm evaluated was slightly higher than that of other germplasm around the world and ranged from 0.55 to 0.86 with an average of 0.59 (Jaccard index). Cluster analysis (UPGMA) revealed no clear grouping as to the geographical origin of accessions, consistent with genetic structure analysis using the Structure software. For diversity analysis between groups, accessions were divided into eight groups by origin. Nei's genetic distance between groups was 0.14. The results showed the importance of Mexican accessions, congeneric wild species, and interspecific hybrids for conservation and development of new genotypes in breeding programs.
Morphoagronomic and molecular profiling of Capsicum spp from southwest Mato Grosso, Brazil.

PubMed

Campos, A L; Marostega, T N; Cabral, N S S; Araújo, K L; Serafim, M E; Seabra-Júnior, S; Sudré, C P; Rodrigues, R; Neves, L G

2016-07-15

The genus Capsicum ranks as the second most exported vegetable in Brazil, which is also considered to be a center of diversity for this genus. The aim of this study was to rescue genetic variability in the genus Capsicum in the southwest region of Mato Grosso, and to characterize and estimate the genetic diversity of accessions based on morphoagronomic descriptors and inter-simple sequence repeat molecular markers. Data were obtained following the criteria of the International Plant Genetic Resources Institute, renamed Bioversity International for Capsicum. Data were analyzed using different multivariate statistical techniques. An array of binary data was used to analyze molecular data, and the arithmetic complement of the Jaccard index was used to estimate the genetic dissimilarity among accessions. Six well-defined groups were formed based on the morphological characterization. The most divergent accessions were 142 and 126, with 125 and 126 being the most similar. The groups formed following agronomic characterization differed from those formed by morphological characterization, and there was a need to subdivide the groups for better distinction of accessions. Based on molecular analysis, accessions were divided into two groups, and there was also a need to subdivide the groups. Based on joint analysis (morphological + agronomic + molecular), six groups were formed with no duplicates. For all groups, the cophenetic correlation coefficient was higher than 0.8. These results provide useful information for the better management of the work collection. All correlations between the combined distance matrix were significant by the Mantel test.
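The "arithmetic complement of the Jaccard index" mentioned above is 1 minus the Jaccard similarity of two binary band profiles. A minimal sketch follows; the accession labels and band profiles are hypothetical illustrations, not data from the study, and treating two all-absent profiles as identical is a convention assumed here.

```python
def jaccard_dissimilarity(a, b):
    """1 - Jaccard similarity for two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of loci")
    # Bands present in both profiles.
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    # Bands present in at least one profile (shared absences are ignored).
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # assumed convention: two empty profiles are identical
    return 1.0 - shared / either


def dissimilarity_matrix(profiles):
    """Pairwise dissimilarities as a dict keyed by (name, name)."""
    names = list(profiles)
    return {
        (p, q): jaccard_dissimilarity(profiles[p], profiles[q])
        for p in names
        for q in names
    }


# Hypothetical accessions scored at six ISSR band positions.
profiles = {
    "acc_A": [1, 1, 0, 1, 0, 1],
    "acc_B": [1, 0, 0, 1, 1, 1],
    "acc_C": [0, 1, 1, 0, 0, 1],
}
matrix = dissimilarity_matrix(profiles)
print(round(matrix[("acc_A", "acc_B")], 3))
```

Shared absences do not contribute to the Jaccard index, which is one reason it is commonly chosen for dominant markers where a missing band is less informative than a present one.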
Genetic diversity in the candidate trees of Madhuca indica J. F. Gmel. (Mahua) revealed by inter-simple sequence repeats (ISSRs).

PubMed

Nimbalkar, S D; Jade, S S; Kauthale, V K; Agale, S; Bahulikar, R A

2018-03-01

Madhuca indica provides livelihood to several tribal people in India, where the flowers are used for extraction of sweet juices having multiple applications. Certain trees have more value as judged by the tribal people mainly based on yield and quality performance of the trees, and these trees were selected for the genetic diversity analyses. Genetic diversity of 48 candidate Mahua trees from Etapalli, Dhadgaon, and Jawhar, Maharashtra, India, was assessed using ISSR markers. Fourteen ISSR primers revealed a total of 132 polymorphic bands giving overall 92% polymorphism. Genetic diversity, in terms of expected number of alleles (Ne), the observed number of alleles (Na), Nei's genetic diversity (H), and Shannon's information index (I) was 1.921, 1.333, 0.211, and 0.337, respectively, and suggested lower genetic diversity. Region-wise analysis revealed the highest genetic diversity at Etapalli (H = 0.206) and the lowest at Dhadgaon (H = 0.140). The Etapalli area possesses higher forest cover than Dhadgaon and Jawhar. Additionally, in Dhadgaon and Jawhar, M. indica trees are restricted to field bunds; both reasons might contribute to lower genetic diversity in these regions. The dendrogram and the principal coordinate analyses showed no region-specific clustering. The clustering patterns were supported by AMOVA, where higher genetic variance was observed within trees and lower variance among regions. Long-distance dispersal and/or higher human interference might be responsible for low diversity and higher genetic variance within the candidate trees.
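The indices listed above (Na, Ne, H, and I) can be illustrated for dominant markers with a short sketch. This is one common way of computing them, not necessarily what the authors or their software did: it assumes two alleles per locus and Hardy-Weinberg equilibrium, so that the null-allele frequency is the square root of the band-absence frequency. The function names and input frequencies are hypothetical.

```python
import math


def locus_stats(band_freq):
    """Return (Na, Ne, H, I) for one dominant locus.

    band_freq: fraction of individuals showing the band (0 to 1).
    Assumes two alleles and Hardy-Weinberg equilibrium, so the null
    allele frequency q is sqrt(1 - band_freq) and p = 1 - q.
    """
    if not 0.0 <= band_freq <= 1.0:
        raise ValueError("band_freq must lie between 0 and 1")
    q = math.sqrt(1.0 - band_freq)
    p = 1.0 - q
    homozygosity = p * p + q * q
    na = sum(1 for f in (p, q) if f > 0)   # observed number of alleles
    ne = 1.0 / homozygosity                # effective number of alleles
    h = 1.0 - homozygosity                 # Nei's gene diversity
    i = -sum(f * math.log(f) for f in (p, q) if f > 0)  # Shannon's index
    return na, ne, h, i


def mean_stats(band_freqs):
    """Average each index over loci."""
    per_locus = [locus_stats(f) for f in band_freqs]
    n = len(per_locus)
    return tuple(sum(values) / n for values in zip(*per_locus))


# Hypothetical band frequencies at four loci (not from the study).
print(mean_stats([0.75, 1.0, 0.36, 0.0]))
```

Monomorphic loci (band always present or always absent) contribute Na = 1, Ne = 1, H = 0, and I = 0, which is why samples with few polymorphic bands show low mean values.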
Assessment of genetic characteristics of Aconitum germplasms in Xinjiang Province (China) by RAPD and ISSR markers

PubMed Central

Zhao, Feicui; Nie, Jihong; Chen, Muzhi; Wu, Guirong

2015-01-01

Aconitum is a medicinal treasure trove that grows extensively on fertile pastures in Xinjiang Province (China); however, its molecular genetic characteristics are still poorly studied. We studied Aconitum kusnezoffii Reichb., Aconitum soongaricum Stapf., Aconitum carmichaelii Debx. and Aconitum leucostomum Worosch, using random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) techniques, to evaluate their genetic relationship and potential medicinal value. Our results showed that A. kusnezoffii Reichb. and A. soongaricum Stapf. have close genetic relationship and cluster together. Polymorphism rates of 97.25% and 98.92% were achieved by using 15 RAPD and 15 ISSR primers, respectively. Based on Nei's gene diversity (H) and Shannon's index (I), the inter-population diversity (Hs) was higher when compared with the intra-population diversity (Hp). Among the three Aconitum populations, the coefficient of gene differentiation (Gst) was 0.4358 when evaluated by RAPD and 0.5005 by ISSR. The genetic differentiation among the three Aconitum populations was highly significant, suggesting low gene flow (Nm). This was confirmed by the estimates of gene flow (Nm = 0.6473 and Nm = 0.4991, based on ISSR and RAPD data, respectively). Comparing the RAPD and ISSR results, the two DNA markers proved similarly effective in the assessment of the genetic characteristics of the studied Aconitum populations and could be used for reliable fingerprinting and mapping in studies on Aconitum diversity in view of Aconitum suitability for development and protection. PMID:26019645
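The gene flow estimates above can be related to the differentiation coefficients through the widely used approximation Nm = 0.5(1 − Gst)/Gst. That the authors used exactly this formula is an assumption of this sketch; the function name is hypothetical.

```python
def gene_flow_from_gst(gst):
    """Estimate Nm from Gst using Nm = 0.5 * (1 - Gst) / Gst.

    This is a standard approximation; whether the study used exactly
    this form is an assumption of this sketch.
    """
    if not 0.0 < gst < 1.0:
        raise ValueError("gst must lie strictly between 0 and 1")
    return 0.5 * (1.0 - gst) / gst


# Gst values quoted in the abstract above.
for gst in (0.4358, 0.5005):
    print(gst, round(gene_flow_from_gst(gst), 4))
```

Under this formula, Gst = 0.4358 corresponds to Nm of about 0.647 and Gst = 0.5005 to Nm of about 0.499, both below 1, which is the usual threshold for describing gene flow as low.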
Comparison of RAPD and ISSR markers for assessment of genetic diversity among endangered rare Dalbergia oliveri (Fabaceae) genotypes in Vietnam.

PubMed

Phong, D T; Hien, V T T; Thanh, T T V; Tang, D V

2011-10-06

Dalbergia oliveri is a leguminous tree of the Fabaceae family. This species is popular and valuable in Vietnam and is currently listed on the Vietnam Red List and on the IUCN Red List as endangered. Two PCR techniques using RAPD and inter-simple sequence repeat (ISSR) markers were used to make a comparative analysis of genetic diversity in this species. Fifty-six polymorphic primers (29 RAPD and 27 ISSR) were used. The RAPD primers produced 63 bands across 35 genotypes, of which 24 were polymorphic. The number of amplified bands varied from one to four, with a size range from 250 to 1400 bp. The percentage polymorphism ranged from 0 to 75. Amplification of genomic DNA of the 35 genotypes, using ISSR analysis, yielded 104 fragments, of which 63 were polymorphic. The number of amplified fragments using ISSR primers ranged from one to nine and varied in size from 250 to 1500 bp. The percentage polymorphism ranged from 0 to 100. ISSR markers were relatively more efficient than RAPDs. The Mantel test between two Jaccard's similarity matrices gave r ≥ 0.802, showing good fit correlation between ISSRs and RAPDs. Clustering of isolates remained more or less the same for RAPDs compared to combined RAPD and ISSR data. The similarity coefficient ranged from 0.785 to 1.000, 0.698 to 0.956 and 0.752 to 0.964 with RAPD, ISSR, and the combined RAPD-ISSR dendrogram, respectively.
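A Mantel test of the kind mentioned above correlates the off-diagonal entries of two matrices and judges significance by permuting the sample order of one of them. The following is a minimal standard-library sketch with a one-sided permutation p-value; the function names, the two similarity matrices, and the permutation count are hypothetical, and the study's own software and settings are not stated in the abstract.

```python
import math
import random


def _upper(matrix, order):
    """Upper-triangle entries after reordering samples (symmetric input assumed)."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (r, p) where p is a one-sided permutation p-value."""
    identity = list(range(len(m1)))
    x = _upper(m1, identity)
    r_obs = _pearson(x, _upper(m2, identity))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of m2 together
        if _pearson(x, _upper(m2, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical similarity matrices for five genotypes (not from the study).
rapd = [
    [1.00, 0.90, 0.80, 0.85, 0.79],
    [0.90, 1.00, 0.82, 0.88, 0.81],
    [0.80, 0.82, 1.00, 0.95, 0.92],
    [0.85, 0.88, 0.95, 1.00, 0.90],
    [0.79, 0.81, 0.92, 0.90, 1.00],
]
issr = [
    [1.00, 0.88, 0.72, 0.80, 0.70],
    [0.88, 1.00, 0.75, 0.83, 0.74],
    [0.72, 0.75, 1.00, 0.93, 0.89],
    [0.80, 0.83, 0.93, 1.00, 0.86],
    [0.70, 0.74, 0.89, 0.86, 1.00],
]
r, p = mantel(rapd, issr)
print(round(r, 3), p)
```

Rows and columns are permuted together because the entries of a similarity matrix are not independent observations, so an ordinary correlation test on the flattened matrices would overstate significance.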

Emergence of a New Population of Rathayibacter toxicus: An Ecologically Complex, Geographically Isolated Bacterium

PubMed Central

Arif, Mohammad; Busot, Grethel Y.; Mann, Rachel; Rodoni, Brendan; Liu, Sanzhen; Stack, James P.

2016-01-01

Rathayibacter toxicus is a gram-positive bacterium that infects the floral parts of several Poaceae species in Australia. Bacterial ooze is often produced on the surface of infected plants and bacterial galls are produced in place of seed. R. toxicus is a regulated plant pathogen in the U.S. yet reliable detection and diagnostic tools are lacking. To better understand this geographically-isolated plant pathogen, genetic variation as a function of geographic location, host species, and date of isolation was determined for isolates collected over a forty-year period. Discriminant analyses of recently collected and archived isolates using Multi-Locus Sequence Typing (MLST) and Inter-Simple Sequence Repeats (ISSR) identified three populations of R. toxicus; RT-I and RT-II from South Australia and RT-III from Western Australia. Population RT-I, detected in 2013 and 2014 from the Yorke Peninsula in South Australia, is a newly emerged population of R. toxicus not previously reported. Commonly used housekeeping genes failed to discriminate among the R. toxicus isolates. However, strategically selected and genome-dispersed MLST genes representing an array of cellular functions from chromosome replication, antibiotic resistance and biosynthetic pathways to bacterial acquired immunity were discriminative. Genetic variation among isolates within the RT-I population was less than the within-population variation for the previously reported RT-II and RT-III populations. The lower relative genetic variation within the RT-I population and its absence from sampling over the past 40 years suggest its recent emergence. RT-I was the dominant population on the Yorke Peninsula during the 2013–2014 sampling period, perhaps indicating a competitive advantage over the previously detected RT-II population. The potential for introduction of this bacterial plant pathogen into new geographic areas provides a rationale for understanding the ecological and evolutionary trajectories of R. toxicus. PMID:27219107
Evaluating genetic diversity and constructing core collections of Chinese Lentinula edodes cultivars using ISSR and SRAP markers.

PubMed

Liu, Jun; Wang, Zhuo-Ren; Li, Chuang; Bian, Yin-Bing; Xiao, Yang

2015-06-01

Genetic diversity among 89 Chinese Lentinula edodes cultivars was analyzed by inter-simple sequence repeat (ISSR) and sequence-related amplified polymorphism (SRAP) markers. A total of 123 of 126 ISSR loci (97.62%) and 108 of 129 SRAP loci (83.73%) were polymorphic between two or more strains. A dendrogram constructed by cluster analysis based on the ISSR and SRAP markers separated the L. edodes strains into two major groups, of which group B was further divided into five subgroups. Clustering results also correlated positively with the main agronomic traits of the strains: in most cases, strains with similar traits clustered together in the same groups or subgroups. The average coefficient of pairwise genetic similarity was 0.820 (range: 0.576-0.988). Compared to the wild strains, Chinese L. edodes cultivars showed a lower level of genetic diversity. Two preliminary core collections of L. edodes, Core1 and Core2, were established based on the ISSR and SRAP data, respectively. Core1 was constructed by the advanced M (maximization) strategy using the PowerCore version 1.0 software and contained 21 strains, whereas Core2 was created by the allele preferred sampling strategy using the cluster method and contained 18 strains. Both core collections were highly representative of the genetic diversity of the original germplasm, as confirmed by the values of Na (observed number of alleles), Ne (effective number of alleles), H (Nei's gene diversity) and I (Shannon's information index), as well as results of principal coordinate analysis. The loci retention ratio of Core1 (99.61%) was higher than that of Core2 (97.65%). Moreover, Core1 contained strains with more types of agronomic traits than those in Core2. This study lays the basis for further effective protection, management and use of L. edodes germplasm resources. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
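The diversity statistics named in this abstract (Ne, H and I) have simple closed forms for a two-allele locus. The following is a minimal sketch, assuming a dominant marker scored as band present/absent, Hardy-Weinberg equilibrium, and a hypothetical band frequency. The abstract does not state how allele frequencies were estimated, so the function name and inputs are illustrative and not the authors' procedure.

```python
from math import log, sqrt


def diversity_from_band_frequency(band_freq: float) -> dict:
    """Per-locus Ne, H and I for a dominant marker (hypothetical helper).

    band_freq is the fraction of individuals showing the band.
    Assumption: Hardy-Weinberg equilibrium, so individuals lacking the
    band are homozygous for the null allele and q = sqrt(1 - band_freq).
    """
    if not 0.0 <= band_freq <= 1.0:
        raise ValueError("band_freq must lie in [0, 1]")
    q = sqrt(1.0 - band_freq)  # null (band-absent) allele frequency
    p = 1.0 - q                # band-present allele frequency
    homozygosity = p * p + q * q
    # Shannon's index; a fixed allele (frequency 0) contributes nothing.
    shannon = -sum(x * log(x) for x in (p, q) if x > 0.0)
    return {
        "Ne": 1.0 / homozygosity,  # effective number of alleles
        "H": 1.0 - homozygosity,   # Nei's gene diversity
        "I": shannon,              # Shannon's information index
    }


stats = diversity_from_band_frequency(0.75)
print(stats)
```

With a band frequency of 0.75 both allele frequencies are 0.5, which is the most diverse case for two alleles: H reaches its maximum of 0.5 and Ne its maximum of 2.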
Genetic differentiation induced by spaceflight treatment of Cistanche deserticola and identification of inter-simple sequence repeat markers associated with its medicinal constituent contents

NASA Astrophysics Data System (ADS)

Wu, Y.; Yang, D. Y.; Tu, P. F.; Tian, Y. Z.; Guo, Y. H.; Wang, X. M.; Li, X. B.

2011-02-01

The dried, fleshy stems of Cistanche deserticola (Orobanchaceae) are popular tonics in Traditional Chinese Medicine (TCM), used to treat the kidneys' inability to expel excess fluid from the body, which causes fluid retention, and to reform the reproductive system. However, wild C. deserticola plants have become endangered owing to habitat loss and over-harvesting for medicinal use. The present research was carried out for the following purposes: (1) promoting space-breeding research; (2) providing molecular evidence for agricultural selective breeding; and (3) protecting this endangered herbal medicine and conserving its genetic resources. In this study, plants were cultivated from seeds treated in spaceflight for seven days, and sampled to screen positive mutants and identify ISSR markers associated with their medicinal constituents. Nine of the 94 ISSR primers showed high polymorphism, and a total of 118 bands were generated, of which 80 were polymorphic, ranging from 250 to 2600 bp. The spaceflight mutants were found to have a lower coefficient of gene differentiation (Gst = 0.0269) and higher gene flow (Nm = 18.0740) than the controls (Gst = 0.2067 and Nm = 1.9185). The results demonstrated that most of the genetic variation resided within populations (>97%). The Analysis of Molecular Variance (AMOVA) revealed high genetic variation within populations (88.03%) and low genetic differentiation among regions (-18.83%) and populations (30.79%), respectively. The results also indicated a profound difference between spaceflight conditions and those on Earth. The unique vacuum environment of spaceflight was suggested to induce DNA mutation and various variations in C. deserticola. In addition, six particular ISSR markers were identified, cloned and sequenced; one of them, CA41939-934, was found to be positively correlated with acteoside, with a correlation coefficient of 0.264 (P < 0.05). Our work may provide valuable molecular evidence for establishing conservation strategies and space-breeding research programs.
Comparison of old and new wheat cultivars in Iran by measuring germination related traits, osmotic tolerance and ISSR diversity.

PubMed

Ramshini, Hossein; Mirzazadeh, Tahere; Moghaddam, Mohsen Esmaeilzadeh; Amiri, Reza

2016-07-01

A primary concern of modern plant breeding is that genetic diversity has decreased during the past century. This study set out to explore changes in genetic variation during 84 years of breeding by investigating the germination-related traits, inter-simple sequence repeat (ISSR) fingerprinting and osmotic stress tolerance of 30 Iranian wheat (Triticum aestivum L.) cultivars. Seeds were planted under control and osmotic stress (-2, -4 and -6 bar) conditions in three replications. The ISSR experiment was carried out using 32 different primers. Genotypes were divided into two groups (old and new), each containing 15 members. The results of ANOVA showed that highly significant differences existed among genotypes and among growth conditions. The results showed that some traits, such as coleoptile length and seedling vigor index, decreased significantly during breeding. New cultivars had a mean coleoptile length of 33 mm, shorter than that of old cultivars (42 mm), under osmotic stress of -6 bar. Genetic variances of root length, shoot length and seedling vigor index for old cultivars were 1.59, 1.93 and 45,763, respectively, significantly higher than those for new cultivars (0.55, 1.08 and 27,996, respectively). This difference was also verified by the ISSR results, as the polymorphism information content was 0.28 in old cultivars, higher than that of new cultivars (0.26). These results support the claim that genetic diversity has decreased during breeding for many germination-related traits, and breeders should pay more attention to genetic diversity.
High genetic diversity and insignificant interspecific differentiation in Opisthopappus Shih, an endangered cliff genus endemic to the Taihang Mountains of China.

PubMed

Guo, Rongmin; Zhou, Lihua; Zhao, Hongbo; Chen, Fadi

2013-01-01

Opisthopappus Shih is endemic to the Taihang Mountains, China. It grows in cliff crevices and has a fragmented distribution. This genus consists of two species, namely, O. taihangensis (Ling) Shih and O. longilobus Shih, which are both endangered plants in China. This study adopted inter-simple sequence repeat (ISSR) markers to analyze the genetic diversity and genetic structure at different levels (genus, species, and population) in this genus. A total of 253 loci were obtained from 27 primers, 230 of which were polymorphic, giving a proportion of polymorphic bands (PPB) of 90.91% at the genus level. At the species level, both O. taihangensis (PPB = 90.12%, H = 0.1842, and I = 0.289) and O. longilobus (PPB = 95.21%, H = 0.2226, and I = 0.3542) have high genetic diversity. In both species, genetic variation existed mostly within populations, and it was higher in O. longilobus (84.95%) than in O. taihangensis (80.45%). A certain genetic differentiation among populations was found in O. taihangensis (G(st) = 0.2740, Φ(st) = 0.196), whereas genetic differentiation in O. longilobus was very small (G(st) = 0.1034, Φ(st) = 0.151). Differing degrees of gene flow (N(m) = 1.325 and 4.336, respectively) and the mating system may have shaped the existing genetic structures of these two species. Furthermore, the genetic differentiation coefficient between species (G(st) = 0.0453) and the clustering result based on genetic distance showed that interspecific differentiation between O. taihangensis and O. longilobus was not significant and may have arisen recently.
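The gene flow values in this abstract are consistent with the widely used indirect estimate Nm = 0.5(1 - Gst)/Gst. A minimal sketch, assuming that formula (the abstract does not name the estimator, and the function name is hypothetical):

```python
def gene_flow_from_gst(gst: float) -> float:
    """Indirect estimate of gene flow, Nm = 0.5 * (1 - Gst) / Gst.

    Assumption: this is the estimator behind the reported values; the
    abstract itself only gives G(st) and N(m).
    """
    if not 0.0 < gst <= 1.0:
        raise ValueError("Gst must lie in (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# G(st) values reported for the two species in the abstract.
for species, gst in [("O. taihangensis", 0.2740), ("O. longilobus", 0.1034)]:
    print(f"{species}: Gst = {gst:.4f}, Nm = {gene_flow_from_gst(gst):.3f}")
```

Under this formula the reported G(st) values give 1.325 and 4.336, matching the N(m) values in the abstract. Because Nm varies inversely with Gst, small differences in differentiation translate into large differences in estimated gene flow when Gst is low.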
Variability among Capsicum baccatum accessions from Goiás, Brazil, assessed by morphological traits and molecular markers.

PubMed

Martinez, A L A; Araújo, J S P; Ragassi, C F; Buso, G S C; Reifschneider, F J B

2017-07-06

Capsicum peppers are native to the Americas, with Brazil being a significant diversity center. Capsicum baccatum accessions at Instituto Federal (IF) Goiano represent a portion of the species genetic resources from central Brazil. We aimed to characterize a C. baccatum working collection comprising 27 accessions and 3 commercial cultivars using morphological traits and molecular markers to describe its genetic and morphological variability and verify the occurrence of duplicates. This set included 1 C. baccatum var. praetermissum and 29 C. baccatum var. pendulum with potential for use in breeding programs. Twenty-two morphological descriptors, 57 inter-simple sequence repeat, and 34 random amplified polymorphic DNA markers were used. Genetic distance was calculated through the Jaccard similarity index and genetic variability through cluster analysis using the unweighted pair group method with arithmetic mean, resulting in dendrograms for both morphological analysis and molecular analysis. Genetic variability was found among C. baccatum var. pendulum accessions, and the distinction between the two C. baccatum varieties was evident in both the morphological and molecular analyses. The 29 C. baccatum var. pendulum genotypes clustered in four groups according to fruit type in the morphological analysis. They formed seven groups in the molecular analysis, without a clear correspondence with morphology. No duplicates were found. The results describe the genetic and morphological variability, provide a detailed characterization of genotypes, and discard the possibility of duplicates within the IF Goiano C. baccatum L. collection. This study will foment the use of this germplasm collection in C. baccatum breeding programs.
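The Jaccard index used here compares two accessions by the bands they share, ignoring bands absent from both. A minimal sketch with made-up band profiles (1 = band present, 0 = absent); the accession names and data are hypothetical and are not taken from the study:

```python
from typing import Dict, Sequence, Tuple


def jaccard_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Shared bands divided by bands present in at least one profile."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same loci")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Two profiles with no bands at all carry no information; treat as 0.
    return shared / either if either else 0.0


def pairwise_distances(profiles: Dict[str, Sequence[int]]) -> Dict[Tuple[str, str], float]:
    """Jaccard distance (1 - similarity) for every pair of accessions."""
    names = sorted(profiles)
    return {
        (m, n): 1.0 - jaccard_similarity(profiles[m], profiles[n])
        for i, m in enumerate(names)
        for n in names[i + 1:]
    }


# Hypothetical band profiles for three accessions over six loci.
profiles = {
    "acc_A": [1, 1, 0, 1, 0, 1],
    "acc_B": [1, 0, 0, 1, 1, 1],
    "acc_C": [0, 0, 1, 0, 1, 0],
}
for pair, dist in pairwise_distances(profiles).items():
    print(pair, round(dist, 3))
```

A distance table of this kind is the usual input to UPGMA clustering; in SciPy, for example, average linkage (`scipy.cluster.hierarchy.linkage` with `method="average"`) corresponds to UPGMA.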
Genetic divergence through joint analysis of morphoagronomic and molecular characters in accessions of Jatropha curcas.

PubMed

Pestana-Caldas, C N; Silva, S A; Machado, E L; de Souza, D R; Cerqueira-Pereira, E C; Silva, M S

2016-10-05

The aim of this study was to investigate the genetic divergence between accessions of Jatropha curcas through joint analysis of morphoagronomic and molecular characters. To this end, we investigated 11 morphoagronomic characters and performed molecular genotyping, using 23 inter-simple sequence repeat (ISSR) primers in 46 accessions of J. curcas. We calculated the contribution of each character on divergence using analysis of variance. The grouping among accessions was performed using the Ward-MLM (modified location model) method, using morphoagronomic and molecular data, whereas the cophenetic correlation was obtained based on Gower's algorithm. There were significant differences in all growth-related characteristics: number of primary and secondary branches per plant, plant height, and stem diameter. For characters related to grain production, differences were found for number of fruit clusters per plant and number of inflorescence clusters per plant and average number of seeds per fruit. The greatest phenotypic variation was found in plant height (59.67- 222.33 cm), whereas the smallest variation was found in average number of seeds per fruit (0-2.90), followed by the number of fruit clusters per plant (0-8.67). In total, 94 polymorphic ISSR fragments were obtained. The genotypic grouping identified six groups, indicating that there is genetic divergence among the accessions. The most promising crossings for future hybridization were identified among accessions UFRB60 and UFVJC45, and UFRB61 and UFVJC18. In conclusion, the joint analysis of morphoagronomic characters and ISSR markers is an efficient method to assess the genetic divergence in J. curcas.
High efficiency and reliability of inter-simple sequence repeats (ISSR) markers for evaluation of genetic diversity in Brazilian cultivated Jatropha curcas L. accessions.

PubMed

Grativol, Clícia; da Fonseca Lira-Medeiros, Catarina; Hemerly, Adriana Silva; Ferreira, Paulo Cavalcanti Gomes

2011-10-01

Jatropha curcas L. is found in all tropical regions and has garnered a lot of attention for its potential as a source of biodiesel. As J. curcas is still in the process of being domesticated, interest in improving its agronomic traits has increased in an attempt to select more productive varieties, aiming at sustainable utilization of this plant for biodiesel production. Therefore, the study of genetic diversity in different accessions of J. curcas in Brazil constitutes a necessary first step in genetic programs designed to improve this species. In this study we used ISSR markers to assess the genetic variability of 332 accessions from eight states in Brazil that produce J. curcas seeds for commercialization. Seven ISSR primers amplified a total of 21,253 bands, of which 19,472 bands (91%) showed polymorphism. Among the polymorphic bands, 275 rare bands were identified (present in fewer than 15% of the accessions). Polymorphic information content (PIC), marker index (MI) and resolving power (RP) averaged 0.26, 17.86 and 19.87 per primer, respectively, showing the high efficiency and reliability of the markers used. ISSR marker analyses (number of polymorphic loci, genetic diversity, and accession relationships through a UPGMA phenogram and MDS) showed that Brazilian accessions are closely related but have a higher level of genetic diversity than accessions from other countries, and that the accessions from Natal (RN) are the most diverse, having high value as a source of genetic diversity for J. curcas breeding programs worldwide.
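Primer statistics such as PIC and resolving power are computed from the fraction of accessions that show each band. A minimal sketch using commonly cited definitions for dominant markers (PIC = 2f(1 - f) per band; resolving power as the sum of band informativeness, 1 - 2|0.5 - p|). The abstract does not state which formulas the authors applied, so these definitions, the function names and the band frequencies are assumptions for illustration:

```python
from typing import Sequence


def pic_dominant(f: float) -> float:
    """PIC of one dominant band present in fraction f of accessions."""
    return 2.0 * f * (1.0 - f)


def band_informativeness(p: float) -> float:
    """Ib = 1 - 2 * |0.5 - p|; highest when half the accessions show the band."""
    return 1.0 - 2.0 * abs(0.5 - p)


def resolving_power(band_freqs: Sequence[float]) -> float:
    """Resolving power of a primer: sum of Ib over all its bands."""
    return sum(band_informativeness(p) for p in band_freqs)


def mean_pic(band_freqs: Sequence[float]) -> float:
    """Average PIC over the bands amplified by one primer."""
    return sum(pic_dominant(f) for f in band_freqs) / len(band_freqs)


# Hypothetical band frequencies for a single primer.
freqs = [0.5, 0.25, 1.0]
print("Rp =", resolving_power(freqs))
print("mean PIC =", mean_pic(freqs))
```

Under these definitions a monomorphic band (frequency 1.0) contributes nothing to either statistic, which is why both reward primers whose bands split the accessions evenly.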
Loss of genetic connectivity and diversity in urban microreserves in a southern California endemic Jerusalem cricket (Orthoptera: Stenopelmatidae: Stenopelmatus n. sp. "santa monica")

USGS Publications Warehouse

Vandergast, A.G.; Lewallen, E.A.; Deas, J.; Bohonak, A.J.; Weissman, D.B.; Fisher, R.N.

2009-01-01

Microreserves may be useful in protecting native arthropod diversity in urbanized landscapes. However, species that do not disperse through the urban matrix may eventually be lost from these fragments. Population extinctions may be precipitated by an increase in genetic differentiation among fragments and loss of genetic diversity within fragments, and these effects should become stronger with time. We analyzed population genetic structure in the dispersal limited Jerusalem cricket Stenopelmatus n. sp. "santa monica" in the Santa Monica Mountains and Simi Hills north of Los Angeles, California (CA), to determine the impacts of fragmentation over the past 70 years. MtDNA divergence was greater among urban fragments than within contiguous habitat and was positively correlated with fragment age. MtDNA genetic diversity within fragments increased with fragment size and decreased with fragment age. Genetic divergence across 38 anonymous nuclear Inter-Simple Sequence Repeat (ISSR) loci was influenced by the presence of major highways and highway age, but there was no effect of additional urban fragmentation. ISSR diversity was not correlated with fragment size or age. Differing results between markers may be due to male-biased dispersal, or different effective population sizes, sorting rates, or mutation rates among sampled genes. Results suggest that genetic connectivity among populations has been disrupted by highways and urban development, prior to declines in local population sizes. We emphasize that genetic connectivity can rapidly erode in fragmented landscapes and that flightless arthropods can serve as sensitive indicators for these effects. © Springer Science+Business Media B.V. 2008.
Molecular characterization of accessions of Cratylia argentea (Camaratuba) using ISSR markers.

PubMed

Luz, G A; Gomes, S O; Araujo Neto, R B; Nascimento, M S C B; Lima, P S C

2015-11-27

Cratylia argentea (Desv.) Kuntze (Fabaceae) is a drought-tolerant, perennial legume found primarily in Brazil, Bolivia, and Peru. The shrub is well adapted to acid soils and exhibits high productivity and nutritional value, characteristics that would favor its use as a dry season animal forage supplement in semiarid regions. In plant improvement programs, the production of elite hybrids with superior traits is generally achieved by crossing parents that exhibit the highest level of genetic divergence. Therefore, the aim of the present study was to assess genetic diversity among 13 accessions of C. argentea from the same population maintained in the active germplasm bank of Embrapa Meio-Norte using inter-simple sequence repeat (ISSR) markers. Genetic similarities between C. argentea accessions were estimated from Jaccard coefficients, and a dendrogram was constructed using the unweighted pair group method with arithmetic average (UPGMA). The set of 15 primers selected for ISSR analysis generated a total of 313 loci of which 79.23% were polymorphic. The mean number of bands per primer was 20.87, and the amplicons ranged from 280 to 3000 bp in size. Primers UBC834 and UBC827 generated the largest number of polymorphic loci and exhibited 90.91 and 100% polymorphism, respectively. The coefficients of genetic similarity among accessions varied between 0.49 and 0.73. UPGMA cluster analysis allowed the identification of four genotypic groups and demonstrated the existence of considerable variability within the collection. Potential progenitors were selected that would offer good possibilities of obtaining unusual and favorable combinations of genes in a plant breeding program.
Genetic, epigenetic, and HPLC fingerprint differentiation between natural and ex situ populations of Rhodiola sachalinensis from Changbai Mountain, China.

PubMed

Zhao, Wei; Shi, Xiaozheng; Li, Jiangnan; Guo, Wei; Liu, Chengbai; Chen, Xia

2014-01-01

Rhodiola sachalinensis is an endangered species with important medicinal value. We used inter-simple sequence repeat (ISSR) and methylation-sensitive amplified polymorphism (MSAP) markers to analyze genetic and epigenetic differentiation in different populations of R. sachalinensis, including three natural populations and an ex situ population. Chromatographic fingerprint was used to reveal HPLC fingerprint differentiation. According to our results, the ex situ population of R. sachalinensis has higher level genetic diversity and greater HPLC fingerprint variation than natural populations, but shows lower epigenetic diversity. Most genetic variation (54.88%) was found to be distributed within populations, and epigenetic variation was primarily distributed among populations (63.87%). UPGMA cluster analysis of ISSR and MSAP data showed identical results, with individuals from each given population grouping together. The results of UPGMA cluster analysis of HPLC fingerprint patterns was significantly different from results obtained from ISSR and MSAP data. Correlation analysis revealed close relationships among altitude, genetic structure, epigenetic structure, and HPLC fingerprint patterns (R2 = 0.98 for genetic and epigenetic distance; R2 = 0.90 for DNA methylation level and altitude; R2 = -0.95 for HPLC fingerprint and altitude). Taken together, our results indicate that ex situ population of R. sachalinensis show significantly different genetic and epigenetic population structures and HPLC fingerprint patterns. Along with other potential explanations, these findings suggest that the ex situ environmental factors caused by different altitude play an important role in keeping hereditary characteristic of R. sachalinensis.
Genetic variability in selected date palm (Phoenix dactylifera L.) cultivars of United Arab Emirates using ISSR and DAMD markers.

PubMed

Purayil, Fayas T; Robert, Gabriel A; Gothandam, Kodiveri M; Kurup, Shyam S; Subramaniam, Sreeramanan; Cheruth, Abdul Jaleel

2018-02-01

Nine (9) different date palm (Phoenix dactylifera L.) cultivars from the UAE, which differ in their flowering times, were selected to determine the polymorphism and genetic relationships among these cultivars. Hereditary differences and interrelationships were assessed utilizing inter-simple sequence repeat (ISSR) and directed amplification of minisatellite DNA region (DAMD) primers. Analysis of eight DAMD and five ISSR markers produced a total of 113 amplicons, including 99 polymorphic and 14 monomorphic alleles, with a polymorphic percentage of 85.45. The average polymorphic information content for the two marker systems was similar (DAMD, 0.445 and ISSR, 0.459). UPGMA-based clustering of DAMD and ISSR data revealed that the mid-season cultivars Mkh (Khlas) and MB (Barhee) grouped together to form a subcluster in both marker systems. The genetic similarity analysis followed by clustering of the cumulative data from DAMD and ISSR resulted in two major clusters, with two early-season cultivars (ENg and Ekn), two mid-season cultivars (MKh and MB) and one late-season cultivar (Lkhs) in cluster 1, whereas cluster 2 includes two late-season cultivars, one early-season cultivar and one mid-season cultivar. The cluster analysis of both DAMD and ISSR markers revealed that the patterns of variation between some of the tested cultivars were similar in both DNA marker systems. Hence, the present study signifies the applicability of the DAMD and ISSR marker systems in detecting genetic diversity of date palm cultivars flowering in different seasons. This may facilitate the conservation and improvement of date palm cultivars in the future.
Genetic, Epigenetic, and HPLC Fingerprint Differentiation between Natural and Ex Situ Populations of Rhodiola sachalinensis from Changbai Mountain, China

PubMed Central

Zhao, Wei; Shi, Xiaozheng; Li, Jiangnan; Guo, Wei; Liu, Chengbai; Chen, Xia

2014-01-01

Rhodiola sachalinensis is an endangered species with important medicinal value. We used inter-simple sequence repeat (ISSR) and methylation-sensitive amplified polymorphism (MSAP) markers to analyze genetic and epigenetic differentiation in different populations of R. sachalinensis, including three natural populations and an ex situ population. Chromatographic fingerprint was used to reveal HPLC fingerprint differentiation. According to our results, the ex situ population of R. sachalinensis has higher level genetic diversity and greater HPLC fingerprint variation than natural populations, but shows lower epigenetic diversity. Most genetic variation (54.88%) was found to be distributed within populations, and epigenetic variation was primarily distributed among populations (63.87%). UPGMA cluster analysis of ISSR and MSAP data showed identical results, with individuals from each given population grouping together. The results of UPGMA cluster analysis of HPLC fingerprint patterns was significantly different from results obtained from ISSR and MSAP data. Correlation analysis revealed close relationships among altitude, genetic structure, epigenetic structure, and HPLC fingerprint patterns (R2 = 0.98 for genetic and epigenetic distance; R2 = 0.90 for DNA methylation level and altitude; R2 = –0.95 for HPLC fingerprint and altitude). Taken together, our results indicate that ex situ population of R. sachalinensis show significantly different genetic and epigenetic population structures and HPLC fingerprint patterns. Along with other potential explanations, these findings suggest that the ex situ environmental factors caused by different altitude play an important role in keeping hereditary characteristic of R. sachalinensis. PMID:25386983
Investigating genetic diversity and habitat dynamics in Plantago brutia (Plantaginaceae), implications for the management of narrow endemics in Mediterranean mountain pastures.

PubMed

De Vita, A; Bernardo, L; Gargano, D; Palermo, A M; Peruzzi, L; Musacchio, A

2009-11-01

Many factors have contributed to the richness of narrow endemics in the Mediterranean, including long-lasting human impact on pristine landscapes. The abandonment of traditional land-use practices is causing forest recovery throughout the Mediterranean mountains, thereby increasing the reduction and fragmentation of open habitats. We investigated the population genetic structure and habitat dynamics of Plantago brutia Ten., a narrow endemic in mountain pastures of S Italy. Some plants were cultivated in the botanical garden to explore the species' breeding system. Genetic diversity was evaluated based on inter-simple sequence repeat (ISSR) polymorphisms in 150 individuals from most of the known stands. Recent dynamics in the species habitat were checked over a 14-year period. Flower phenology, stigma receptivity and experimental pollinations revealed protogyny and self-incompatibility. With the exception of very small and isolated populations, high genetic diversity was found at the species and population level. AMOVA revealed weak differentiation among populations, and the Mantel test suggested absence of isolation-by-distance. Multivariate analysis of population and genetic data distinguished the populations based on genetic richness, size and isolation. Landscape analyses confirmed recent reduction and isolation of potentially suitable habitats. Low selfing, recent isolation and probable seed exchange may have preserved P. brutia populations from higher loss of genetic diversity. Nonetheless, data related to very small populations suggest that this species may suffer further fragmentation and isolation. To preserve most of the species' genetic richness, future management efforts should consider the large and isolated populations recognised in our analyses.
Analysis of sequence repeats of proteins in the PDB.

PubMed

Mary Rajathei, David; Selvaraj, Samuel

2013-12-01

Internal repeats in protein sequences play a significant role in the evolution of protein structure and function. Applications of different bioinformatics tools help in the identification and characterization of these repeats. In the present study, we analyzed sequence repeats in a non-redundant set of proteins available in the Protein Data Bank (PDB). We used RADAR for detecting internal repeats in a protein, PDBeFOLD for assessing structural similarity, PDBsum for finding functional involvement and Pfam for domain assignment of the repeats in a protein. Through the analysis of sequence repeats, we found that the identity of the sequence repeats falls in the range of 20-40% and that the superimposed structures of most of the sequence repeats maintain similar overall folding. Analysis of sequence repeats at the functional level reveals that most of the sequence repeats are involved in the function of the protein through functionally involved residues in the repeat regions. We also found that sequence repeats in single and two domain proteins often contained conserved sequence motifs for the function of the domain. Copyright © 2013 Elsevier Ltd. All rights reserved.
An improved micropropagation of Arnebia hispidissima (Lehm.) DC. and assessment of genetic fidelity of micropropagated plants using DNA-based molecular markers.

PubMed

Phulwaria, Mahendra; Rai, Manoj K; Shekhawat, N S

2013-07-01

An efficient and improved in vitro propagation method has been developed for Arnebia hispidissima, a medicinally and pharmaceutically important plant species of arid and semiarid regions. Nodal segments (3-4 cm) with two to three nodes obtained from field grown plants were used as explants for shoot proliferation. Murashige and Skoog's (MS) medium supplemented with cytokinins with or without indole-3-acetic acid (IAA) or naphthalene acetic acid was used for shoot multiplication. Out of different PGRs combinations, MS medium containing 0.5 mg l(-1) 6-benzylaminopurine and 0.1 mg l(-1) IAA was optimal for shoot multiplication. On this medium, explants produced the highest number of shoots (47.50 ± 0.38). About 90 % of shoots rooted ex vitro on sterile soilrite under the greenhouse condition when the base (2-4 mm) of shoots was treated with 300 mg l(-1) of indole-3-butyric acid for 5 min. The plantlets were hardened successfully in the greenhouse with 85-90 % survival rate. Random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers were employed to assess the genetic stability of in vitro-regenerated plants of A. hispidissima. Out of 40 (25 RAPD and 15 ISSR) primers screened, 15 RAPD and 7 ISSR primers produced a total number of 111 (77 RAPD and 34 ISSR) reproducible amplicons. The amplified products were monomorphic across all the micropropagated plants and were similar to the mother plant. To the best of our knowledge, it is the first report on the assessment of the genetic fidelity in micropropagated plants of A. hispidissima.
In vitro propagation and assessment of genetic stability of acclimated plantlets of Cornus alba L. using RAPD and ISSR markers.

PubMed

Ilczuk, Agnieszka; Jacygrad, Ewelina

2016-01-01

Cornus alba L. (white dogwood) is an important ornamental shrub having a wide range of applications such as reforestation programs and soil retention systems. The vegetative propagation of dogwood by cuttings may be slow, difficult, and cultivar dependent; therefore, an improved micropropagation method was developed. Nodal stem segments of C. alba cultivars 'Aurea' and 'Elegantissima' were cultured on media enriched with six different sources of macronutrients. Media were supplemented with either N⁶-benzyladenine (BA) or thidiazuron (TDZ) in combination with 1-naphthaleneacetic acid (NAA). Regardless of the cultivar, the best shoot proliferation was observed on Lloyd and McCown medium (woody plant medium (WPM)) at pH 6.2, containing 1.0 mg L⁻¹ BA, 0.1 mg L⁻¹ NAA, and 20-30 g L⁻¹ sucrose. Rooting of regenerated shoots was achieved by an in vitro method when different concentrations of NAA or indole-3-butyric acid (IBA) were tested. Microcuttings were rooted for 8 wk on medium enriched with 0.25 mg L⁻¹ NAA and potted into P9 containers in the greenhouse. The final survival rate of the plants after 20 wk was 80% for 'Aurea' and 90% for 'Elegantissima'. Genetic stability of the micropropagated plants was confirmed by using two DNA-based molecular marker techniques. A total of 30 random amplified polymorphic DNA (RAPD) and 20 inter-simple sequence repeat (ISSR) primers resulted in 197-199 and 184-187 distinct and reproducible band classes, respectively, in 'Aurea' and 'Elegantissima' plantlets. All of the RAPD and ISSR profiles were monomorphic and comparable with the mother plant.
Genetic diversity and structure of Sinopodophyllum hexandrum (Royle) Ying in the Qinling Mountains, China.

PubMed

Liu, Wei; Yin, Dongxue; Liu, Jianjun; Li, Na

2014-01-01

Sinopodophyllum hexandrum is an important medicinal plant whose genetic diversity must be conserved because it is endangered. The Qinling Mts. are a S. hexandrum distribution area that has unique environmental features that highly affect the evolution of the species. To provide the reference data for evolutionary and conservation studies, the genetic diversity and population structure of S. hexandrum in its overall natural distribution areas in the Qinling Mts. were investigated through inter-simple sequence repeats analysis of 32 natural populations. The 11 selected primers generated a total of 135 polymorphic bands. S. hexandrum genetic diversity was low within populations (average He = 0.0621), but higher at the species level (He = 0.1434). Clear structure and high genetic differentiation among populations were detected by using the unweighted pair group method for arithmetic averages, principal coordinate analysis and Bayesian clustering. The clustering approaches supported a division of the 32 populations into three major groups, for which analysis of molecular variance confirmed significant variation (63.27%) among populations. The genetic differentiation may be attributable to the limited gene flow (Nm = 0.3587) in the species. Isolation by distance among populations was assessed by comparing genetic distance with geographic distance using the Mantel test. The result was not significant at the 0.05 level (r = 0.212, P = 0.287), showing that their spatial pattern and geographic locations are not correlated. Given the low within-population genetic diversity, high differentiation among populations and the increasing anthropogenic pressure on the species, in situ conservation measures were recommended to preserve S. hexandrum in the Qinling Mts., and other populations must be sampled to retain as much of the species' genetic diversity as possible, with ex situ preservation as a supplement to in situ conservation.
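The Mantel test mentioned in this abstract correlates two distance matrices and judges significance by permuting the population labels of one of them. A minimal standard-library sketch with made-up 4 x 4 matrices; the data, the one-tailed p-value and the function names are assumptions for illustration and do not reproduce the reported r = 0.212, P = 0.287:

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(m: Matrix, order: Sequence[int]) -> List[float]:
    """Off-diagonal upper-triangle entries after relabelling rows/columns."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic: Matrix, geographic: Matrix,
                permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (r, one-tailed p) for the correlation of two distance matrices."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    # Count the observed arrangement itself as one permutation.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical distances among four populations (km and genetic distance).
geo_km = [[0, 10, 20, 30], [10, 0, 12, 25], [20, 12, 0, 15], [30, 25, 15, 0]]
gen = [[0, 0.10, 0.20, 0.30], [0.10, 0, 0.12, 0.25],
       [0.20, 0.12, 0, 0.15], [0.30, 0.25, 0.15, 0]]
r, p = mantel_test(gen, geo_km)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Permuting whole rows and columns together, rather than shuffling individual entries, preserves the dependence among distances that share a population, which is the reason the Mantel test is used in place of an ordinary correlation test.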
Genetic Diversity and Structure of Sinopodophyllum hexandrum (Royle) Ying in the Qinling Mountains, China

PubMed Central

Liu, Wei; Yin, Dongxue; Liu, Jianjun; Li, Na

2014-01-01

Sinopodophyllum hexandrum is an important medicinal plant whose genetic diversity must be conserved because it is endangered. The Qinling Mts. are a S. hexandrum distribution area that has unique environmental features that highly affect the evolution of the species. To provide the reference data for evolutionary and conservation studies, the genetic diversity and population structure of S. hexandrum in its overall natural distribution areas in the Qinling Mts. were investigated through inter-simple sequence repeats analysis of 32 natural populations. The 11 selected primers generated a total of 135 polymorphic bands. S. hexandrum genetic diversity was low within populations (average He = 0.0621), but higher at the species level (He = 0.1434). Clear structure and high genetic differentiation among populations were detected by using the unweighted pair group method for arithmetic averages, principle coordinate analysis and Bayesian clustering. The clustering approaches supported a division of the 32 populations into three major groups, for which analysis of molecular variance confirmed significant variation (63.27%) among populations. The genetic differentiation may have been attributed to the limited gene flow (Nm = 0.3587) in the species. Isolation by distance among populations was determined by comparing genetic distance versus geographic distance by using the Mantel test. Result was insignificant (r = 0.212, P = 0.287) at 0.05, showing that their spatial pattern and geographic locations are not correlated. Given the low within-population genetic diversity, high differentiation among populations and the increasing anthropogenic pressure on the species, in situ conservation measures were recommended to preserve S. hexandrum in Qinling Mts., and other populations must be sampled to retain as much genetic diversity of the species to achieve ex situ preservation as a supplement to in situ conservation. PMID:25333788
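The Mantel test used in this record correlates two distance matrices and judges significance by shuffling population labels. The following is only a minimal sketch of that idea using the Python standard library; the function names, the one-sided test, the 999 permutations and the toy matrices are all assumptions made for illustration and are not taken from the study.

```python
import random
from math import sqrt


def upper_triangle(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabelling by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, seed=0):
    """One-sided permutation Mantel test; returns (r, p_value)."""
    identity = list(range(len(genetic)))
    geo = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel populations in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= r_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: four populations located at 0, 1, 2 and 3 on a line,
# with genetic distance exactly proportional to geographic distance.
positions = [0.0, 1.0, 2.0, 3.0]
geographic = [[abs(a - b) for b in positions] for a in positions]
genetic = [[0.1 * d for d in row] for row in geographic]
r, p = mantel(genetic, geographic)
```

With real data, a P value above the chosen threshold (as with P = 0.287 here) means the observed correlation is no stronger than what random relabelling of populations often produces.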

Morphometric Differentiation Among Anastrepha fraterculus (Diptera: Tephritidae) Exploiting Sympatric Alternate Hosts.

PubMed

Gómez-Cendra, P V; Paulin, L E; Oroño, L; Ovruski, S M; Vilardi, J C

2016-04-01

Anastrepha fraterculus (Wiedemann) is currently considered a complex of cryptic species infesting fruits from Mexico to Argentina and represents an interesting biological model for evolutionary studies. Moreover, detecting and quantifying behavioral, morphological, and genetic differentiation among populations is relevant to the application of environment-friendly control programs. Here, phenotypic differentiation among individuals coexisting in the wild in a northern region of Argentina was unveiled and associated with host choice. Six morphometric traits were measured in sympatric flies exploiting three different host species. Phenotypic variation was shown to be host-dependent regardless of geographical or temporal overlap. Flies collected from synchronous alternate hosts (peach and walnut) differed from each other despite the lack of geographical isolation. By contrast, flies emerging from guavas, which ripen about two months later than peach and walnut, showed no significant differentiation from flies collected from walnuts, but they differed significantly from flies originating from peaches. This result is consistent with the hypothesis that the same population of flies shifts from walnuts to guavas throughout the year, whereas the population of flies that uses peaches as a host is probably exploiting other alternate hosts when peach availability decreases. Further research is needed to study the underlying mechanism. Results are consistent with previous research using molecular markers (inter-simple sequence repeat, ISSR) on flies stemming from the same hosts and the same area, suggesting that differentiation among flies emerging from alternative hosts occurs at both genetic and phenotypic levels. The contribution of host preference to long-term genetic differentiation is discussed. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved.
Genetic variability within and among populations of an invasive, exotic orchid

PubMed Central

Ueno, Sueme; Rodrigues, Jucelene Fernandes; Alves-Pereira, Alessandro; Pansarin, Emerson Ricardo; Veasey, Elizabeth Ann

2015-01-01

Despite the fact that invasive species are of great evolutionary interest because of their success in colonizing and spreading into new areas, the factors underlying this success often remain obscure. In this sense, studies on population genetics and phylogenetic relationships of invasive species could offer insights into mechanisms of invasions. Originally from Africa, the terrestrial orchid Oeceoclades maculata, considered an invasive plant, is the only species of the genus throughout the Americas. Considering the lack of information on population genetics of this species, the aim of this study was to evaluate the genetic diversity and structure of Brazilian populations of O. maculata. We used 13 inter-simple sequence repeat primers to assess the genetic diversity of 152 individuals of O. maculata distributed in five sampled sites from three Brazilian states (São Paulo, Mato Grosso and Paraná). Low diversity was found within samples, with estimates of the Shannon index (H) ranging from 0.0094 to 0.1054 and estimates of Nei's gene diversity (He) ranging from 0.0054 to 0.0668. However, when evaluated together, the sampling locations showed substantially higher diversity estimates (H = 0.3869, He = 0.2556), and most of the genetic diversity was found among populations (ΦST = 0.933). Both clustering and principal coordinate analysis indicate the existence of five distinct groups, corresponding to the sampled localities, and which were also recovered in the Bayesian analysis. A substructure was observed in one of the localities, suggesting a lack of gene flow even between very small distances. The patterns of genetic structure found in this study may be understood considering the interaction of several probable reproductive strategies with its history of colonization involving possible genetic drift, selective pressures and multiple introductions. PMID:26162896
Molecular and biochemical characterization in Rauvolfia tetraphylla plantlets grown from synthetic seeds following in vitro cold storage.

PubMed

Faisal, Mohammad; Alatar, Abdularhaman A; Hegazy, Ahmad K

2013-01-01

Synseed technology is one of the most important applications of plant biotechnology for in vitro conservation and regeneration of medicinal and aromatic plants. In the present investigation, synseeds of Rauvolfia tetraphylla were produced from in vitro-proliferated shoots by complexation of 3 % sodium alginate with 100 mM CaCl₂. The encapsulated buds were stored at 4, 8, 12, and 16 °C, and high conversion was observed in synseeds stored at 4 °C for 4 weeks. The effect of different medium strengths on the in vitro conversion response of synseeds was evaluated, and the maximum conversion into plantlets (80.6 %) was recorded on half-strength woody plant medium supplemented with 7.5 μM 6-benzyladenine and 2.5 μM α-naphthalene acetic acid after 8 weeks of culture. Plantlets with well-developed shoots and roots were hardened and successfully transplanted to field conditions. After 4 weeks of transfer to ex vitro conditions, the performance of synseed-derived plantlets was evaluated on the basis of physiological and biochemical parameters and compared with that of in vivo-grown plants. Short-term storage of synthetic seeds at low temperature had no negative impact on the physiological and biochemical profile of the plants that survived the storage process. Furthermore, the clonal fidelity of synseed-derived plantlets was assessed and compared with the mother plant using random amplified polymorphic DNA and inter-simple sequence repeat analysis. No changes in molecular profiles were found among the regenerated plantlets, which were comparable to the mother plant, confirming the genetic stability of the clones. This synseed protocol could be useful for in vitro clonal multiplication, conservation, and short-term storage and exchange of germplasm of this antihypertensive drug-producing plant.
Variability and population genetic structure in Achyrocline flaccida (Weinm.) DC., a species with high value in folk medicine in South America.

PubMed

Rosa, Juliana da; Weber, Gabriela Gomes; Cardoso, Rafaela; Górski, Felipe; Da-Silva, Paulo Roberto

2017-01-01

Better knowledge of medicinal plant species and their conservation is an urgent need worldwide. Decisions on conservation strategies can be based on knowledge of the variability and population genetic structure of a species and of the events that may influence these genetic parameters. Achyrocline flaccida (Weinm.) DC. is a native plant of the grassy fields of South America with high value in folk medicine. In spite of its importance, no genetic or conservation studies are available for the species. In this work, microsatellite and ISSR (inter-simple sequence repeat) markers were used to estimate the genetic variability and structure of seven populations of A. flaccida from southern Brazil. The microsatellite markers were inefficient in A. flaccida owing to a high number of null alleles. After the evaluation of 42 ISSR primers on one population, 10 were selected for further analysis of the seven A. flaccida populations. The ISSR results showed that the high number of loci exclusively absent from individual populations might contribute to the inter-population differentiation. Genetic variability of the species was high (Nei's diversity of 0.23 and Shannon diversity of 0.37). AMOVA indicated higher genetic variability within (64.7%) than among (33.96%) populations, and the variability was unevenly distributed (FST 0.33). Gene flow among populations ranged from 1.68 to 5.2 migrants per generation, with an average of 1.39. The results of the PCoA and Bayesian analyses corroborated each other and indicated that the populations are structured. The observed genetic variability and population structure of A. flaccida are discussed in the context of the history of vegetation formation in southern Brazil, as well as possible anthropogenic effects. Additionally, we discuss the implications of the results for the conservation of the species.
In vitro propagation and assessment of the genetic fidelity of Musa acuminata (AAA) cv. Vaibalhla derived from immature male flowers.

PubMed

Hrahsel, Lalremsiami; Basu, Adreeja; Sahoo, Lingaraj; Thangjam, Robert

2014-02-01

An efficient in vitro propagation method has been developed for the first time for Musa acuminata (AAA) cv. Vaibalhla, an economically important banana cultivar of Mizoram, India. Immature male flowers were used as explants. Murashige and Skoog's (MS) medium supplemented with plant growth regulators (PGRs) was used for the regeneration process. Among the different PGR combinations, MS medium supplemented with 2 mg L⁻¹ 6-benzylaminopurine (BAP) + 0.5 mg L⁻¹ α-naphthalene acetic acid (NAA) was optimal for production of white bud-like structures (WBLS). On this medium, explants produced the highest number of buds per explant (4.30). The highest percentage (77.77) and number (3.51) of shoots formed per explant were observed on MS medium supplemented with 2 mg L⁻¹ kinetin + 0.5 mg L⁻¹ NAA, while MS medium supplemented with 2 mg L⁻¹ BAP + 0.5 mg L⁻¹ NAA gave the maximum shoot length (14.44 cm). Rooting efficiency of the shoots was highest on MS basal medium without any PGRs. The plantlets were hardened successfully in the greenhouse with a 96% survival rate. Random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers were employed to assess the genetic stability of in vitro regenerated plantlets of M. acuminata (AAA) cv. Vaibalhla. Of the 40 RAPD and 30 ISSR primers screened initially, eight RAPD and eight ISSR primers were successfully used for the analysis. The amplified products were monomorphic across all the regenerated plants and were similar to those of the mother plant. The present standardised protocol will find application in mass production, conservation and genetic transformation studies of this commercially important banana.
Genetic diversity of Schizolobium parahyba var. amazonicum (Huber ex. Ducke) Barneby, in a forest area in Brazil.

PubMed

Júnior, A L Silva; Souza, L C; Pereira, A G; Caldeira, M V W; Miranda, F D

2017-09-21

Schizolobium parahyba var. amazonicum (Fabaceae) is an arboreal species, endemic to the Amazon Rainforest, popularly known as paricá. It is used on a commercial scale in the timber sector, in pulp and paper production, and in reclamation projects in degraded and landscaped areas. However, no genetically improved material selected for the environmental conditions of the State of Espírito Santo, Brazil, is available. The present study therefore aimed to characterize the genetic diversity in a population of S. amazonicum established in a forest area in the southern region of the State of Espírito Santo, using inter-simple sequence repeat (ISSR) molecular markers. DNA samples from 171 individuals were analyzed using 11 ISSR primers, which generated a total of 136 fragments, of which 79 (58%) were polymorphic. The polymorphic information content calculated for the ISSR markers had a mean of 0.37, classifying them as moderately informative. The number of loci found (N = 79) was greater than that established as the optimal number (N = 69) for the analyses. High genetic diversity was indicated by Nei's genetic diversity (HE = 0.375) and the Shannon index (I = 0.554). The dendrogram based on UPGMA cluster analysis was corroborated by the Bayesian analysis performed with the STRUCTURE program, which indicated the formation of two distinct clusters (K = 2). One group contained the majority of the individuals (153 genotypes) and the second the minority (18 genotypes). The results revealed high genetic diversity in the population of S. amazonicum evaluated in the present study, indicating the potential of the population to be used as an orchard for seed collection and production of seedlings with confirmed genetic variability.
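The summary statistics quoted in this record (percentage of polymorphic bands, Nei's gene diversity and the Shannon index) can be derived from a 0/1 band matrix. The sketch below shows one common way to do this for dominant markers; it assumes Hardy-Weinberg equilibrium so that the null-allele frequency is the square root of the band-absence frequency, and the function name and toy matrix are hypothetical. It is not the software or exact estimator used by the authors. The reported 58% corresponds to 79 polymorphic bands out of 136.

```python
from math import log, sqrt


def dominant_marker_diversity(bands):
    """Summarise a 0/1 band matrix (rows = individuals, columns = loci).

    Assumes Hardy-Weinberg equilibrium, so the null-allele frequency at a
    locus is the square root of the band-absence frequency.
    """
    n_ind = len(bands)
    n_loci = len(bands[0])
    polymorphic = 0
    he_values, shannon_values = [], []
    for locus in range(n_loci):
        present = sum(row[locus] for row in bands)
        if 0 < present < n_ind:
            polymorphic += 1  # band is present in some individuals but not all
        q = sqrt((n_ind - present) / n_ind)  # null ("band absent") allele
        p = 1.0 - q
        he_values.append(1.0 - p * p - q * q)  # Nei's gene diversity per locus
        shannon_values.append(-sum(f * log(f) for f in (p, q) if f > 0))
    return {
        "percent_polymorphic": 100.0 * polymorphic / n_loci,
        "nei_he": sum(he_values) / n_loci,
        "shannon_i": sum(shannon_values) / n_loci,
    }


# Hypothetical toy matrix: 4 individuals scored at 3 loci.
toy = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 0, 0]]
summary = dominant_marker_diversity(toy)
```

Under this two-allele model, per-locus gene diversity cannot exceed 0.5 and the Shannon index cannot exceed ln 2 (about 0.693), which gives some context for the reported values of 0.375 and 0.554.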
Greek PDO saffron authentication studies using species specific molecular markers.

PubMed

Bosmali, I; Ordoudi, S A; Tsimidou, M Z; Madesis, P

2017-10-01

Saffron, the spice produced from the red stigmas of the flower of Crocus sativus L. is a frequent target of fraud and mislabeling practices that cannot be fully traced using the ISO 3632 trade standard specifications and test methods. A molecular approach is proposed herein as a promising branding strategy for the authentication of highly esteemed saffron brands such as the Greek Protected Designation of Origin (PDO) "Krokos Kozanis". Specific ISSR (inter-simple sequence repeat) markers were used to assess for the first time, the within species variability of several populations of C. sativus L. from the cultivation area of "Krokos Kozanis" as well as the potential differences with the band pattern produced by other Crocus species. Then, species-specific markers were developed taking advantage of an advanced molecular technique such as the HRM analysis coupled with universal DNA barcoding regions (trnL) (Bar-HRM) and applied to saffron admixtures with some of the most common plant adulterants (Calendula officinalis, Carthamus tinctorius, Gardenia jasminoides, Zea mays and Curcuma longa). The sensitivity of the procedure was tested for turmeric as a case study whereas HPLC-fluorescence determination of secondary metabolites was also employed for comparison. The overall results indicated that the Bar-HRM approach is quite effective in terms of specificity and sensitivity. Its effectiveness regarding the detection of turmeric was comparable to that of a conventional HPLC method (0.5% vs 1.0%, w/w). Yet, the proposed DNA-based method is much faster, cost-effective and can be used even by non-geneticists, in any laboratory having access to an HRM-capable real-time PCR instrumentation. It can be, thus, regarded as a strong analytical tool in saffron authentication studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
Population genetic variation in the tree fern Alsophila spinulosa (Cyatheaceae): effects of reproductive strategy.

PubMed

Wang, Ting; Su, Yingjuan; Li, Yuan

2012-01-01

Essentially all ferns can perform both sexual and asexual reproduction. Their populations represent suitable study objects to test the population genetic effects of different reproductive systems. Using the diploid homosporous fern Alsophila spinulosa as an example species, the main purpose of this study was to assess the relative impact of sexual and asexual reproduction on the level and structure of population genetic variation. Inter-simple sequence repeats analysis was conducted on 140 individuals collected from seven populations (HSG, LCH, BPC, MPG, GX, LD, and ZHG) in China. Seventy-four polymorphic bands discriminated a total of 127 multilocus genotypes. Character compatibility analysis revealed that 50.0 to 70.0% of the genotypes had to be deleted in order to obtain a tree-like structure in the data set from populations HSG, LCH, MPG, BPC, GX, and LD; and there was a gradual decrease of conflict in the data set when genotypes with the highest incompatibility counts were successively deleted. In contrast, in population ZHG, only 33.3% of genotypes had to be removed to achieve complete compatibility in the data set, which showed a sharp decline in incompatibility upon the deletion of those genotypes. All populations examined possessed similar levels of genetic variation. Population ZHG was not found to be more differentiated than the other populations. Sexual recombination is the predominant source of genetic variation in most of the examined populations of A. spinulosa. However, somatic mutation contributes most to the genetic variation in population ZHG. This change of the primary mode of reproduction does not cause a significant difference in the population genetic composition. Character compatibility analysis represents an effective approach to separate the role of sexual and asexual components in shaping the genetic pattern of fern populations.
Sequence repeats and protein structure

NASA Astrophysics Data System (ADS)

Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos

2012-11-01

Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
Antimicrobial Potential, Identification and Phylogenetic Affiliation of Wild Mushrooms from Two Sub-Tropical Semi-Evergreen Indian Forest Ecosystems.

PubMed

Lallawmsanga; Passari, Ajit Kumar; Mishra, Vineet Kumar; Leo, Vincent Vineeth; Singh, Bhim Pratap; Valliammai Meyyappan, Geetha; Gupta, Vijai Kumar; Uthandi, Sivakumar; Upadhyay, Ramesh Chandra

2016-01-01

The diversity of wild mushrooms was investigated in two protected forest areas in India, and 231 mushroom specimens were morphologically identified. Among them, 76 isolates were screened for their antimicrobial potential against seven bacterial and fungal pathogens. Of these 76 isolates, the 45 that displayed significant antimicrobial activity were identified using ITS rRNA gene amplification and subsequently characterized phylogenetically using random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers. Sequencing of the ITS rRNA region classified the isolates into 16 genera belonging to 11 families. In total, 11 RAPD and 10 ISSR primers were selected to evaluate genetic diversity based on the banding profiles they produced. In total, 337 RAPD and 312 ISSR bands were detected, with the percentage of polymorphism ranging from 34.2% to 78.8% for RAPD primers and from 38.6% to 92.4% for ISSR primers. The Unweighted Pair-Group Method with Arithmetic Mean (UPGMA) trees from the two methods were structured similarly, grouping the 46 isolates into two clusters that clearly showed a significant genetic distance among the different strains of wild mushroom, with similarity coefficients ranging from 0.58 to 1.00 for RAPD and from 0.59 to 1.00 for ISSR analysis. This report has highlighted that both the DTR and MNP forests provide a habitat for diverse macrofungal species and therefore have the potential to be used for the discovery of antimicrobials. The report has also demonstrated that both RAPD and ISSR could efficiently differentiate wild mushrooms and could thus be considered efficient markers for surveying genetic diversity. Additionally, six selected wild edible mushroom strains (Schizophyllum commune BPSM01, Panus giganteus BPSM27, Pleurotus sp. BPSM34, Lentinus sp. BPSM37, Pleurotus djamor BPSM41 and Lentinula sp. BPSM45) were analysed for their nutritional content (proteins, carbohydrates, fat and ash) and antioxidant potential. The present findings also suggested that the wild edible mushroom strains not only have nutritional value but can also be used as an accessible source of natural antioxidants. PMID:27902725
Comparison of simple sequence repeats in 19 Archaea.

PubMed

Trivedi, S

2006-12-05

All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
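As a rough illustration of what scanning a sequence for di- to penta-nucleotide repeats involves, the sketch below finds perfect repeats with a regular expression. It is not the SPUTNIK algorithm used in the study; the minimum of three motif copies, the exclusion of single-base runs and the toy sequence are assumptions chosen for this example.

```python
import re

# A motif of 2-5 bases followed by at least two further copies of itself,
# i.e. three or more tandem copies in total. The lazy quantifier makes the
# regex prefer the shortest repeating unit.
SSR_PATTERN = re.compile(r"([ACGT]{2,5}?)\1{2,}")


def find_ssrs(sequence):
    """Yield (start, motif, copies) for perfect di- to penta-nucleotide repeats."""
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        yield match.start(), motif, len(match.group(0)) // len(motif)


def ssr_density(sequence):
    """Number of SSRs per kilobase of sequence."""
    return 1000.0 * sum(1 for _ in find_ssrs(sequence)) / len(sequence)


# Hypothetical toy sequence containing one CG repeat and one GCA repeat.
toy_sequence = "TTGA" + "CG" * 4 + "TTT" + "GCA" * 4 + "TT"
hits = list(find_ssrs(toy_sequence))
```

Running the same scan separately on coding and non-coding coordinates would give the kind of per-region counts compared in this record, although imperfect repeats, which dedicated tools can score, are ignored here.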
[Mutation Analysis of 19 STR Loci in 20 723 Cases of Paternity Testing].

PubMed

Bi, J; Chang, J J; Li, M X; Yu, C Y

2017-06-01

The aim was to observe and analyze confirmed cases of paternity testing and to explore the mutation patterns of STR loci. Mutant STR loci were screened from 20 723 confirmed cases of paternity testing using the Goldeneye 20A system. The mutation rates and the sources, fragment lengths, step numbers and gains or losses of repeat units of the mutant alleles were counted to characterize mutation-related factors. A total of 548 mutations were found on 19 STR loci, and 557 mutation events were observed. The locus mutation rate was 0.07‰-2.23‰. The ratio of paternal to maternal mutation events was 3.06:1. One-step mutations were predominant, and among them gains of repeat units were almost as frequent as losses. In mutations of two or more steps, repeat units were more likely to be lost. Mutations mainly occurred in medium-length alleles, in which gains of repeat units were almost as frequent as losses. In long-allele mutations, losses of repeat units were significantly more frequent than gains. Gains and losses of repeat units were almost equally frequent in paternal mutations, whereas losses outnumbered gains in maternal mutations. There are significant differences in mutation rate among loci. When one or two loci do not conform to the laws of inheritance, an additional detection system should be used, and the PI value should be calculated taking the mutant STR loci into account in order to further clarify the identification opinion. Copyright© by the Editorial Department of Journal of Forensic Medicine
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats

PubMed Central

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-01-01

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. PMID:26481363
Sequences characterization of microsatellite DNA sequences in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Akihiro, Kijima

2007-01-01

A microsatellite-enriched library was constructed using the magnetic bead hybridization selection method, and microsatellite DNA sequences were analyzed in the Pacific abalone Haliotis discus hannai. Three hundred and fifty white colonies were screened using a PCR-based technique, and 84 clones were identified as potentially containing a microsatellite repeat motif. The 84 clones were sequenced, and 42 microsatellites and 4 minisatellites with a minimum of five repeats were found (13.1% of the white colonies screened). Besides the CA motif contained in the oligoprobe, we also found 16 other types of microsatellite repeats, including one dinucleotide repeat, two tetranucleotide repeats, twelve pentanucleotide repeats and one hexanucleotide repeat. According to Weber (1990), the microsatellite sequences obtained could be categorized structurally into perfect repeats (73.3%), imperfect repeats (13.3%), and compound repeats (13.4%). Among the microsatellite repeats, relatively short arrays (<20 repeats) were the most abundant, accounting for 75.0%. The longest microsatellite had 48 repeats, and the average number of repeats was 13.4. The data on the composition and length distribution of microsatellites obtained in the present study can be useful for choosing repeat motifs for microsatellite isolation in other abalone species.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

PubMed

Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

2013-01-30

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

PubMed Central

2013-01-01

Background: Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results: Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions: While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705
Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm

PubMed Central

Glunčić, Matko; Paar, Vladimir

2013-01-01

The main feature of global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of complete set of a K-string ensemble which enables a new method of direct mapping of symbolic DNA sequence into frequency domain, with straightforward identification of repeats as peaks in GRM diagram. In this way, we obtain very fast, efficient and highly automatized repeat finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use, in order to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of Period 3 pattern in exons, implementation of ‘magnifying glass’ effect, identification of complex HOR pattern, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes). PMID:22977183
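The idea of reading repeat periods off as peaks can be illustrated with a much simpler stand-in than GRM itself. The sketch below is not the GRM algorithm: it only histograms the distance between consecutive occurrences of each k-mer and reports the most frequent distance as the likely repeat period. The function names, the choice of k = 8, and the 20 bp toy monomer are all assumptions made for illustration.

```python
from collections import Counter

def kmer_spacing_spectrum(seq, k=8):
    """Histogram of distances between consecutive occurrences of each k-mer."""
    last_seen = {}          # k-mer -> start position of its most recent occurrence
    spectrum = Counter()    # distance -> number of times that distance was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum

def dominant_period(seq, k=8):
    """Return the most frequent k-mer spacing, or None if no k-mer recurs."""
    spectrum = kmer_spacing_spectrum(seq, k)
    return spectrum.most_common(1)[0][0] if spectrum else None

# Hypothetical 20 bp monomer repeated six times between short unique flanks.
unit = "ACGTTGCAAGGCTTAACCGT"
toy = "GATTACA" + unit * 6 + "TTTTCCCC"
print(dominant_period(toy))
```

Exact k-mer matching like this breaks down quickly once substitutions and indels accumulate, which is the situation the abstract says GRM is designed to tolerate.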
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, and donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats.
The numbers of potential functional sites varied pronouncedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
Variation, Repetition, And Choice

PubMed Central

Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed

Rehm, Charlotte; Wurmthaler, Lena A; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1-5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6-9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria.
Investigation of a Quadruplex-Forming Repeat Sequence Highly Enriched in Xanthomonas and Nostoc sp.

PubMed Central

Rehm, Charlotte; Wurmthaler, Lena A.; Li, Yuanhao; Frickey, Tancred; Hartig, Jörg S.

2015-01-01

In prokaryotes simple sequence repeats (SSRs) with unit sizes of 1–5 nucleotides (nt) are causative for phase and antigenic variation. Although an increased abundance of heptameric repeats was noticed in bacteria, reports about SSRs of 6–9 nt are rare. In particular G-rich repeat sequences with the propensity to fold into G-quadruplex (G4) structures have received little attention. In silico analysis of prokaryotic genomes shows putative G4 forming sequences to be abundant. This report focuses on a surprisingly enriched G-rich repeat of the type GGGNATC in Xanthomonas and cyanobacteria such as Nostoc. We studied in detail the genomes of Xanthomonas campestris pv. campestris ATCC 33913 (Xcc), Xanthomonas axonopodis pv. citri str. 306 (Xac), and Nostoc sp. strain PCC7120 (Ana). In all three organisms repeats are spread all over the genome with an over-representation in non-coding regions. Extensive variation of the number of repetitive units was observed with repeat numbers ranging from two up to 26 units. However, a clear preference for four units was detected. The strong bias for four units coincides with the requirement of four consecutive G-tracts for G4 formation. Evidence for G4 formation of the consensus repeat sequences was found in biophysical studies utilizing CD spectroscopy. The G-rich repeats are preferably located between aligned open reading frames (ORFs) and are under-represented in coding regions or between divergent ORFs. The G-rich repeats are preferentially located within a distance of 50 bp upstream of an ORF on the anti-sense strand or within 50 bp from the stop codon on the sense strand. Analysis of whole transcriptome sequence data showed that the majority of repeat sequences are transcribed. The genetic loci in the vicinity of repeat regions show increased genomic stability. In conclusion, we introduce and characterize a special class of highly abundant and wide-spread quadruplex-forming repeat sequences in bacteria. PMID:26695179
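A census of GGGNATC runs of the kind described above could be sketched with a regular expression. This is only an illustrative sketch, not the authors' pipeline: the function names and the toy sequence are assumptions, the minimum of two units follows the lowest repeat number quoted in the abstract, and only the given strand is scanned (a real analysis would also scan the reverse complement).

```python
import re
from collections import Counter

# One repeat unit: GGG, any base, then ATC (the GGGNATC type named in the abstract).
UNIT = r"GGG[ACGT]ATC"
# A run is two or more back-to-back units.
RUN = re.compile(rf"(?:{UNIT}){{2,}}")

def tandem_runs(seq):
    """Return (start, unit_count) for every run of >= 2 consecutive units."""
    seq = seq.upper()
    return [(m.start(), len(m.group()) // 7) for m in RUN.finditer(seq)]

def unit_count_histogram(seq):
    """Count how many runs contain each number of units."""
    return Counter(n for _, n in tandem_runs(seq))

# Hypothetical toy sequence: one run of four units and one run of two units.
toy = "TTGGGAATCGGGCATCGGGTATCGGGAATCAAGGGTATCGGGTATCTT"
print(tandem_runs(toy))
print(unit_count_histogram(toy))
```

Applied to a whole genome, the histogram is where a preference for a particular unit count, such as the four-unit bias reported above, would become visible.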
Assessment of genetic fidelity in Rauvolfia serpentina plantlets grown from synthetic (encapsulated) seeds following in vitro storage at 4 °C.

PubMed

Faisal, Mohammad; Alatar, Abdulrahman A; Ahmad, Naseem; Anis, Mohammad; Hegazy, Ahmad K

2012-05-03

An efficient method was developed for plant regeneration and establishment from alginate-encapsulated synthetic seeds of Rauvolfia serpentina. Synthetic seeds were produced using in vitro proliferated microshoots upon complexation of 3% sodium alginate prepared in Lloyd and McCown woody plant medium (WPM) and 100 mM calcium chloride. Re-growth ability of encapsulated nodal segments was evaluated after storage at 4 °C for 0, 1, 2, 4, 6 and 8 weeks and compared with non-encapsulated buds. The effects of different media, viz. Murashige and Skoog medium, Lloyd and McCown woody plant medium, Gamborg’s B5 medium and Schenk and Hildebrandt medium, on conversion into plantlets were also investigated. The maximum frequency of conversion into plantlets from encapsulated nodal segments stored at 4 °C for 4 weeks was achieved on woody plant medium supplemented with 5.0 μM BA and 1.0 μM NAA. Rooting in plantlets was achieved in half-strength Murashige and Skoog liquid medium containing 0.5 μM indole-3-acetic acid (IAA) on filter paper bridges. Plantlets obtained from stored synseeds were hardened, established successfully ex vitro and were morphologically similar to each other as well as to their mother plant. The genetic fidelity of Rauvolfia clones raised from synthetic seeds following four weeks of storage at 4 °C was assessed using random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers. All the RAPD and ISSR profiles from generated plantlets were monomorphic and comparable to the mother plant, which confirms the genetic stability among the clones. This synseed protocol could be useful for establishing a particular system for conservation, short-term storage and production of genetically identical and stable plants before they are released for commercial purposes.
Isozyme, ISSR and RAPD profiling of genotypes in marvel grass (Dichanthium annulatum).

PubMed

Saxena, Raghvendra; Chandra, Amaresh

2010-11-01

Genetic analysis of 30 accessions of marvel grass (Dichanthium annulatum Forsk.), a tropical range grass collected from grasslands and open fields of drier regions, was carried out with the objectives of identifying unique materials that could be used in developing the core germplasm for such regions as well as exploring gene(s) for drought tolerance. Five inter-simple sequence repeat (ISSR) primers [(CA)4, (AGAC), (GACA)4], 27 random amplified polymorphic DNA (RAPD) primers and four enzyme systems were employed in the present study. In total, ISSR yielded 61 (52 polymorphic), RAPD 269 (253 polymorphic) and isozymes 55 (44 polymorphic) bands. The average polymorphic information content (PIC) and marker index (MI) across all polymorphic bands of the three marker systems ranged from 0.419 to 0.480 and 4.34 to 5.25, respectively. Dendrogram analysis revealed three main clusters with all three markers. Four enzymes, namely esterase (EST), polyphenoloxidase (PPO), peroxidase (PRX) and superoxide dismutase (SOD), revealed 55 alleles from a total of 16 enzyme-coding loci. Of these, 14 loci and 44 alleles were polymorphic. The mean number of alleles per locus was 3.43. Mean heterozygosity observed among the polymorphic loci ranged from 0.406 (SOD) to 0.836 (EST) and accession-wise from 0.679 (1G3108) to 0.743 (IGKMD-10). Though a few accessions from one agro-climatic region intermixed with those of another, accessions largely grouped by their regions of collection. Bootstrap analysis at 1000 iterations also showed large numbers of nodes (11 to 17) having strong clustering (bootstrap values > 50) in all three marker systems. The accessions of the arid and drier regions forming one cluster are assigned as a distinct core collection of Dichanthium and can be targeted for isolation of gene(s) for drought tolerance. Variations in isozyme allele numbers and high PIC (0.48) and MI (4.98) as observed with ISSR markers indicated their usefulness for germplasm characterization.
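The band counts above translate directly into percentages of polymorphic bands, and a per-band PIC can be sketched as well. The abstract does not say which PIC formula was used; the code assumes the form commonly applied to dominant markers, PIC = 2f(1 - f) with f the fraction of accessions showing the band, which has a ceiling of 0.5 and is at least consistent with the reported range of 0.419 to 0.480. The function names and the toy band vector are hypothetical.

```python
def percent_polymorphic(polymorphic, total):
    """Percentage of scored bands that are polymorphic, to one decimal place."""
    return round(100.0 * polymorphic / total, 1)

def band_pic(column):
    """PIC of one dominant band, assuming PIC = 2 f (1 - f).

    column is a list of 0/1 scores (band absent/present), one per accession.
    """
    f = sum(column) / len(column)
    return 2.0 * f * (1.0 - f)

# (polymorphic, total) band counts as reported in the abstract.
counts = {"ISSR": (52, 61), "RAPD": (253, 269), "isozyme": (44, 55)}
for name, (poly, total) in counts.items():
    print(name, percent_polymorphic(poly, total))

# Hypothetical band present in half of eight accessions.
toy_band = [1, 1, 0, 1, 0, 0, 1, 0]
print(band_pic(toy_band))
```

With the reported counts this gives 85.2% polymorphic bands for ISSR, 94.1% for RAPD and 80.0% for isozymes.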
Population Genetic Effects of Urban Habitat Fragmentation in the Perennial Herb Viola pubescens (Violaceae) using ISSR Markers

PubMed Central

Culley, Theresa M.; Sbita, Sarah J.; Wick, Anne

2007-01-01

Background and Aims: Fragmentation of natural habitats can negatively impact plant populations by leading to reduced genetic variation and increased genetic distance as populations become geographically and genetically isolated from one another. To test whether such detrimental effects occur within an urban landscape, the genetic structure of six populations of the perennial herb Viola pubescens was characterized in the metropolitan area of Greater Cincinnati in southwestern Ohio, USA. Methods: Using three inter-simple sequence repeat (ISSR) markers, 51 loci amplified across all urban populations. For reference, four previously examined agricultural populations in central/northern Ohio and a geographically distant population in Michigan were also included in the analysis. Key Results: Urban populations retained high levels of genetic variation (percentage of polymorphic loci, Pp = 80·7 %) with similar genetic distances among populations and an absence of unique alleles. Geographic and genetic distances were correlated with one another, and all populations grouped according to region. Individuals from urban populations clustered together and away from individuals from agricultural populations and from the Michigan population in a principal coordinates analysis. Hierarchical analysis of molecular variance (AMOVA) revealed that most of the genetic variability was partitioned within populations (69·1 %) and among groups (22·2 %) of southwestern Ohio, central/northern Ohio and Michigan groups. Mean Fst was 0·308, indicating substantial population differentiation. Conclusions: It is concluded that urban fragmentation does not appear to impede gene flow in V. pubescens in southwestern Ohio. These results are consistent with life history traits of this species and the possibility of high insect abundance in urban habitats due to diverse floral resources and nesting sites.
Combined with the cleistogamous breeding system of this species, pollinator availability in the urban matrix may buffer populations against detrimental effects of habitat fragmentation, at least in larger forest fragments. Consequently, it may be inappropriate to generalize about genetic effects of fragmentation across landscapes or even across plant species with different pollination systems. PMID:17556381
Recovery patterns, histological observations and genetic integrity in Malus shoot tips cryopreserved using droplet-vitrification and encapsulation-dehydration procedures.

PubMed

Li, Bai-Quan; Feng, Chao-Hong; Wang, Min-Rui; Hu, Ling-Yun; Volk, Gayle; Wang, Qiao-Chun

2015-11-20

A droplet-vitrification procedure is described for cryopreservation of Malus shoot tips. Survival patterns, recovery types, histological observations, and genetic integrity were compared for Malus shoot tips cryopreserved using this droplet-vitrification procedure and an encapsulation-dehydration procedure that was previously reported by us. In both procedures, three types of shoot tip recovery were observed following cryopreservation: callus formation without shoot regrowth, leaf formation without shoot regrowth, and shoot regrowth. Three categories of histological observations were also identified in cross-sections of shoot tips recovered after cryopreservation using the two cryogenic procedures. In category 1, almost all of the cells (94-95%) in the apical dome (AD) were damaged or killed and only some cells (30-32%) in the leaf primordia (LPs) survived. In category 2, only a few cells (18-20%) in the AD and some cells (30-31%) in the LPs survived. In category 3, majority of the cells (60-62%) in the AD and some cells (30-33%) in the LPs survived. These data suggest that shoot regrowth is correlated to the presence of a majority of surviving cells in the AD after liquid nitrogen exposure. No polymorphic bands were detected by inter-simple sequence repeats or by random amplified polymorphic DNA assessments, and ploidy levels analyzed by flow cytometry were unchanged when plants recovered after cryoexposure were compared to controls. The droplet-vitrification procedure appears to be robust since seven genotypes representing four Malus species and one hybrid recovered shoots following cryopreservation. Mean shoot regrowth levels of these seven genotypes were 48% in the droplet-vitrification method, which were lower than those (61%) in the encapsulation-dehydration procedure reported in our previous study, suggesting the latter may be preferred for routine cryobanking applications for Malus shoot tips. Copyright © 2015 Elsevier B.V. All rights reserved.
Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

PubMed

Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

2017-01-01

Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Known for the beneficial characteristics they endow upon their grass hosts, the identification of these endophyte species has been of great interest agronomically and scientifically. The use of simple sequence repeat loci and the variation in repeat elements has been used to rapidly identify endophyte species and strains, however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.
TRAP: automated classification, quantification and annotation of tandemly repeated sequences.

PubMed

Sobreira, Tiago José P; Durham, Alan M; Gruber, Arthur

2006-02-01

TRAP, the Tandem Repeats Analysis Program, is a Perl program that provides a unified set of analyses for the selection, classification, quantification and automated annotation of tandemly repeated sequences. TRAP uses the results of the Tandem Repeats Finder program to perform a global analysis of the satellite content of DNA sequences, permitting researchers to easily assess the tandem repeat content for both individual sequences and whole genomes. The results can be generated in convenient formats such as HTML and comma-separated values. TRAP can also be used to automatically generate annotation data in the format of feature table and GFF files.
Regions of conservation and divergence in the 3' untranslated sequences of genomic RNA from Ross River virus isolates.

PubMed

Faragher, S G; Dalgarno, L

1986-07-20

The 3' untranslated (UT) sequences of the genomic RNAs of five geographic variants of the alphavirus Ross River virus (RRV) were determined and compared with the 3' UT sequence of RRV T48, the prototype strain. Part of the 3' UT region of Getah virus, a close serological relative of RRV, was also sequenced. The RRV 3' UT region varies markedly in length between variants. Large deletions or insertions, sequence rearrangements and single nucleotide substitutions are observed. A sequence tract of 49 to 58 nucleotides, which is repeated as four blocks in the RRV T48 3' UT region, occurs only once in the 3' UT region of one RRV strain (NB5092), indicating that the existence of repeat sequence blocks is not essential for RRV replication. However, the precise sequence of the 3' proximal copy of the repeat block and its position relative to the poly(A) tail were identical in all RRV isolates examined, suggesting that it has an important role in RRV replication. Nucleotide substitutions between RRV variants are distributed non-randomly along the length of the 3' UT region. The sequence of 120 to 130 nucleotides adjacent to the poly(A) tail is strongly conserved. Getah virus RNA contains three repeat sequence blocks in the 3' UT region. These are similar in sequence to those in RRV RNA but differ in their arrangement. Homology between the RRV and Getah 3' UT sequences is greatest in the 3' proximal repeat sequence block that shows three differences in 49 nucleotides. The 3' proximal repeat in Getah RNA occurs at the same position, relative to the poly(A) tail, as in all RRV variants. The RRV and Getah virus 3' UT sequences show extensive homology in the region between the 3' proximal repeat and the poly(A) tail but, apart from the repeat blocks themselves, they show no significant homology elsewhere.
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Leaf crinkle disease in urdbean (Vigna mungo L. Hepper): An overview on causal agent, vector and host.

PubMed

Gautam, Narinder Kumar; Kumar, Krishna; Prasad, Manoj

2016-05-01

Urdbean leaf crinkle disease (ULCD) is an economically significant, widespread and devastating disease that causes extreme crinkling, puckering and rugosity of leaves and inflicts heavy yield losses annually in major urdbean-producing countries of the world. This disease is caused by urdbean leaf crinkle virus (ULCV). Urdbean (Vigna mungo L. Hepper) is relatively more susceptible than other pulses to leaf crinkle disease. Urdbean is an important and useful crop cultivated in various parts of South-East Asia and well adapted for cultivation under semi-arid and subtropical conditions. Aphids, insects and whiteflies have been reported as vectors of the disease. The virus is also transmitted through sap inoculation, grafting and seed. The loss in seed yield in ULCD-affected urdbean crops ranges from 35 to 81%, depending on genotype, location and infection time. Diseased material and favourable climatic conditions contribute to the spread of the viral disease. Anatomical and biochemical changes take place in the affected plants. Genetic variations have been reported in germplasm screening, which suggests continuous screening of available varieties and new germplasm to search for new traits (new genes) and identify new sources of disease resistance. There are very few reports on breeding programmes for the development and release of varieties tolerant to ULCD. Mostly random amplified polymorphic DNA (RAPD) as well as inter-simple sequence repeat (ISSR) molecular markers have been utilized for fingerprinting of blackgram, and there are a few reports on sequence-tagged micro-satellite site (STMS) markers. Many RNA viruses have also developed strategies to counteract the silencing process by encoding suppressor proteins that hinder it.
But, in the case of ULCV, there is no report available indicating which defence pathway is operating for its resistance in the plants and whether same silencing suppression strategy is also followed by this virus causing leaf crinkle disease in urdbean. The antiviral principles (AVP) present in leaf extracts of several plants are known to inhibit infection by many viruses. Many chemicals have been reported as inhibitors of virus replication in plants. Raising the barrier crops also offers an effective solution to control the spread of virus.
Methods for sequencing GC-rich and CCT repeat DNA templates

DOEpatents

Robinson, Donna L.

2007-02-20

The present invention is directed to a PCR-based method of cycle sequencing DNA and other polynucleotide sequences having high CG content and regions of high GC content, and includes for example DNA strands with a high Cytosine and/or Guanosine content and repeated motifs such as CCT repeats.
The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

Treesearch

A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

2000-01-01

Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...
Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...
[Bioinformatics Analysis of Clustered Regularly Interspaced Short Palindromic Repeats in the Genomes of Shigella].

PubMed

Wang, Pengfei; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Wang, Linlin; Guo, Xiangjiao; Yang, Haiyan; Xi, Yuanlin

2015-04-01

This study aimed to explore the features of clustered regularly interspaced short palindromic repeat (CRISPR) structures in Shigella using bioinformatics. We used bioinformatics methods, including BLAST, alignment and RNA structure prediction, to analyze the CRISPR structures of Shigella genomes. The results showed that CRISPRs existed in the four groups of Shigella, and that the flanking sequences of upstream CRISPRs could be classified into the same group as those of the downstream CRISPRs. We also found some relatively conserved palindromic motifs in the leader sequences. Repeat sequences fell into the same group as their corresponding flanking sequences, and could be classified into two types by their RNA secondary structures, which contain a "stem" and a "ring". Some spacers were found to be homologous to partial sequences of plasmids or phages. The study indicated that there were correlations between repeat sequences and flanking sequences, and that the repeats might act as a kind of recognition mechanism to mediate the interaction between foreign genetic elements and Cas proteins.
A novel species-specific tandem repeat DNA family from Sinapis arvensis: detection of telomere-like sequences.

PubMed

Kapila, R; Das, S; Srivastava, P S; Lakshmikumaran, M

1996-08-01

DNA sequences representing a tandemly repeated DNA family of the Sinapis arvensis genome were cloned and characterized. The 700-bp tandem repeat family is represented by two clones, pSA35 and pSA52, which are 697 and 709 bp in length, respectively. Dot matrix analysis of the sequences indicates the presence of repeated elements within each monomeric unit. Sequence analysis of the repetitive region of clones pSA35 and pSA52 shows that there are several copies of a 7-bp repeat element organized in tandem. The consensus sequence of this repeat element is 5'-TTTAGGG-3'. These elements are highly mutated and the difference in length between the two clones is due to different copy numbers of these elements. The repetitive region of clone pSA35 has 26 copies of the element TTTAGGG, whereas clone pSA52 has 28 copies. The repetitive region in both clones is flanked on either side by inverted repeats that may be footprints of a transposition event. Sequence comparison indicates that the element TTTAGGG is identical to telomeric repeats present in Arabidopsis, maize, tomato, and other plants. However, Bal31 digestion kinetics indicates non-telomeric localization of the 700-bp tandem repeats. The clones represent a novel repeat family as (i) they contain telomere-like motifs as subrepeats within each unit; and (ii) they do not hybridize to related crucifers and are species-specific in nature.
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 ± 5% (mean ± SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
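The "mean ± SD over all pairwise comparisons" statistic quoted in this abstract can be illustrated with a minimal sketch. It assumes the monomers have already been aligned to equal length (gap characters simply count as differences); the function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_difference(a: str, b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


def pairwise_divergence(monomers):
    """Mean and sample standard deviation of the percent difference over all pairs."""
    diffs = [percent_difference(a, b) for a, b in combinations(monomers, 2)]
    return mean(diffs), stdev(diffs)


# Toy, made-up monomers (10 bp each) used only to exercise the functions.
toy_monomers = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACCTAC"]
toy_mean, toy_sd = pairwise_divergence(toy_monomers)
print(f"{toy_mean:.1f} +/- {toy_sd:.1f} % (mean +/- SD)")
```

With real data the alignment step matters more than the arithmetic, since the abstract reports 1-3 nucleotide insertions and deletions between monomers.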
Typing Clostridium difficile strains based on tandem repeat sequences

PubMed Central

2009-01-01

Background: Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results: This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion: We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124

[Comparative analysis of clustered regularly interspaced short palindromic repeats (CRISPRs) loci in the genomes of halophilic archaea].

PubMed

Zhang, Fan; Zhang, Bing; Xiang, Hua; Hu, Songnian

2009-11-01

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is a widespread system that provides acquired resistance against phages in bacteria and archaea. Here we aim to analyze CRISPRs genome-wide in the extremely halophilic archaea for which whole genome sequences are currently available. We used bioinformatics methods including alignment, conservation analysis, GC content and RNA structure prediction to analyze the CRISPR structures of 7 haloarchaeal genomes. We identified the CRISPR structures in 5 halophilic archaea and revealed a conserved palindromic motif in the flanking regions of these CRISPR structures. In addition, we found that the repeat sequences of large CRISPR structures in halophilic archaea were highly conserved, and that the two types of predicted RNA secondary structures derived from the repeat sequences were likely determined by the fourth base of the repeat sequence. Our results support the proposal that the leader sequence may function as a recognition site by having palindromic structures in flanking regions, and that the stem-loop secondary structure formed by repeat sequences may function in mediating the interaction between foreign genetic elements and CAS-encoded proteins.
Genome Wide Characterization of Simple Sequence Repeats in Cucumber

USDA-ARS's Scientific Manuscript database

The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
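The idea of tallying variant repeats by the position at which they differ from the canonical TTAGGG unit can be sketched as follows. This is only an illustration under simplifying assumptions that are not in the abstract: the read is taken to be already in frame with the repeat, only single-mismatch hexamers are counted, and the function name and toy read are hypothetical.

```python
from collections import Counter

CANONICAL = "TTAGGG"


def variant_repeats(read: str, unit: str = CANONICAL):
    """Tally hexamers that differ from `unit` at exactly one position.

    Assumes the read starts in frame with the repeat. Returns two Counters:
    variant hexamer -> count, and 1-based mismatch position -> count.
    """
    k = len(unit)
    variants = Counter()
    positions = Counter()
    for i in range(0, len(read) - k + 1, k):
        hexamer = read[i:i + k]
        mismatches = [p + 1 for p, (x, y) in enumerate(zip(hexamer, unit)) if x != y]
        if len(mismatches) == 1:  # single-substitution variants only
            variants[hexamer] += 1
            positions[mismatches[0]] += 1
    return variants, positions


# Toy, made-up read: canonical units interleaved with three variant units.
toy_read = "TTAGGG" + "GTAGGG" + "TTAGGG" + "TTGGGG" + "TTGGGG" + "TTAGGG"
toy_variants, toy_positions = variant_repeats(toy_read)
print(dict(toy_variants))   # variant units and their counts
print(dict(toy_positions))  # mismatch positions (1-6) and their counts
```

Real telomere reads would also need frame detection and handling of both strands, which this sketch leaves out.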
Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

PubMed

Srivastava, Deepika; Shanker, Asheesh

2016-12-01

Basal angiosperms, or Magnoliids, are an important clade of commercially important plants, mainly spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to the clade Magnoliids were screened to identify chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats, or microsatellites, are short stretches of DNA made up of repeated motifs 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
Molecular characterization and distribution of a 145-bp tandem repeat family in the genus Populus.

PubMed

Rajagopal, J; Das, S; Khurana, D K; Srivastava, P S; Lakshmikumaran, M

1999-10-01

This report aims to describe the identification and molecular characterization of a 145-bp tandem repeat family that accounts for nearly 1.5% of the Populus genome. Three members of this repeat family were cloned and sequenced from Populus deltoides and P. ciliata. The dimers of the repeat were sequenced in order to confirm the head-to-tail organization of the repeat. Hybridization-based analysis using the 145-bp tandem repeat as a probe on genomic DNA gave rise to ladder patterns which were identified to be a result of methylation and (or) sequence heterogeneity. Analysis of the methylation pattern of the repeat family using methylation-sensitive isoschizomers revealed variable methylation of the C residues and lack of methylation of the A residues. Sequence comparisons between the monomers revealed a high degree of sequence divergence that ranged between 6% and 11% in P. deltoides and between 4.2% and 8.3% in P. ciliata. This indicated the presence of sub-families within the 145-bp tandem family of repeats. Divergence was mainly due to the accumulation of point mutations and was concentrated in the central region of the repeat. The 145-bp tandem repeat family did not show significant homology to known tandem repeats from plants. A short stretch of 36 bp was found to show homology of 66.7% to a centromeric repeat from Chironomus plumosus. Dot-blot analysis and Southern hybridization data revealed the presence of the repeat family in 13 of the 14 Populus species examined. The absence of the 145-bp repeat from P. euphratica suggested that this species is relatively distant from other members of the genus, which correlates with taxonomic classifications. The widespread occurrence of the tandem family in the genus indicated that this family may be of ancient origin.
Small tandemly repeated DNA sequences of higher plants likely originate from a tRNA gene ancestor.

PubMed Central

Benslimane, A A; Dron, M; Hartmann, C; Rode, A

1986-01-01

Several monomers (177 bp) of a tandemly arranged repetitive nuclear DNA sequence of Brassica oleracea have been cloned and sequenced. They share up to 95% homology with one another and up to 80% with other satellite DNA sequences of Cruciferae, suggesting a common ancestor. Both strands of these monomers show more than 50% homology with many tRNA genes; the best homologies have been obtained with Lys and His yeast mitochondrial tRNA genes (64% and 60%, respectively). These results suggest that small tandemly repeated DNA sequences of plants may have evolved from a tRNA gene ancestor. These tandem repeats have probably arisen via a process involving reverse transcription of polymerase III RNA intermediates, as is the case for interspersed DNA sequences of mammals. A model is proposed to explain the formation of such small tandemly repeated DNA sequences. PMID:3774553
Two tandemly repeated telomere-associated sequences in Nicotiana plumbaginifolia.

PubMed

Chen, C M; Wang, C T; Wang, C J; Ho, C H; Kao, Y Y; Chen, C C

1997-12-01

Two tandemly repeated telomere-associated sequences, NP3R and NP4R, have been isolated from Nicotiana plumbaginifolia. The length of a repeating unit for NP3R and NP4R is 165 and 180 nucleotides respectively. The abundance of NP3R, NP4R and telomeric repeats is, respectively, 8.4 × 10^4, 6 × 10^3 and 1.5 × 10^6 copies per haploid genome of N. plumbaginifolia. Fluorescence in situ hybridization revealed that NP3R is located at the ends and/or in interstitial regions of all 10 chromosomes and NP4R on the terminal regions of three chromosomes in the haploid genome of N. plumbaginifolia. Sequence homology search revealed that not only are NP3R and NP4R homologous to HRS60 and GRS, respectively, two tandem repeats isolated from N. tabacum, but that NP3R and NP4R are also related to each other, suggesting that they originated from a common ancestral sequence. The role of these repeated sequences in chromosome healing is discussed based on the observation that two to three copies of a telomere-similar sequence were present in each repeating unit of NP3R and NP4R.
Development of Pineapple Microsatellite Markers and Germplasm Genetic Diversity Analysis

PubMed Central

Tong, Helin; Chen, You; Wang, Jingyi; Chen, Yeyuan; Sun, Guangming; He, Junhu; Wu, Yaoting

2013-01-01

Two methods were used to develop pineapple microsatellite markers. Genomic library-based SSR development: using a selectively amplified microsatellite assay, 86 sequences were generated from a pineapple genomic library. 91 (96.8%) of the 94 Simple Sequence Repeat (SSR) loci were dinucleotide repeats (39 AC/GT repeats and 52 GA/TC repeats, accounting for 42.9% and 57.1%, respectively), and the other three were mononucleotide repeats. Thirty-six pairs of SSR primers were designed; 24 of them generated clear bands of expected sizes, and 13 of them showed polymorphism. EST-based SSR development: 5659 pineapple EST sequences obtained from NCBI were analyzed; among 1397 nonredundant EST sequences, 843 were found to contain 1110 SSR loci (217 of them contained more than one SSR locus). The frequency of SSRs in pineapple EST sequences is 1 SSR/3.73 kb, and 44 types were found. Mononucleotide, dinucleotide, and trinucleotide repeats dominate, accounting for 95.6% in total. AG/CT and AGC/GCT were the dominant types of dinucleotide and trinucleotide repeats, accounting for 83.5% and 24.1%, respectively. Thirty pairs of primers were designed for each of 30 randomly selected sequences; 26 of them generated clear and reproducible bands, and 22 of them showed polymorphism. Eighteen pairs of primers obtained by one or the other of the two methods above that showed polymorphism were selected to carry out germplasm genetic diversity analysis for 48 breeds of pineapple; similarity coefficients of these breeds were between 0.59 and 1.00, and they can be divided into four groups accordingly. Amplification products of five SSR markers were extracted and sequenced; the corresponding repeat loci were found, and locus mutations are mainly in copy number of repeats and base mutations in the flanking region. PMID:24024187
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Survey and Analysis of Microsatellites in the Silkworm, Bombyx mori

PubMed Central

Prasad, M. Dharma; Muthulakshmi, M.; Madhu, M.; Archak, Sunil; Mita, K.; Nagaraju, J.

2005-01-01

We studied microsatellite frequency and distribution in 21.76-Mb random genomic sequences, 0.67-Mb BAC sequences from the Z chromosome, and 6.3-Mb EST sequences of Bombyx mori. We mined microsatellites of ≥15 bases of mononucleotide repeats and ≥5 repeat units of other classes of repeats. We estimated that microsatellites account for 0.31% of the genome of B. mori. Microsatellite tracts of A, AT, and ATT were the most abundant whereas their number drastically decreased as the length of the repeat motif increased. In general, tri- and hexanucleotide repeats were overrepresented in the transcribed sequences except TAA, GTA, and TGA, which were in excess in genomic sequences. The Z chromosome sequences contained shorter repeat types than the rest of the chromosomes in addition to a higher abundance of AT-rich repeats. Our results showed that base composition of the flanking sequence has an influence on the origin and evolution of microsatellites. Transitions/transversions were high in microsatellites of ESTs, whereas the genomic sequence had an equal number of substitutions and indels. The average heterozygosity value for 23 polymorphic microsatellite loci surveyed in 13 diverse silkmoth strains having 2–14 alleles was 0.54. Only 36 (18.2%) of 198 microsatellite loci were polymorphic between the two divergent silkworm populations and 10 (5%) loci revealed null alleles. The microsatellite map generated using these polymorphic markers resulted in 8 linkage groups. B. mori microsatellite loci were the most conserved in its immediate ancestor, B. mandarina, followed by the wild saturniid silkmoth, Antheraea assama. PMID:15371363
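The mining criteria stated in this abstract (mononucleotide tracts of at least 15 bases, and at least 5 repeat units for the other classes) can be sketched with regular expressions. This is a hedged illustration, not the authors' pipeline: the function names, the restriction to motifs of 1-6 bp, the exclusion of motifs that are themselves repeats of a shorter unit, and the toy sequence are all assumptions.

```python
import re


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_microsatellites(seq: str, max_motif: int = 6,
                         min_mono_len: int = 15, min_units: int = 5):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Mononucleotide runs need >= min_mono_len bases; longer motifs need
    >= min_units consecutive copies.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        min_rep = min_mono_len if k == 1 else min_units
        # One captured copy of the motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy, made-up sequence containing one A tract, one AT repeat and one ATT repeat.
toy_seq = "GG" + "A" * 16 + "CC" + "AT" * 6 + "GC" + "ATT" * 5 + "G"
toy_hits = find_microsatellites(toy_seq)
for start, motif, units in toy_hits:
    print(start, motif, units)
```

Dividing the total sequence length by the number of hits gives a density figure of the "1 SSR per N kb" kind that several abstracts in this listing report.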
Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

NASA Astrophysics Data System (ADS)

Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

2015-12-01

Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc ≥ 95% and coh ≥ 95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows a near absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and, consequently, in improving seismic hazard assessment.
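The waveform-similarity criterion described above can be sketched in a few lines. The example covers only the correlation-coefficient half of the selection (spectral coherency is omitted), uses a simple greedy grouping that is an assumption rather than the authors' procedure, and runs on synthetic traces; all function names are hypothetical.

```python
import math


def max_normalized_cc(x, y, max_lag=0):
    """Peak normalized cross-correlation of two equal-length traces over lags in [-max_lag, max_lag]."""
    n = len(x)
    if n != len(y):
        raise ValueError("traces must have the same number of samples")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    x = [v - mean_x for v in x]
    y = [v - mean_y for v in y]
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0.0:
        return 0.0  # a flat trace carries no waveform information
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        s = sum(x[i] * y[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, s / norm)
    return best


def group_repeaters(waveforms, threshold=0.95, max_lag=0):
    """Greedily group trace indices whose mutual correlation meets the threshold.

    Only groups with at least two members are returned.
    """
    groups = []
    for idx, w in enumerate(waveforms):
        for g in groups:
            if all(max_normalized_cc(w, waveforms[j], max_lag) >= threshold for j in g):
                g.append(idx)
                break
        else:
            groups.append([idx])
    return [g for g in groups if len(g) > 1]


# Synthetic traces: b is a scaled copy of a, c has a different frequency content.
a = [math.sin(0.3 * i) * math.exp(-0.02 * i) for i in range(100)]
b = [2.0 * v for v in a]
c = [math.sin(0.9 * i) for i in range(100)]
print(group_repeaters([a, b, c]))
```

Because the correlation is normalized, an event that is simply larger or smaller than its neighbour (as with `b` and `a`) still counts as a repeat.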
Algorithm to find distant repeats in a single protein sequence

PubMed Central

Banerjee, Nirjhar; Sarani, Rangarajan; Ranjani, Chellamuthu Vasuki; Sowmiya, Govindaraj; Michael, Daliah; Balakrishnan, Narayanasamy; Sekar, Kanagaraj

2008-01-01

Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of the distant repeats would help establish a firm relation between the repeats and their function and three-dimensional structure during the evolutionary process. Further, it sheds light on the diversity of duplication during evolution. To this end, an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid substitutions while detecting the distant repeats. Due to the biological importance of distant repeats, the proposed algorithm will be of importance to structural biologists, molecular biologists, biochemists and researchers involved in phylogenetic and evolutionary studies. PMID:19052663
Characterization of (CA)n microsatellite repeats from large-insert clones.

PubMed

Litt, M; Browne, D

2001-05-01

The most laborious part of developing (CA)n microsatellite repeats as genetic markers is constructing DNA clones to permit determination of sequences flanking the microsatellites. When cosmids or large-insert phage clones are used as primary sources of (CA)n repeat markers, they have traditionally been subcloned into plasmid vectors such as pUC18 or M13 mp 18/19 cloning vectors to obtain fragments of suitable size for DNA sequencing. This unit presents an alternative approach whereby a set of degenerate sequencing primers that anneal directly to (CA)n microsatellites can be used to determine sequences that are inaccessible with vector-derived primers. Because the primers anneal to the repeat and not to the vector, they can be used with subclones containing inserts of several kilobases and should, in theory, always give sequence in the regions directly flanking the repeat. Degeneracy at the 3' end of each of these primers prevents elongation of primers that have annealed out-of-register.
Structural analysis of the rDNA intergenic spacer of Brassica nigra: evolutionary divergence of the spacers of the three diploid Brassica species.

PubMed

Bhatia, S; Singh Negi, M; Lakshmikumaran, M

1996-11-01

EcoRI restriction of the B. nigra rDNA recombinants, isolated from a lambda genomic library, showed that the 3.9-kb fragment corresponded to the Intergenic Spacer (IGS), which was sequenced and found to be 3,928 bp in size. Sequence and dot-matrix analyses showed that the organization of the B. nigra rDNA IGS was typical of most rDNA spacers, consisting of a central repetitive region and flanking unique sequences on either side. The repetitive region was composed of two repeat families, RF 'A' and RF 'B'. The B. nigra RF 'A' consisted of a tandem array of three full-length copies of a 106-bp sequence element. RF 'B' was composed of 66 tandemly repeated elements. Each 'B' element was only 21 bp in size, and this is the smallest repeat unit identified in plant rDNA to date. The putative transcription initiation site (TIS) was identified as nucleotide position 3,110. Based on the sequence analysis it was suggested that the present organization of the repeat families was generated by successive cycles of deletions and amplifications and was being maintained by homogenization processes such as gene conversion and crossing-over. A detailed comparison of the rDNA IGS sequences of the three diploid Brassica species, namely B. nigra, B. campestris, and B. oleracea, was carried out. First, comparisons revealed that B. campestris and B. oleracea were close to each other, as the repeat families in both showed high sequence homology with each other. Second, the repeat elements in both species were organized in an interspersed manner. Third, a 52-bp sequence, present just downstream of the repeats in B. campestris, was found to be identical to the B. oleracea repeats, thereby suggesting a common progenitor. On the other hand, in B. nigra no interspersion pattern of organization of repeats was observed. Further, the B. nigra RF 'A' was identified as distinct from the repeat families of B. campestris and B. oleracea. Based on this analysis, it was suggested that during speciation B. campestris and B. oleracea evolved in one lineage whereas B. nigra diverged into a separate lineage. The comparative analysis of the IGS helped in identifying not only conserved ancestral sequence motifs of possible functional significance such as promoters and enhancers, but also sequences which showed variation between the three diploid species and were therefore identified as species-specific sequences.
Interstitial telomeric sequences in vertebrate chromosomes: Origin, function, instability and evolution.

PubMed

Bolzán, Alejandro D

2017-07-01

By definition, telomeric sequences are located at the very ends or terminal regions of chromosomes. However, several vertebrate species show blocks of (TTAGGG)n repeats present in non-terminal regions of chromosomes, the so-called interstitial telomeric sequences (ITSs), interstitial telomeric repeats or interstitial telomeric bands, which include those intrachromosomal telomeric-like repeats located near (pericentromeric ITSs) or within the centromere (centromeric ITSs) and those telomeric repeats located between the centromere and the telomere (i.e., truly interstitial telomeric sequences) of eukaryotic chromosomes. According to their sequence organization, localization and flanking sequences, ITSs can be classified into four types: 1) short ITSs, 2) subtelomeric ITSs, 3) fusion ITSs, and 4) heterochromatic ITSs. The first three types have been described mainly in the human genome, whereas heterochromatic ITSs have been found in several vertebrate species but not in humans. Several lines of evidence suggest that ITSs play a significant role in genome instability and evolution. This review aims to summarize our current knowledge about the origin, function, instability and evolution of these telomeric-like repeats in vertebrate chromosomes.
Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2009-01-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
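The repeat/spacer decomposition and the strain comparison described above can be illustrated with a minimal Python sketch. It is not the authors' protocol or tooling: it assumes the direct repeat is already known and occurs as exact copies, and the function names (`extract_spacers`, `shared_spacers`) and all sequences are invented for illustration.

```python
def extract_spacers(array_seq, repeat):
    """Return the spacers of a CRISPR array, in order, given its exact direct repeat.

    Assumption: every repeat copy matches `repeat` exactly; real arrays often
    contain degenerate copies, which this sketch would miss.
    """
    positions = []
    start = array_seq.find(repeat)
    while start != -1:
        positions.append(start)
        # Continue searching after the current copy (non-overlapping matches).
        start = array_seq.find(repeat, start + len(repeat))
    # A spacer is the stretch between the end of one copy and the start of the next.
    return [array_seq[a + len(repeat):b] for a, b in zip(positions, positions[1:])]


def shared_spacers(spacers_a, spacers_b):
    """Spacers of strain A that also occur in strain B, keeping A's order."""
    in_b = set(spacers_b)
    return [s for s in spacers_a if s in in_b]


# Toy data: a made-up 23-bp repeat and three made-up spacers.
REPEAT = "GTTTCAATCCACGCACCCGTGGA"
S1, S2, S3 = "AAACCCGGGTCA", "TTGACCTAGGCA", "CCATGGAATTCG"
strain_a = REPEAT + S1 + REPEAT + S2 + REPEAT + S3 + REPEAT
strain_b = REPEAT + S1 + REPEAT + S3 + REPEAT

spacers_a = extract_spacers(strain_a, REPEAT)
spacers_b = extract_spacers(strain_b, REPEAT)
common = shared_spacers(spacers_a, spacers_b)
```

Comparing ordered spacer lists in this way is what makes the arrays usable for genotyping: in the toy data, strain B lacks the middle spacer of strain A while sharing the other two.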
Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

PubMed Central

Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

1993-01-01

Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654
Identification of the centromeric repeat in the threespine stickleback fish (Gasterosteus aculeatus).

PubMed

Cech, Jennifer N; Peichel, Catherine L

2015-12-01

Centromere sequences exist as gaps in many genome assemblies due to their repetitive nature. Here we take an unbiased approach utilizing centromere protein A (CENP-A) chromatin immunoprecipitation followed by high-throughput sequencing to identify the centromeric repeat sequence in the threespine stickleback fish (Gasterosteus aculeatus). A 186-bp, AT-rich repeat was validated as centromeric using both fluorescence in situ hybridization (FISH) and immunofluorescence combined with FISH (IF-FISH) on interphase nuclei and metaphase spreads. This repeat hybridizes strongly to the centromere on all chromosomes, with the exception of weak hybridization to the Y chromosome. Together, our work provides the first validated sequence information for the threespine stickleback centromere.
Variation in the genomic locations and sequence conservation of STAR elements among staphylococcal species provides insight into DNA repeat evolution

PubMed Central

2012-01-01

Background Staphylococcus aureus Repeat (STAR) elements are a type of interspersed intergenic direct repeat. In this study the conservation and variation in these elements was explored by bioinformatic analyses of published staphylococcal genome sequences and through sequencing of specific STAR element loci from a large set of S. aureus isolates. Results Using bioinformatic analyses, we found that the STAR elements were located in different genomic loci within each staphylococcal species. There was no correlation between the number of STAR elements in each genome and the evolutionary relatedness of staphylococcal species, however higher levels of repeats were observed in both S. aureus and S. lugdunensis compared to other staphylococcal species. Unexpectedly, sequencing of the internal spacer sequences of individual repeat elements from multiple isolates showed conservation at the sequence level within deep evolutionary lineages of S. aureus. Whilst individual STAR element loci were demonstrated to expand and contract, the sequences associated with each locus were stable and distinct from one another. Conclusions The high degree of lineage and locus-specific conservation of these intergenic repeat regions suggests that STAR elements are maintained due to selective or molecular forces with some of these elements having an important role in cell physiology. The high prevalence in two of the more virulent staphylococcal species is indicative of a potential role for STAR elements in pathogenesis. PMID:23020678

Repeatless and repeat-based centromeres in potato: implications for centromere evolution.

PubMed

Gong, Zhiyun; Wu, Yufeng; Koblízková, Andrea; Torres, Giovana A; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C Robin; Macas, Jirí; Jiang, Jiming

2012-09-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains.
Repeatless and Repeat-Based Centromeres in Potato: Implications for Centromere Evolution

PubMed Central

Gong, Zhiyun; Wu, Yufeng; Koblížková, Andrea; Torres, Giovana A.; Wang, Kai; Iovene, Marina; Neumann, Pavel; Zhang, Wenli; Novák, Petr; Buell, C. Robin; Macas, Jiří; Jiang, Jiming

2012-01-01

Centromeres in most higher eukaryotes are composed of long arrays of satellite repeats. By contrast, most newly formed centromeres (neocentromeres) do not contain satellite repeats and instead include DNA sequences representative of the genome. An unknown question in centromere evolution is how satellite repeat-based centromeres evolve from neocentromeres. We conducted a genome-wide characterization of sequences associated with CENH3 nucleosomes in potato (Solanum tuberosum). Five potato centromeres (Cen4, Cen6, Cen10, Cen11, and Cen12) consisted primarily of single- or low-copy DNA sequences. No satellite repeats were identified in these five centromeres. At least one transcribed gene was associated with CENH3 nucleosomes. Thus, these five centromeres structurally resemble neocentromeres. By contrast, six potato centromeres (Cen1, Cen2, Cen3, Cen5, Cen7, and Cen8) contained megabase-sized satellite repeat arrays that are unique to individual centromeres. The satellite repeat arrays likely span the entire functional cores of these six centromeres. At least four of the centromeric repeats were amplified from retrotransposon-related sequences and were not detected in Solanum species closely related to potato. The presence of two distinct types of centromeres, coupled with the boom-and-bust cycles of centromeric satellite repeats in Solanum species, suggests that repeat-based centromeres can rapidly evolve from neocentromeres by de novo amplification and insertion of satellite repeats in the CENH3 domains. PMID:22968715
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.; (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer and; (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast AhDR sequences are dispersed over all chromosomes as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, H.U.G.; Gray, J.W.

1995-06-27

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.
Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

DOEpatents

Weier, Heinz-Ulrich G.; Gray, Joe W.

1995-01-01

A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.
De novo identification of highly diverged protein repeats by probabilistic consistency.

PubMed

Biegert, A; Söding, J

2008-03-15

An estimated 25% of all eukaryotic proteins contain repeats, which underlines the importance of duplication for evolving new protein functions. Internal repeats often correspond to structural or functional units in proteins. Methods capable of identifying diverged repeated segments or domains at the sequence level can therefore assist in predicting domain structures, inferring hypotheses about function and mechanism, and investigating the evolution of proteins from smaller fragments. We present HHrepID, a method for the de novo identification of repeats in protein sequences. It is able to detect the sequence signature of structural repeats in many proteins that have not yet been known to possess internal sequence symmetry, such as outer membrane beta-barrels. HHrepID uses HMM-HMM comparison to exploit evolutionary information in the form of multiple sequence alignments of homologs. In contrast to a previous method, the new method (1) generates a multiple alignment of repeats; (2) utilizes the transitive nature of homology through a novel merging procedure with fully probabilistic treatment of alignments; (3) improves alignment quality through an algorithm that maximizes the expected accuracy; (4) is able to identify different kinds of repeats within complex architectures by a probabilistic domain boundary detection method and (5) improves sensitivity through a new approach to assess statistical significance. Server: http://toolkit.tuebingen.mpg.de/hhrepid; Executables: ftp://ftp.tuebingen.mpg.de/pub/protevo/HHrepID
Detecting and Characterizing Repeating Earthquake Sequences During Volcanic Eruptions

NASA Astrophysics Data System (ADS)

Tepp, G.; Haney, M. M.; Wech, A.

2017-12-01

A major challenge in volcano seismology is forecasting eruptions. Repeating earthquake sequences often precede volcanic eruptions or lava dome activity, providing an opportunity for short-term eruption forecasting. Automatic detection of these sequences can lead to timely eruption notification and aid in continuous monitoring of volcanic systems. However, repeating earthquake sequences may also occur after eruptions or along with magma intrusions that do not immediately lead to an eruption. This additional challenge requires a better understanding of the processes involved in producing these sequences to distinguish those that are precursory. Calculation of the inverse moment rate and concepts from the material failure forecast method can lead to such insights. The temporal evolution of the inverse moment rate is observed to differ for precursory and non-precursory sequences, and multiple earthquake sequences may occur concurrently. These observations suggest that sequences may occur in different locations or through different processes. We developed an automated repeating earthquake sequence detector and near real-time alarm to send alerts when an in-progress sequence is identified. Near real-time inverse moment rate measurements can further improve our ability to forecast eruptions by allowing for characterization of sequences. We apply the detector to eruptions of two Alaskan volcanoes: Bogoslof in 2016-2017 and Redoubt Volcano in 2009. The Bogoslof eruption produced almost 40 repeating earthquake sequences between its start in mid-December 2016 and early June 2017, 21 of which preceded an explosive eruption, and 2 sequences in the months before eruptive activity. Three of the sequences occurred after the implementation of the alarm in late March 2017 and successfully triggered alerts. The nearest seismometers to Bogoslof are over 45 km away, requiring a detector that can work with few stations and a relatively low signal-to-noise ratio. 
During the Redoubt eruption, earthquake sequences were observed in the months leading up to the eruptive activity beginning in March 2009 as well as immediately preceding 7 of the 19 explosive events. In contrast to Bogoslof, Redoubt has a local monitoring network which allows for better detection and more detailed analysis of the repeating earthquake sequences.
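One common way the failure forecast method is applied is to extrapolate a falling inverse rate linearly to zero and read off the crossing time as the forecast. The Python sketch below illustrates only that idea. It is not the authors' detector or alarm; the linear form, the function name `forecast_failure_time`, and the sample numbers are assumptions made for illustration.

```python
def forecast_failure_time(times, inverse_rates):
    """Fit a least-squares line to (time, inverse rate) and return its zero crossing.

    Returns None when the fitted slope is not negative, i.e. the inverse rate
    is not decreasing and a linear extrapolation gives no forecast.
    Assumption: the inverse rate falls roughly linearly before failure.
    """
    n = len(times)
    if n < 2 or n != len(inverse_rates):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times) / n
    mean_y = sum(inverse_rates) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all sample times are identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, inverse_rates))
    slope = sxy / sxx
    if slope >= 0:
        return None
    intercept = mean_y - slope * mean_t
    # The line intercept + slope * t reaches zero at t = -intercept / slope.
    return -intercept / slope


# Hypothetical samples: inverse moment rate falling steadily with time.
t_forecast = forecast_failure_time([0.0, 1.0, 2.0, 3.0], [4.0, 3.0, 2.0, 1.0])
# A flat series yields no forecast.
no_forecast = forecast_failure_time([0.0, 1.0, 2.0], [2.0, 2.0, 2.0])
```

A sequence whose inverse moment rate does not trend toward zero returns `None` here, which loosely mirrors the abstract's observation that precursory and non-precursory sequences evolve differently in time.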
The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

PubMed Central

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-01-01

Background In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, much remains to be discovered about the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPR-related information and to facilitate additional investigations. Description We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and the evolution of flanking sequences. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator.
CRISPRdb is accessible at PMID:17521438
Assembly of a phased diploid Candida albicans genome facilitates allele-specific measurements and provides a simple model for repeat and indel structure

PubMed Central

2013-01-01

Background Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. Results We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. Conclusions The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution. PMID:24025428
Unrelated sequences at the 5' end of mouse LINE-1 repeated elements define two distinct subfamilies.

PubMed Central

Wincker, P; Jubier-Maurin, V; Roizès, G

1987-01-01

Some full length members of the mouse long interspersed repeated DNA family L1Md have been shown to be associated at their 5' end with a variable number of tandem repetitions, the A repeats, that have been suggested to be transcription controlling elements. We report that the other type of repeat, named F, found at the 5' end of a few L1 elements is also an integral part of full length L1 copies. Sequencing shows that the F repeats are GC rich, and organized in tandem. The L1 copies associated with either A or F repeats can be correlated with two different subsets of L1 sequences distinguished by a series of variant nucleotides specific to each and by unassociated but frequent restriction sites. These findings suggest that sequence replacement has occurred at least once at the 5' end of L1Md, and is related to the generation of specific subfamilies. PMID:3684566
Plant chromosomes from end to end: telomeres, heterochromatin and centromeres.

PubMed

Lamb, Jonathan C; Yu, Weichang; Han, Fangpu; Birchler, James A

2007-04-01

Recent evidence indicates that heterochromatin in plants is composed of heterogeneous sequences, which are usually composed of transposable elements or tandem repeat arrays. These arrays are associated with chromatin modifications that produce a closed configuration that limits transcription. Centromere sequences in plants are usually composed of tandem repeat arrays that are homogenized across the genome. Analysis of such arrays in closely related taxa suggests a rapid turnover of the repeat unit that is typical of a particular species. In addition, two lines of evidence for an epigenetic component of centromere specification have been reported, namely an example of a neocentromere formed over sequences without the typical repeat array and examples of centromere inactivation. Although the telomere repeat unit is quite prevalent in the plant kingdom, unusual repeats have been found in some families. Recently, it was demonstrated that the introduction of telomere sequences into plant cells causes truncation of the chromosomes, and that this technique can be used to produce artificial chromosome platforms.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

PubMed

Anwar, Tamanna; Khan, Asad U

2006-02-20

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found as longer tracts (> 6 repeating units). Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search for repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
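To make the idea of reporting exact SSR positions concrete, here is a minimal Python sketch. It is not SSRscanner (which the abstract says is written in PERL and available from the authors). The function name `find_ssrs`, the default thresholds, and the restriction to perfect repeats of A/C/G/T are assumptions chosen for illustration.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=3, min_repeats=6):
    """Return (start, end, unit, copies) for each perfect tandem repeat in seq.

    Positions are 1-based and inclusive. Only perfect repeats are reported,
    and the shortest repeating unit is preferred (e.g. 'A', not 'AA').
    """
    seq = seq.upper()
    # A lazily matched unit followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start() + 1, match.end(), unit, copies))
    return hits


# Toy sequence: an (AT)7 tract and an (A)8 tract separated by unique bases.
toy = "GGC" + "AT" * 7 + "CCG" + "A" * 8 + "T"
ssrs = find_ssrs(toy)
```

Counting the returned tuples per repeat unit would give a frequency table of the kind the abstract mentions, while the start and end fields carry the exact location of each tract.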
A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

PubMed Central

2010-01-01

Background Intragenic tandem repeats occur throughout all domains of life and impart functional and structural variability to diverse translation products. Repeat proteins confer distinctive surface phenotypes to many unicellular organisms, including those with minimal genomes such as the wall-less bacterial monoderms, Mollicutes. One such repeat pattern in this clade is distributed in a manner suggesting its exchange by horizontal gene transfer (HGT). Expanding genome sequence databases reveal the pattern in a widening range of bacteria, and recently among eucaryotic microbes. We examined the genomic flux and consequences of the motif by determining its distribution, predicted structural features and association with membrane-targeted proteins. Results Using a refined hidden Markov model, we document a 25-residue protein sequence motif tandemly arrayed in variable-number repeats in ORFs lacking assigned functions. It appears sporadically in unicellular microbes from disparate bacterial and eucaryotic clades, representing diverse lifestyles and ecological niches that include host parasitic, marine and extreme environments. Tracts of the repeats predict a malleable configuration of recurring domains, with conserved hydrophobic residues forming an amphipathic secondary structure in which hydrophilic residues endow extensive sequence variation. Many ORFs with these domains also have membrane-targeting sequences that predict assorted topologies; others may comprise reservoirs of sequence variants. We demonstrate expressed variants among surface lipoproteins that distinguish closely related animal pathogens belonging to a subgroup of the Mollicutes. DNA sequences encoding the tandem domains display dyad symmetry. Moreover, in some taxa the domains occur in ORFs selectively associated with mobile elements. 
These features, a punctate phylogenetic distribution, and different patterns of dispersal in genomes of related taxa, suggest that the repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements), a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches. PMID:20626840
A TALE-inspired computational screen for proteins that contain approximate tandem repeats.

PubMed

Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen.
A TALE-inspired computational screen for proteins that contain approximate tandem repeats

PubMed Central

Krwawicz, Joanna

2017-01-01

TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing.

PubMed

Hribová, Eva; Neumann, Pavel; Matsumoto, Takashi; Roux, Nicolas; Macas, Jirí; Dolezel, Jaroslav

2010-09-16

Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of the nuclear genome and the development of molecular tools to facilitate banana improvement. In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Low-depth 454 sequencing of the banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project.
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection.
Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing

PubMed Central

2010-01-01

Background Bananas and plantains (Musa spp.) are grown in more than a hundred tropical and subtropical countries and provide staple food for hundreds of millions of people. They are seed-sterile crops propagated clonally and this makes them vulnerable to a rapid spread of devastating diseases and at the same time hampers breeding improved cultivars. Although the socio-economic importance of bananas and plantains cannot be overestimated, they remain outside the focus of major research programs. This slows down the study of nuclear genome and the development of molecular tools to facilitate banana improvement. Results In this work, we report on the first thorough characterization of the repeat component of the banana (M. acuminata cv. 'Calcutta 4') genome. Analysis of almost 100 Mb of sequence data (0.15× genome coverage) permitted partial sequence reconstruction and characterization of repetitive DNA, making up about 30% of the genome. The results showed that the banana repeats are predominantly made of various types of Ty1/copia and Ty3/gypsy retroelements representing 16 and 7% of the genome respectively. On the other hand, DNA transposons were found to be rare. In addition to new families of transposable elements, two new satellite repeats were discovered and found useful as cytogenetic markers. To help in banana sequence annotation, a specific Musa repeat database was created, and its utility was demonstrated by analyzing the repeat composition of 62 genomic BAC clones. Conclusion A low-depth 454 sequencing of banana nuclear genome provided the largest amount of DNA sequence data available until now for Musa and permitted reconstruction of most of the major types of DNA repeats. The information obtained in this study improves the knowledge of the long-range organization of banana chromosomes, and provides sequence resources needed for repeat masking and annotation during the Musa genome sequencing project. 
It also provides sequence data for isolation of DNA markers to be used in genetic diversity studies and in marker-assisted selection. PMID:20846365
Optimization of sequence alignment for simple sequence repeat regions.

PubMed

Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

2011-07-20

Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and serve a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of slippage of the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than for other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without treating microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
A Dynamic Tandem Repeat in Monocotyledons Inferred from a Comparative Analysis of Chloroplast Genomes in Melanthiaceae.

PubMed

Do, Hoang Dang Khoa; Kim, Joo-Hwan

2017-01-01

Chloroplast genomes (cpDNA) are highly valuable resources for evolutionary studies of angiosperms, since they are highly conserved, are small in size, and play critical roles in plants. Slipped-strand mispairing (SSM) was assumed to be a mechanism for generating repeat units in cpDNA. However, research on the employment of different small repeated sequences through SSM events, which may induce the accumulation of distinct types of repeats within the same region in cpDNA, has not been documented. Here, we sequenced two chloroplast genomes from the endemic species Heloniopsis tubiflora (Korea) and Xerophyllum tenax (USA) to cover the gap between molecular data and explore "hot spots" for genomic events in Melanthiaceae. Comparative analysis of 23 complete cpDNA sequences revealed that there were different stages of deletion in the rps16 region across the Melanthiaceae. Based on the partial or complete loss of the rps16 gene in cpDNA, we report for the first time potential molecular markers for recognizing two sections (Veratrum and Fuscoveratrum) of Veratrum. Melanthiaceae exhibits a significant change in the junction between large single copy and inverted repeat regions, ranging from trnH_GUG to a part of rps3. Our results show an accumulation of tandem repeats in the rpl23-ycf2 regions of cpDNAs. Small conserved sequences exist and flank tandem repeats in further observation of this region across most of the examined taxa of Liliales. Therefore, we propose three scenarios in which different small repeated sequences were used during SSM events to generate newly distinct types of repeats. Occasionally, prior to the SSM process, point mutation events and double-strand break repair occurred and induced the formation of initial repeat units, which are indispensable in the SSM process. SSM likely occurred more frequently for short repeats than for long repeat sequences in tribe Parideae (Melanthiaceae, Liliales). 
Collectively, these findings add new evidence of dynamic results from SSM in chloroplast genomes which can be useful for further evolutionary studies in angiosperms. Additionally, genomics events in cpDNA are potential resources for mining molecular markers in Liliales.
Molecular and bioinformatic analysis of the FB-NOF transposable element.

PubMed

Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

2006-04-12

The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing process of 2 kb long FB inverted repeats of a complete FB-NOF element, with high precision and reliability. This achievement has been possible using a new map of the FB repetitive region, which identifies unambiguously each repeat with new features that can be used as landmarks. With this new vision of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous works that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.

The repetitive landscape of the chicken genome.

PubMed

Wicker, Thomas; Robertson, Jon S; Schulze, Stefan R; Feltus, F Alex; Magrini, Vincent; Morrison, Jason A; Mardis, Elaine R; Wilson, Richard K; Peterson, Daniel G; Paterson, Andrew H; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7 x coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available.
The repetitive landscape of the chicken genome

PubMed Central

Wicker, Thomas; Robertson, Jon S.; Schulze, Stefan R.; Feltus, F. Alex; Magrini, Vincent; Morrison, Jason A.; Mardis, Elaine R.; Wilson, Richard K.; Peterson, Daniel G.; Paterson, Andrew H.; Ivarie, Robert

2005-01-01

Cot-based cloning and sequencing (CBCS) is a powerful tool for isolating and characterizing the various repetitive components of any genome, combining the established principles of DNA reassociation kinetics with high-throughput sequencing. CBCS was used to generate sequence libraries representing the high, middle, and low-copy fractions of the chicken genome. Sequencing high-copy DNA of chicken to about 2.7× coverage of its estimated sequence complexity led to the initial identification of several new repeat families, which were then used for a survey of the newly released first draft of the complete chicken genome. The analysis provided insight into the diversity and biology of known repeat structures such as CR1 and CNM, for which only limited sequence data had previously been available. Cot sequence data also resulted in the identification of four novel repeats (Birddawg, Hitchcock, Kronos, and Soprano), two new subfamilies of CR1 repeats, and many elements absent from the chicken genome assembly. Multiple autonomous elements were found for a novel Mariner-like transposon, Galluhop, in addition to nonautonomous deletion derivatives. Phylogenetic analysis of the high-copy repeats CR1, Galluhop, and Birddawg provided insight into two distinct genome dispersion strategies. This study also exemplifies the power of the CBCS method to create representative databases for the repetitive fractions of genomes for which only limited sequence data is available. PMID:15256510
ATP hydrolysis provides functions that promote rejection of pairings between different copies of long repeated sequences

PubMed Central

Danilowicz, Claudia; Hermans, Laura; Coljee, Vincent; Prévost, Chantal

2017-01-01

Abstract During DNA recombination and repair, RecA family proteins must promote rapid joining of homologous DNA. Repeated sequences with >100 base pair lengths occupy more than 1% of bacterial genomes; however, commitment to strand exchange was believed to occur after testing ∼20–30 bp. If that were true, pairings between different copies of long repeated sequences would usually become irreversible. Our experiments reveal that in the presence of ATP hydrolysis even 75 bp sequence-matched strand exchange products remain quite reversible. Experiments also indicate that when ATP hydrolysis is present, flanking heterologous dsDNA regions increase the reversibility of sequence matched strand exchange products with lengths up to ∼75 bp. Results of molecular dynamics simulations provide insight into how ATP hydrolysis destabilizes strand exchange products. These results inspired a model that shows how pairings between long repeated sequences could be efficiently rejected even though most homologous pairings form irreversible products. PMID:28854739
The Influence of Primary and Secondary DNA Structure in Deletion and Duplication between Direct Repeats in Escherichia Coli

PubMed Central

Trinh, T. Q.; Sinden, R. R.

1993-01-01

We describe a system to measure the frequency of both deletions and duplications between direct repeats. Short 17- and 18-bp palindromic and nonpalindromic DNA sequences were cloned into the EcoRI site within the chloramphenicol acetyltransferase gene of plasmids pBR325 and pJT7. This creates an insert between direct repeated EcoRI sites and results in a chloramphenicol-sensitive phenotype. Selection for chloramphenicol resistance was utilized to select chloramphenicol resistant revertants that included those with precise deletion of the insert from plasmid pBR325 and duplication of the insert in plasmid pJT7. The frequency of deletion or duplication varied more than 500-fold depending on the sequence of the short sequence inserted into the EcoRI site. For the nonpalindromic inserts, multiple internal direct repeats and the length of the direct repeats appear to influence the frequency of deletion. Certain palindromic DNA sequences with the potential to form DNA hairpin structures that might stabilize the misalignment of direct repeats had a high frequency of deletion. Other DNA sequences with the potential to form structures that might destabilize misalignment of direct repeats had a very low frequency of deletion. Duplication mutations occurred at the highest frequency when the DNA between the direct repeats contained no direct or inverted repeats. The presence of inverted repeats dramatically reduced the frequency of duplications. The results support the slippage-misalignment model, suggesting that misalignment occurring during DNA replication leads to deletion and duplication mutations. The results also support the idea that the formation of DNA secondary structures during DNA replication can facilitate and direct specific mutagenic events. PMID:8325478
Visual ModuleOrganizer: a graphical interface for the detection and comparative analysis of repeat DNA modules

PubMed Central

2014-01-01

Background DNA repeats, such as transposable elements, minisatellites and palindromic sequences, are abundant in sequences and have been shown to have significant and functional roles in the evolution of the host genomes. In a previous study, we introduced the concept of a repeat DNA module, a flexible motif present in at least two occurrences in the sequences. This concept was embedded into ModuleOrganizer, a tool allowing the detection of repeat modules in a set of sequences. However, its implementation remains difficult for larger sequences. Results Here we present Visual ModuleOrganizer, a Java graphical interface that enables a new and optimized version of the ModuleOrganizer tool. To implement this version, it was recoded in C++ with compressed suffix tree data structures. This leads to less memory usage (at least a 120-fold decrease on average) and reduces computation time at least fourfold during the module detection process in large sequences. The Visual ModuleOrganizer interface allows users to easily choose ModuleOrganizer parameters and to graphically display the results. Moreover, Visual ModuleOrganizer dynamically handles graphical results through four main parameters: gene annotations, overlapping modules with known annotations, location of the module in a minimal number of sequences, and the minimal length of the modules. As a case study, the analysis of FoldBack4 sequences clearly demonstrated that our tools can be extended to comparative and evolutionary analyses of any repeat sequence elements in a set of genomic sequences. With the increasing number of sequences available in public databases, it is now possible to perform comparative analyses of repeated DNA modules in a graphic and friendly manner within a reasonable time period. Availability The Visual ModuleOrganizer interface and the new version of the ModuleOrganizer tool are freely available at: http://lcb.cnrs-mrs.fr/spip.php?rubrique313. PMID:24678954
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-09-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this.
Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

USDA-ARS's Scientific Manuscript database

Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...
Microsatellite analysis in the genome of Acanthaceae: An in silico approach.

PubMed

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from National Center for Biotechnology Information database and screened for the presence of SSRs. SSR Locator tool was used to predict the microsatellites and inbuilt Primer3 module was used for primer designing. Totally 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and the occurrence of dinucleotide repeats was found to be abundant in the genome sequences. The essential amino acid isoleucine was found rich in all the sequences. We also designed the SSR-based primers/markers for 59 sequences of this family that contains microsatellite repeats in their genome. The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to Acanthaceae family in the future.
Isolation and mapping of telomeric pentanucleotide (TAACC)n repeats of the Pacific whiteleg shrimp, Penaeus vannamei, using fluorescence in situ hybridization.

PubMed

Alcivar-Warren, Acacia; Meehan-Meola, Dawn; Wang, Yongping; Guo, Ximing; Zhou, Linghua; Xiang, Jianhai; Moss, Shaun; Arce, Steve; Warren, William; Xu, Zhenkang; Bell, Kireina

2006-01-01

To develop genetic and physical maps for shrimp, accurate information on the actual number of chromosomes and a large number of genetic markers is needed. Previous reports have shown two different chromosome numbers for the Pacific whiteleg shrimp, Penaeus vannamei, the most important penaeid shrimp species cultured in the Western hemisphere. Preliminary results obtained by direct sequencing of clones from a Sau3A-digested genomic library of P. vannamei ovary identified a large number of (TAACC/GGTTA)-containing SSRs. The objectives of this study were to (1) examine the frequency of (TAACC)n repeats in 662 P. vannamei genomic clones that were directly sequenced, and perform homology searches of these clones, (2) confirm the number of chromosomes in testis of P. vannamei, and (3) localize the TAACC repeats in P. vannamei chromosome spreads using fluorescence in situ hybridization (FISH). Results for objective 1 showed that 395 out of the 662 clones sequenced contained single or multiple SSRs with three or more repeat motifs, 199 of which contained variable tandem repeats of the pentanucleotide (TAACC/GGTTA)n, with 3 to 14 copies per sequence. The frequency of (TAACC)n repeats in P. vannamei is 4.68 kb for SSRs with five or more repeat motifs. Sequence comparisons using the BLASTN nonredundant and expressed sequence tag (EST) databases indicated that most of the TAACC-containing clones were similar to either the core pentanucleotide repeat in PVPENTREP locus (GenBank accession no. X82619) or portions of 28S rRNA. Transposable elements (transposase for Tn1000 and reverse transcriptase family members), hypothetical or unnamed protein products, and genes of known function such as 18S and 28S rRNAs, heat shock protein 70, and thrombospondin were identified in non-TAACC-containing clones. For objective 2, the meiotic chromosome number of P. vannamei was confirmed as N = 44. 
For objective 3, four FISH probes (P1 to P4) containing different numbers of TAACC repeats produced positive signals on telomeres of P. vannamei chromosomes. A few chromosomes had positive signals interstitially. Probe signal strength and chromosome coverage differed in the general order of P1>P2>P3>P4, which correlated with the length of TAACC repeats within the probes: 83, 66, 35, and 30 bp, respectively, suggesting that the TAACC repeats, and not the flanking sequences, produced the TAACC signals at chromosome ends and TAACC is likely the telomere sequence for P. vannamei.
Complete mitochondrial genome of the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae).

PubMed

Kim, Min Jee; Im, Hyun Hwak; Lee, Kwang Youll; Han, Yeon Soo; Kim, Iksoo

2014-06-01

Abstract The complete nucleotide sequence of the mitochondrial genome from the whiter-spotted flower chafer, Protaetia brevitarsis (Coleoptera: Scarabaeidae), was determined. The 20,319-bp long circular genome is the longest among completely sequenced Coleoptera. As is typical in animals, the P. brevitarsis genome consisted of two ribosomal RNAs, 22 transfer RNAs, 13 protein-coding genes and one A + T-rich region. Although the size of the coding genes was typical, the non-coding A + T-rich region was 5654 bp, which is the longest in insects. The extraordinary length of this region was composed of 28 117-bp tandem repeats and seven 82-bp tandem repeats. These repeat sequences were encompassed by three non-repeat sequences constituting 1804 bp.
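The repeat figures in this abstract can be checked against the stated length of the A + T-rich region. The sketch below reads them as 28 copies of a 117-bp unit plus seven copies of an 82-bp unit; that reading is an assumption about the abstract's wording, and the variable names are illustrative only.

```python
# Figures from the abstract; the split into "28 x 117 bp" and "7 x 82 bp"
# is an assumed reading of the repeat description.
at_rich_region_bp = 5654      # reported length of the A + T-rich region
non_repeat_bp = 1804          # three non-repeat sequences combined

long_repeats_bp = 28 * 117    # 28 copies of the 117-bp unit
short_repeats_bp = 7 * 82     # seven copies of the 82-bp unit

total_bp = long_repeats_bp + short_repeats_bp + non_repeat_bp
print(long_repeats_bp, short_repeats_bp, total_bp)
print("matches reported region length:", total_bp == at_rich_region_bp)
```

Under this reading the repeat and non-repeat lengths add up exactly to the reported 5654 bp, which supports it.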
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enable comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. 
While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

PubMed Central

Anwar, Tamanna; Khan, Asad U

2006-01-01

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to find the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. Availability This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com PMID:17597863
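The kind of search described above, reporting each SSR together with its exact position, can be illustrated with a short regular-expression scan. This is a minimal sketch in Python, not the SSRscanner program itself (which is written in PERL and whose internals are not given here). The function name `find_ssrs` and the thresholds `max_unit` and `min_copies` are hypothetical choices for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re

def find_ssrs(seq, max_unit=6, min_copies=3):
    """Return (start, end, unit, copies) tuples for perfect tandem repeats.

    Positions are 1-based and inclusive. Thresholds are illustrative
    assumptions, not values taken from the abstract.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(1, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (e.g. "AA" or "ATAT"); those tracts are reported under the shorter unit.
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            copies = len(m.group(0)) // unit_len
            hits.append((m.start() + 1, m.end(), unit, copies))
    return sorted(hits)

if __name__ == "__main__":
    example = "GGCATATATATGGTAACCTAACCTAACCGG"  # made-up test sequence
    for start, end, unit, copies in find_ssrs(example):
        print(f"{unit} x{copies} at {start}-{end}")
```

Counting the tuples per unit gives the frequency of each motif, and the start and end coordinates give its location in the input sequence.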
Structure and stability of the ankyrin domain of the Drosophila Notch receptor.

PubMed

Zweifel, Mark E; Leahy, Daniel J; Hughson, Frederick M; Barrick, Doug

2003-11-01

The Notch receptor contains a conserved ankyrin repeat domain that is required for Notch-mediated signal transduction. The ankyrin domain of Drosophila Notch contains six ankyrin sequence repeats previously identified as closely matching the ankyrin repeat consensus sequence, and a putative seventh C-terminal sequence repeat that exhibits lower similarity to the consensus sequence. To better understand the role of the Notch ankyrin domain in Notch-mediated signaling and to examine how structure is distributed among the seven ankyrin sequence repeats, we have determined the crystal structure of this domain to 2.0 angstroms resolution. The seventh, C-terminal, ankyrin sequence repeat adopts a regular ankyrin fold, but the first, N-terminal ankyrin repeat, which contains a 15-residue insertion, appears to be largely disordered. The structure reveals a substantial interface between ankyrin polypeptides, showing a high degree of shape and charge complementarity, which may be related to homotypic interactions suggested from indirect studies. However, the Notch ankyrin domain remains largely monomeric in solution, demonstrating that this interface alone is not sufficient to promote tight association. Using the structure, we have classified reported mutations within the Notch ankyrin domain that are known to disrupt signaling into those that affect buried residues and those restricted to surface residues. We show that the buried substitutions greatly decrease protein stability, whereas the surface substitutions have only a marginal effect on stability. The surface substitutions are thus likely to interfere with Notch signaling by disrupting specific Notch-effector interactions and map the sites of these interactions.
Evaluation of Diversity Based on Morphological Variabilities and ISSR Molecular Markers in Iranian Cynodon dactylon (L.) Pers. Accessions to Select and Introduce Cold-Tolerant Genotypes.

PubMed

Akbari, M; Salehi, H; Niazi, A

2018-04-01

The main goals of the present study were to screen Iranian common bermudagrasses to find cold-tolerant accessions and evaluate their genetic and morphological variabilities. In this study, 49 accessions were collected from 18 provinces of Iran. One foreign cultivar of common bermudagrass was used as control. Morphological variation was evaluated based on 14 morphological traits to give information about the taxonomic position of Iranian common bermudagrass. Data from morphological traits were evaluated to categorize all accessions as either cold sensitive or tolerant using hierarchical clustering with Ward's method in SPSS software. Inter-Simple Sequence Repeat (ISSR) primers were employed to evaluate genetic variability of accessions. The results of our taxonomic investigation support the existence of two varieties of Cynodon dactylon in Iran: var. dactylon (hairless plant) and var. villosous (plant with hairs at leaf underside and/or upper side surfaces or exterior surfaces of sheath). All 15 primers amplified and gave clear and highly reproducible DNA fragments. In total, 152 fragments were produced, of which 144 (94.73%) were polymorphic. The polymorphic information content (PIC) values ranged from 0.700 to 0.928. The average PIC value obtained with 15 ISSR primers was 0.800, which shows that all primers were informative. Probability identity (PI) and discriminating power between all primers ranged from 0.029 to 0.185 and 0.815 to 0.971, respectively. Genetic data were converted into a binary data matrix. NTSYS software was used for data analysis. Clustering by the unweighted pair-group method with arithmetic averages and principal coordinate analysis separated the accessions into six main clusters. According to both morphological and genetic diversity investigations of accessions, they can be clustered into three groups: cold sensitive, cold semi-tolerant, and cold tolerant. 
The most cold-tolerant accessions were: Taft, Malayear, Gorgan, Safashahr, Naein, Aligoudarz, and the foreign cultivar. This study may provide useful information for further breeding programs on common bermudagrass. Selected genotypes can be evaluated for other abiotic stresses such as drought and salinity.
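The summary statistics in the abstract above follow from simple arithmetic on the band counts. The following is a minimal Python sketch, assuming that discriminating power is defined as D = 1 − PI (the reported ranges are consistent with this); the function names are illustrative and not taken from the study.

```python
def percent_polymorphic(polymorphic: int, total: int) -> float:
    """Percentage of scored fragments that are polymorphic."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * polymorphic / total


def discriminating_power(pi: float) -> float:
    """Discriminating power, assumed here to be D = 1 - PI."""
    return 1.0 - pi


# Fragment counts and PI range as reported in the abstract.
pct = percent_polymorphic(144, 152)
d_range = [round(discriminating_power(pi), 3) for pi in (0.029, 0.185)]
print(f"{pct:.4f}% polymorphic")
print(d_range)
```

The computed percentage is about 94.737, so the reported 94.73% corresponds to truncation rather than rounding to two decimals.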
Application of ISSR markers to analyze molecular relationships in Iranian jasmine (Jasminum spp.) accessions.

PubMed

Ghasemi Ghehsareh, Masood; Salehi, Hassan; Khosh-Khui, Morteza; Niazi, Ali

2015-01-01

There are many species of jasmines in different regions of Iran in natural or cultivated form, and there is no information about their genetic status. Therefore, inter-simple sequence repeat (ISSR) analysis was used to evaluate genetic variations of the 53 accessions representing eight species of Jasminum collected from different regions of Iran. A total of 21 ISSR primers were used which generated 981 bands of different sizes. Mean percentage of polymorphic bands was 90.64 %. Maximum resolving power, polymorphic information content average, and marker index values were 21.55, 0.35, and 14.42 for primers 3, 4, and 3, respectively. The unweighted pair group method with arithmetic mean dendrogram based on Jaccard's coefficients indicated that 53 accessions were divided into two major clusters. The first major cluster was divided into two subclusters; the subcluster A included Jasminum grandiflorum L., J. officinale L., and J. azoricum L. and the subcluster B consisted of three forms of J. sambac L. (single, semi-double, and double flowers). The second major cluster was divided into two subclusters; the first subcluster (C) included J. humile L., J. primulinum Hemsl., J. nudiflorum Lindl. and the second subcluster (D) consisted of J. fruticans L. At the species level, the highest percentage of polymorphism (34.05 %), numbers of effective alleles (1.16), Shannon index (0.151), and Nei's genetic diversity (0.098) were observed in J. officinale. The lowest values of percentage polymorphism (0.011), number of effective alleles (1.009), Shannon index (0.007), and Nei's genetic diversity (0.005) were obtained for J. nudiflorum. Based on pairwise population matrix of Nei's unbiased genetic identity, the highest identity (0.85) was found between J. officinale and J. azoricum and the lowest identity (0.69) was between J. grandiflorum and J. primulinum. Based on analysis of molecular variance, the amount of genetic variations among the eight populations was 83 %. 
This study demonstrated that ISSR is a useful tool for studying genomic diversity in jasmine and for detecting relationships among accessions.
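The dendrogram in the abstract above is built from Jaccard's coefficients computed on binary band profiles. The following is a minimal Python sketch of that coefficient; the band profiles are hypothetical toy data, not values from the study, and the function name is illustrative.

```python
def jaccard(a: list[int], b: list[int]) -> float:
    """Jaccard similarity between two binary band profiles (1 = band present)."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Bands absent from both profiles are ignored by definition.
    return shared / either if either else 0.0


# Hypothetical band profiles for three accessions (not data from the study).
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1],
    "acc2": [1, 1, 0, 0, 0, 1],
    "acc3": [0, 0, 1, 1, 1, 0],
}

names = sorted(profiles)
for i, p in enumerate(names):
    for q in names[i + 1:]:
        print(p, q, round(jaccard(profiles[p], profiles[q]), 3))
```

A matrix of 1 − similarity values could then be passed to an average-linkage (UPGMA) routine, for example SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"`; the abstract does not state which software produced the dendrogram.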
Convergence of goals: phylogenetical, morphological, and physiological characterization of tolerance to drought stress in tall fescue (Festuca arundinacea Schreb.).

PubMed

Salehi, Mohammadreza; Salehi, Hassan; Niazi, Hassan; Ghobadi, Cyrus

2014-03-01

The aim of this study was to find Iranian tall fescue accessions that tolerate drought stress and to characterize them phylogenetically, morphologically, and physiologically. For this purpose, inter-simple sequence repeat (ISSR) markers were used to examine the genetic variability of accessions from different provinces of Iran. Of 21 primers, 20 primers generated highly reproducible fragments. Using these primers, 390 discernible DNA fragments were produced with 367 (93.95 %) being polymorphic. The polymorphic information content (PIC) values ranged from 0.948 to 0.976, with a mean PIC value of 0.969. Probability identity (PI) and discriminating power (D = 1-PI) among the primers ranged from 0.001 to 0.004 and 0.998 to 0.995, respectively. A binary qualitative data matrix was constructed. Data analyses were performed using the NTSYS software and the similarity values were used to generate a dendrogram via UPGMA. To study the drought stress, plants were irrigated at 25 % FC condition for three times. Fresh leaves were collected to measure physiological characters including: superoxide dismutase, catalase, and peroxidase activities and proline and total chlorophyll content at two times, before and after stress application. Relative water content, fresh and dry weight ratio, survival percentage, and visual quality were evaluated after stress. Morphological and physiological characters were assessed in order to classify accessions as either tolerant or sensitive using Ward's method of hierarchical cluster analysis in SPSS software. The results of the present study demonstrated that ISSR markers are useful for studying tall fescue genetic diversity. 
Convergence of morphological and physiological characterizations during drought stress and phylogenetic relationship results showed that accessions can be grouped into four clusters: drought-tolerant accessions collected from the west of Iran, drought-tolerant accessions collected from the northwest of Iran, drought semi-tolerant accessions collected from the center of Iran, and drought-sensitive accessions collected from the north of Iran. The data presented could be used to classify the tall fescue accessions based on suitability for cultivation in the regions studied or in regions with similar environmental conditions.
Molecular architecture of classical cytological landmarks: Centromeres and telomeres

DOE Office of Scientific and Technical Information (OSTI.GOV)

Meyne, J.

1994-11-01

Both the human telomere repeat and the pericentromeric repeat sequence (GGAAT)n were isolated based on evolutionary conservation. Their isolation was based on the premise that chromosomal features as structurally and functionally important as telomeres and centromeres should be highly conserved. Both sequences were isolated by high stringency screening of a human repetitive DNA library with rodent repetitive DNA. The pHuR library (plasmid Human Repeat) used for this project was enriched for repetitive DNA by using a modification of the standard DNA library preparation method. Usually DNA for a library is cut with restriction enzymes, packaged, infected, and the library is screened. A problem with this approach is that many tandem repeats don't have any (or many) common restriction sites. Therefore, many of the repeat sequences will not be represented in the library because they are not restricted to a viable length for the vector used. To prepare the pHuR library, human DNA was mechanically sheared to a small size. These relatively short DNA fragments were denatured and then renatured to C₀t 50. Theoretically only repetitive DNA sequences should renature under C₀t 50 conditions. The single-stranded regions were digested using S1 nuclease, leaving the double-stranded, renatured repeat sequences.
Sequences spanning the leader-repeat junction mediate CRISPR adaptation to phage in Streptococcus thermophilus

PubMed Central

Wei, Yunzhou; Chesne, Megan T.; Terns, Rebecca M.; Terns, Michael P.

2015-01-01

CRISPR-Cas systems are RNA-based immune systems that protect prokaryotes from invaders such as phages and plasmids. In adaptation, the initial phase of the immune response, short foreign DNA fragments are captured and integrated into host CRISPR loci to provide heritable defense against encountered foreign nucleic acids. Each CRISPR contains a ∼100–500 bp leader element that typically includes a transcription promoter, followed by an array of captured ∼35 bp sequences (spacers) sandwiched between copies of an identical ∼35 bp direct repeat sequence. New spacers are added immediately downstream of the leader. Here, we have analyzed adaptation to phage infection in Streptococcus thermophilus at the CRISPR1 locus to identify cis-acting elements essential for the process. We show that the leader and a single repeat of the CRISPR locus are sufficient for adaptation in this system. Moreover, we identified a leader sequence element capable of stimulating adaptation at a dormant repeat. We found that sequences within 10 bp of the site of integration, in both the leader and repeat of the CRISPR, are required for the process. Our results indicate that information at the CRISPR leader-repeat junction is critical for adaptation in this Type II-A system and likely other CRISPR-Cas systems. PMID:25589547
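The array layout described in the abstract above (a leader, then spacers sandwiched between copies of a direct repeat, with new spacers added immediately downstream of the leader) can be sketched as a small data structure. This is a toy Python model for illustration only; the class name, method names, and placeholder strings are assumptions, and no real sequences are used.

```python
from dataclasses import dataclass, field


@dataclass
class CrisprArray:
    """Toy model: a leader followed by repeat-spacer units."""
    leader: str
    repeat: str
    spacers: list[str] = field(default_factory=list)  # leader-proximal first

    def acquire(self, spacer: str) -> None:
        """Add a new spacer immediately downstream of the leader."""
        self.spacers.insert(0, spacer)

    def sequence(self) -> str:
        """Leader, then repeat/spacer pairs, closed by a final repeat."""
        units = "".join(self.repeat + s for s in self.spacers)
        return self.leader + units + self.repeat


# Placeholder strings stand in for real sequences (hypothetical).
array = CrisprArray(leader="<leader>", repeat="[R]")
array.acquire("s1")
array.acquire("s2")  # the newest spacer ends up closest to the leader
print(array.sequence())
```

With no spacers, the model reduces to a leader plus a single repeat, which is the minimal configuration the abstract reports as sufficient for adaptation.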
Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

PubMed

Waye, J S; Willard, H F

1986-09-01

The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.
Nucleotide sequences of Dictyostelium discoideum developmentally regulated cDNAs rich in (AAC) imply proteins that contain clusters of asparagine, glutamine, or threonine.

PubMed

Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L

1989-09-01

A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many different sized Dictyostelium DNA restriction fragments indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
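The reading-frame observation in the abstract above can be checked directly: shifting the frame of an (AAC)n repeat yields the codons AAC, ACA, and CAA. The following is a minimal Python sketch; the codon table is restricted to those three codons and the function name is illustrative.

```python
# Only the three codons that occur in an (AAC)n repeat are needed here.
CODON_TO_AA = {"AAC": "Asn", "ACA": "Thr", "CAA": "Gln"}


def translate_frames(repeat_unit: str, copies: int) -> dict[int, list[str]]:
    """Translate (repeat_unit)n in each of the three forward reading frames."""
    seq = repeat_unit * copies
    frames = {}
    for offset in range(3):
        # Take complete codons only, starting at the given frame offset.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        frames[offset] = [CODON_TO_AA[c] for c in codons]
    return frames


frames = translate_frames("AAC", 5)
for offset, residues in frames.items():
    print(offset, "-".join(residues))
```

Each frame produces a homopolymeric run of a single amino acid, which matches the report that individual proteins carry asparagine, threonine, or glutamine clusters depending on the frame in use.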

Design, production and molecular structure of a new family of artificial alpha-helicoidal repeat proteins (αRep) based on thermostable HEAT-like repeats.

PubMed

Urvoas, Agathe; Guellouz, Asma; Valerio-Lepiniec, Marie; Graille, Marc; Durand, Dominique; Desravines, Danielle C; van Tilbeurgh, Herman; Desmadril, Michel; Minard, Philippe

2010-11-26

Repeat proteins have a modular organization and a regular architecture that make them attractive models for design and directed evolution experiments. HEAT repeat proteins, although very common, have not been used as a scaffold for artificial proteins, probably because they are made of long and irregular repeats. Here, we present and validate a consensus sequence for artificial HEAT repeat proteins. The sequence was defined from the structure-based sequence analysis of a thermostable HEAT-like repeat protein. Appropriate sequences were identified for the N- and C-caps. A library of genes coding for artificial proteins based on this sequence design, named αRep, was assembled using new and versatile methodology based on circular amplification. Proteins picked randomly from this library are expressed as soluble proteins. The biophysical properties of proteins with different numbers of repeats and different combinations of side chains in hypervariable positions were characterized. Circular dichroism and differential scanning calorimetry experiments showed that all these proteins are folded cooperatively and are very stable (T(m) >70 °C). Stability of these proteins increases with the number of repeats. Detailed gel filtration and small-angle X-ray scattering studies showed that the purified proteins form either monomers or dimers. The X-ray structure of a stable dimeric variant structure was solved. The protein is folded with a highly regular topology and the repeat structure is organized, as expected, as pairs of alpha helices. In this protein variant, the dimerization interface results directly from the variable surface enriched in aromatic residues located in the randomized positions of the repeats. The dimer was crystallized both in an apo and in a PEG-bound form, revealing a very well defined binding crevice and some structure flexibility at the interface. 
This fortuitous binding site could later prove to be a useful binding site for other low molecular mass partners. Copyright © 2010 Elsevier Ltd. All rights reserved.
TRedD—A database for tandem repeats over the edit distance

PubMed Central

Sokol, Dina; Atagun, Firat

2010-01-01

A ‘tandem repeat’ in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats are common in the genomes of both eukaryotic and prokaryotic organisms. They are significant markers for human identity testing, disease diagnosis, sequence homology and population studies. In this article, we describe a new database, TRedD, which contains the tandem repeats found in the human genome. The database is publicly available online, and the software for locating the repeats is also freely available. The definition of tandem repeats used by TRedD is a new and innovative definition based upon the concept of ‘evolutive tandem repeats’. In addition, we have developed a tool, called TandemGraph, to graphically depict the repeats occurring in a sequence. This tool can be coupled with any repeat finding software, and it should greatly facilitate analysis of results. Database URL: http://tandem.sci.brooklyn.cuny.edu/ PMID:20624712
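To make the definition in the abstract above concrete, the sketch below finds tandem repeats in a short string. It is a deliberate simplification: it detects exact copies only, whereas TRedD works with approximate copies under the edit distance, so this is not TRedD's algorithm. The function names and the example sequence are hypothetical.

```python
def is_primitive(unit: str) -> bool:
    """True if unit is not itself a repetition of a shorter string."""
    return unit not in (unit + unit)[1:-1]


def find_exact_tandem_repeats(seq: str, max_unit: int = 6, min_copies: int = 2):
    """Return (start, unit, copies) for runs of exact, contiguous copies."""
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k <= len(seq):
            unit = seq[i:i + k]
            copies = 1
            # Extend the run while the next k characters equal the unit.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, unit, copies))
                i += copies * k  # skip past the reported run
            else:
                i += 1
    return hits


print(find_exact_tandem_repeats("GGCACACACATT", min_copies=3))
```

Requiring a primitive unit avoids reporting the same run twice, for example as both (CA)4 and (CACA)2.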
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae.

PubMed

Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta

2012-11-07

Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. 
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.
ST proteins, a new family of plant tandem repeat proteins with a DUF2775 domain mainly found in Fabaceae and Asteraceae

PubMed Central

2012-01-01

Background Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. (II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. (III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. 
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found. PMID:23134664
A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
A theory that may explain the Hayflick limit--a means to delete one copy of a repeating sequence during each cell cycle in certain human cells such as fibroblasts.

PubMed

Naveilhan, P; Baudet, C; Jabbour, W; Wion, D

1994-09-01

A model that may explain the limited division potential of certain cells such as human fibroblasts in culture is presented. The central postulate of this theory is that there exists, prior to certain key exons that code for materials needed for cell division, a unique sequence of specific repeating segments of DNA. One copy of such repeating segments is deleted during each cell cycle in cells that are not protected from such deletion through methylation of their cytosine residues. According to this theory, the means through which such repeated sequences are removed, one per cycle, is through the sequential action of enzymes that act much as bacterial restriction enzymes do--namely to produce scissions in both strands of DNA in areas that correspond to the DNA base sequence recognition specificities of such enzymes. After the first scission early in a replicative cycle, that enzyme becomes inhibited, but the cleavage of the first site exposes the closest site in the repetitive element to the action of a second restriction enzyme after which that enzyme also becomes inhibited. Then repair occurs, regenerating the original first site. Through this sequential activation and inhibition of two different restriction enzymes, only one copy of the repeating sequence is deleted during each cell cycle. In effect, the repeating sequence operates as a precise counter of the number of cell doublings that have occurred since the cells involved differentiated during development.
Molecular characterization and physical localization of highly repetitive DNA sequences from Brazilian Alstroemeria species.

PubMed

Kuipers, A G J; Kamstra, S A; de Jeu, M J; Visser, R G F

2002-01-01

Highly repetitive DNA sequences were isolated from genomic DNA libraries of Alstroemeria psittacina and A. inodora. Among the repetitive sequences that were isolated, tandem repeats as well as dispersed repeats could be discerned. The tandem repeats belonged to a family of interlinked Sau3A subfragments with sizes varying from 68-127 bp, and constituted a larger HinfI repeat of approximately 400 bp. Southern hybridization showed a similar molecular organization of the tandem repeats in each of the Brazilian Alstroemeria species tested. None of the repeats hybridized with DNA from Chilean Alstroemeria species, which indicates that they are specific for the Brazilian species. In-situ localization studies revealed the tandem repeats to be localized in clusters on the chromosomes of A. inodora and A. psittacina: distal hybridization sites were found on chromosome arms 2PS, 6PL, 7PS, 7PL and 8PL, interstitial sites on chromosome arms 2PL, 3PL, 4PL and 5PL. The applicability of the tandem repeats for cytogenetic analysis of interspecific hybrids and their role in heterochromatin organization are discussed.
Accurate typing of short tandem repeats from genome-wide sequencing data and its applications.

PubMed

Fungtammasan, Arkarachai; Ananda, Guruprasad; Hile, Suzanne E; Su, Marcia Shu-Wei; Sun, Chen; Harris, Robert; Medvedev, Paul; Eckert, Kristin; Makova, Kateryna D

2015-05-01

Short tandem repeats (STRs) are implicated in dozens of human genetic diseases and contribute significantly to genome variation and instability. Yet profiling STRs from short-read sequencing data is challenging because of their high sequencing error rates. Here, we developed STR-FM, short tandem repeat profiling using flank-based mapping, a computational pipeline that can detect the full spectrum of STR alleles from short-read data, can adapt to emerging read-mapping algorithms, and can be applied to heterogeneous genetic samples (e.g., tumors, viruses, and genomes of organelles). We used STR-FM to study STR error rates and patterns in publicly available human and in-house generated ultradeep plasmid sequencing data sets. We discovered that STRs sequenced with a PCR-free protocol have up to ninefold fewer errors than those sequenced with a PCR-containing protocol. We constructed an error correction model for genotyping STRs that can distinguish heterozygous alleles containing STRs with consecutive repeat numbers. Applying our model and pipeline to Illumina sequencing data with 100-bp reads, we could confidently genotype several disease-related long trinucleotide STRs. Utilizing this pipeline, for the first time we determined the genome-wide STR germline mutation rate from a deeply sequenced human pedigree. Additionally, we built a tool that recommends minimal sequencing depth for accurate STR genotyping, depending on repeat length and sequencing read length. The required read depth increases with STR length and is lower for a PCR-free protocol. This suite of tools addresses the pressing challenges surrounding STR genotyping, and thus is of wide interest to researchers investigating disease-related STRs and STR evolution. © 2015 Fungtammasan et al.; Published by Cold Spring Harbor Laboratory Press.
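The abstract above describes profiling STRs through their flanking sequence. As a rough illustration of why flanks matter, the sketch below counts repeat units only when a read spans the repeat and both flanks. It is not the STR-FM implementation, which the abstract does not detail; the function name, the flank sequences, and the reads are hypothetical.

```python
import re


def count_repeat_units(read: str, left_flank: str, right_flank: str, unit: str):
    """Return the number of uninterrupted unit copies between two flanks,
    or None if the read does not contain both flanks around the repeat."""
    pattern = (
        re.escape(left_flank)
        + f"((?:{re.escape(unit)})+)"
        + re.escape(right_flank)
    )
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(unit)


# Hypothetical reads: one spans the whole repeat, one ends inside it.
spanning = "ACGT" + "CAG" * 7 + "TTGA"
truncated = "ACGT" + "CAG" * 7
print(count_repeat_units(spanning, "ACGT", "TTGA", "CAG"))
print(count_repeat_units(truncated, "ACGT", "TTGA", "CAG"))
```

A read that ends inside the repeat gives no reliable allele length, which is consistent with the abstract's point that the required read depth increases with STR length.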
Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

USDA-ARS's Scientific Manuscript database

Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...
Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

USDA-ARS's Scientific Manuscript database

Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...
Are the TTAGG and TTAGGG telomeric repeats phylogenetically conserved in aculeate Hymenoptera?

NASA Astrophysics Data System (ADS)

Menezes, Rodolpho S. T.; Bardella, Vanessa B.; Cabral-de-Mello, Diogo C.; Lucena, Daercio A. A.; Almeida, Eduardo A. B.

2017-10-01

Although the (TTAGG)n telomeric repeat is thought to be the ancestral DNA motif of telomeres in insects, it was repeatedly lost within some insect orders. Notably, parasitoid hymenopterans and the social wasp Metapolybia decorata (Gribodo) lack the (TTAGG)n sequence, but the motif has been found in other representatives of Hymenoptera, such as different ant species and the honeybee. These findings raise the question of whether or not the insect telomeric repeat is phylogenetically predominant in Hymenoptera. Thus, we evaluated the occurrence of both the (TTAGG)n sequence and the vertebrate telomere sequence (TTAGGG)n using dot-blotting hybridization in 25 aculeate species of Hymenoptera. Our results revealed the absence of the (TTAGG)n sequence in all tested species, raising the number of hymenopteran families lacking this telomeric sequence to 13 out of the 15 families tested so far. The (TTAGGG)n sequence was not observed in any tested species. Based on our data and compiled information, we suggest that the (TTAGG)n sequence was putatively lost in the ancestor of Apocrita with at least two subsequent independent regains (in Formicidae and Apidae).
Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

PubMed Central

Macas, Jiří; Neumann, Pavel; Navrátilová, Alice

2007-01-01

Background Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. 
These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining. PMID:18031571
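The coverage figures in the abstract above imply an approximate genome size, which can be recovered with one division. The following is a minimal Python sketch using only the numbers reported in the abstract; the function name is illustrative and the result is a back-of-the-envelope estimate, not a value stated by the authors.

```python
def implied_genome_size_mb(sequenced_mb: float, coverage_fraction: float) -> float:
    """Genome size implied by a sequenced amount and its fractional coverage."""
    if not 0 < coverage_fraction <= 1:
        raise ValueError("coverage_fraction must be in (0, 1]")
    return sequenced_mb / coverage_fraction


# 33.3 Mb of sequence data at 0.77% coverage, as reported.
size_mb = implied_genome_size_mb(33.3, 0.0077)
print(round(size_mb))

# Amount of DNA corresponding to the reported 35-48% repeat fraction.
low, high = 0.35 * size_mb, 0.48 * size_mb
print(round(low), round(high))
```

Under these figures the implied genome size is roughly 4,300 Mb, so the reconstructed repeat families would account for on the order of 1,500 to 2,100 Mb.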
Molecular basis of length polymorphism in the human zeta-globin gene complex.

PubMed Central

Goodbourn, S E; Higgs, D R; Clegg, J B; Weatherall, D J

1983-01-01

The length polymorphism between the human zeta-globin gene and its pseudogene is caused by an allele-specific variation in the copy number of a tandemly repeating 36-base-pair sequence. This sequence is related to a tandemly repeated 14-base-pair sequence in the 5' flanking region of the human insulin gene, which is known to cause length polymorphism, and to a repetitive sequence in intervening sequence (IVS) 1 of the pseudo-zeta-globin gene. Evidence is presented that the latter is also of variable length, probably because of differences in the copy number of the tandem repeat. The homology between the three length polymorphisms may be an indication of the presence of a more widespread group of related sequences in the human genome, which might be useful for generalized linkage studies. PMID:6308667
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
Complete mitochondrial genome of the larch hawk moth, Sphinx morio (Lepidoptera: Sphingidae).

PubMed

Kim, Min Jee; Choi, Sei-Woong; Kim, Iksoo

2013-12-01

The larch hawk moth, Sphinx morio, belongs to the lepidopteran family Sphingidae, which has long been studied as a family of model insects in diverse fields. In this study, we describe the complete mitochondrial genome (mitogenome) sequence of the species in terms of general genomic features and characteristic short repetitive sequences found in the A + T-rich region. The 15,299-bp-long genome consisted of a typical set of genes (13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes) and one major non-coding A + T-rich region, with the typical arrangement found in Lepidoptera. The 316-bp-long A + T-rich region located between srRNA and tRNA(Met) harbored the conserved sequence blocks that are typically found in lepidopteran insects. Additionally, the A + T-rich region of S. morio contained three characteristic repeat sequences that are rarely found in Lepidoptera: two identical 12-bp repeats, three identical 5-bp-long tandem repeats, and six nearly identical 5-6-bp-long repeat sequences.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of the various strains and isolates available for 85 different species of prokaryotes, extracted the SSRs showing length variation, and created a relational database called PSSRdb. This database gives useful information such as the location of PSSRs in genomes, length variation across genomes, and the regions harboring PSSRs. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
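The definition used in the abstract above (tandem repeats of 1-6 bp motifs) can be illustrated with a short scan. This is a minimal sketch, not the procedure used to build PSSRdb: the function name `find_ssrs` and the `min_repeats` threshold are assumptions made for illustration, and the input is a made-up sequence.

```python
import re

def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, copies) for tandem repeats of 1..max_motif bp motifs.

    min_repeats is a hypothetical cutoff; the abstract does not state one.
    """
    seq = seq.upper()
    # Lazy motif group so the shortest repeating unit is reported,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

# Toy sequence with an (AC)4 run and an (A)6 run.
print(find_ssrs("TTGACACACACGGAAAAAATC"))
```

Comparing the `copies` value at the same locus across strains is one simple way to express the length variation that makes an SSR polymorphic.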
Long interspersed repeated DNA (LINE) causes polymorphism at the rat insulin 1 locus.

PubMed Central

Lakshmikumaran, M S; D'Ambrosio, E; Laimins, L A; Lin, D T; Furano, A V

1985-01-01

The insulin 1, but not the insulin 2, locus is polymorphic (i.e., exhibits allelic variation) in rats. Restriction enzyme analysis and hybridization studies showed that the polymorphic region is 2.2 kilobases upstream of the insulin 1 coding region and is due to the presence or absence of an approximately 2.7-kilobase repeated DNA element. DNA sequence determination showed that this DNA element is a member of a long interspersed repeated DNA family (LINE) that is highly repeated (greater than 50,000 copies) and highly transcribed in the rat. Although the presence or absence of LINE sequences at the insulin 1 locus occurs in both the homozygous and heterozygous states, LINE-containing insulin 1 alleles are more prevalent in the rat population than are alleles without LINEs. Restriction enzyme analysis of the LINE-containing alleles indicated that at least two versions of the LINE sequence may be present at the insulin 1 locus in different rats. Either repeated transposition of LINE sequences or gene conversion between the resident insulin 1 LINE and other sequences in the genome are possible explanations for this. PMID:3016521
Spectroscopic insights into quadruplexes of five-repeat telomere DNA sequences upon G-block damage.

PubMed

Dvořáková, Zuzana; Vorlíčková, Michaela; Renčiuk, Daniel

2017-11-01

The DNA lesions, resulting from oxidative damage, were shown to destabilize human telomere four-repeat quadruplex and to alter its structure. Long telomere DNA, as a repetitive sequence, offers, however, other mechanisms of dealing with the lesion: extrusion of the damaged repeat into loop or shifting the quadruplex position by one repeat. Using circular dichroism and UV absorption spectroscopy and polyacrylamide electrophoresis, we studied consequences of lesions at different positions of the model five-repeat human telomere DNA sequences on the structure and stability of their quadruplexes in sodium and in potassium. The repeats affected by lesion are preferentially positioned as terminal overhangs of the core quadruplex structurally similar to the four-repeat one. Forced affecting of the inner repeats leads to presence of variety of more parallel folds in potassium. In sodium the designed models form mixture of two dominant antiparallel quadruplexes whose population varies with the position of the affected repeat. The shapes of quadruplex CD spectra, namely the height of dominant peaks, significantly correlate with melting temperatures. Lesion in one guanine tract of a more than four repeats long human telomere DNA sequence may cause re-positioning of its quadruplex arrangement associated with a shift of the structure to less common quadruplex conformations. The type of the quadruplex depends on the loop position and external conditions. The telomere DNA quadruplexes are quite resistant to the effect of point mutations due to the telomere DNA repetitive nature, although their structure and, consequently, function might be altered. Copyright © 2017. Published by Elsevier B.V.
Microsatellite analysis in the genome of Acanthaceae: An in silico approach

PubMed Central

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from National Center for Biotechnology Information database and screened for the presence of SSRs. SSR Locator tool was used to predict the microsatellites and inbuilt Primer3 module was used for primer designing. Results: Totally 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and the occurrence of dinucleotide repeats was found to be abundant in the genome sequences. The essential amino acid isoleucine was found rich in all the sequences. We also designed the SSR-based primers/markers for 59 sequences of this family that contains microsatellite repeats in their genome. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to Acanthaceae family in the future. PMID:25709226
The repeating nucleotide sequence in the repetitive mitochondrial DNA from a "low-density" petite mutant of yeast.

PubMed Central

Van Kreijl, C F; Bos, J L

1977-01-01

The repeating nucleotide sequence of 68 base pairs in the mtDNA from an ethidium-induced cytoplasmic petite mutant of yeast has been determined. For sequence analysis, specifically primed and terminated RNA copies, obtained by in vitro transcription of the separated strands, were used. The sequence consists of 66 consecutive AT base pairs flanked by two GC pairs and comprises nearly all of the mutant mitochondrial genome. The sequence, moreover, also represents the first part of wild-type mtDNA sequenced so far. PMID:198740

A Lossy Compression Technique Enabling Duplication-Aware Sequence Alignment

PubMed Central

Freschi, Valerio; Bogliolo, Alessandro

2012-01-01

In spite of the recognized importance of tandem duplications in genome evolution, commonly adopted sequence comparison algorithms do not take into account complex mutation events involving more than one residue at the time, since they are not compliant with the underlying assumption of statistical independence of adjacent residues. As a consequence, the presence of tandem repeats in sequences under comparison may impair the biological significance of the resulting alignment. Although solutions have been proposed, repeat-aware sequence alignment is still considered to be an open problem and new efficient and effective methods have been advocated. The present paper describes an alternative lossy compression scheme for genomic sequences which iteratively collapses repeats of increasing length. The resulting approximate representations do not contain tandem duplications, while retaining enough information for making their comparison even more significant than the edit distance between the original sequences. This allows us to exploit traditional alignment algorithms directly on the compressed sequences. Results confirm the validity of the proposed approach for the problem of duplication-aware sequence alignment. PMID:22518086
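The idea of iteratively collapsing repeats of increasing length, described in the abstract above, can be sketched as follows. This is a simplified illustration and not the authors' compression scheme: the function name `collapse_tandem`, the `max_unit` bound, and the left-to-right greedy scan are all assumptions.

```python
def collapse_tandem(seq, max_unit=4):
    """Collapse tandem duplications of unit length 1..max_unit, shortest units first.

    Lossy by design: the copy number of each collapsed repeat is discarded.
    """
    for k in range(1, max_unit + 1):
        out = []
        i = 0
        while i < len(seq):
            # If the k symbols just emitted recur immediately, skip the duplicate copy.
            if len(out) >= k and i + k <= len(seq) and "".join(out[-k:]) == seq[i:i + k]:
                i += k
            else:
                out.append(seq[i])
                i += 1
        seq = "".join(out)
    return seq

# Two toy sequences that differ only in repeat copy number collapse to the same string.
print(collapse_tandem("ACACGT"))
print(collapse_tandem("AAACACACGGT"))
```

Because the collapsed strings contain no tandem duplications up to `max_unit`, a conventional alignment algorithm can then be run on them directly, which is the use the abstract describes.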
Tandemly repeated sequences in mtDNA control region of whitefish, Coregonus lavaretus.

PubMed

Brzuzan, P

2000-06-01

Length variation of the mitochondrial DNA control region was observed with PCR amplification of a sample of 138 whitefish (Coregonus lavaretus). Nucleotide sequences of representative PCR products showed that the variation was due to the presence of an approximately 100-bp motif tandemly repeated two, three, or five times in the region between the conserved sequence block-3 (CSB-3) and the gene for phenylalanine tRNA. This is the first report on the tandem array composed of long repeat units in mitochondrial DNA of salmonids.
Reprint of: Early Behavioural Facilitation by Temporal Expectations in Complex Visual-motor Sequences.

PubMed

Heideman, Simone G; van Ede, Freek; Nobre, Anna C

2018-05-24

In daily life, temporal expectations may derive from incidental learning of recurring patterns of intervals. We investigated the incidental acquisition and utilisation of combined temporal-ordinal (spatial/effector) structure in complex visual-motor sequences using a modified version of a serial reaction time (SRT) task. In this task, not only the series of targets/responses, but also the series of intervals between subsequent targets was repeated across multiple presentations of the same sequence. Each participant completed three sessions. In the first session, only the repeating sequence was presented. During the second and third session, occasional probe blocks were presented, where a new (unlearned) spatial-temporal sequence was introduced. We first confirm that participants not only got faster over time, but that they were slower and less accurate during probe blocks, indicating that they incidentally learned the sequence structure. Having established a robust behavioural benefit induced by the repeating spatial-temporal sequence, we next addressed our central hypothesis that implicit temporal orienting (evoked by the learned temporal structure) would have the largest influence on performance for targets following short (as opposed to longer) intervals between temporally structured sequence elements, paralleling classical observations in tasks using explicit temporal cues. We found that indeed, reaction time differences between new and repeated sequences were largest for the short interval, compared to the medium and long intervals, and that this was the case, even when comparing late blocks (where the repeated sequence had been incidentally learned), to early blocks (where this sequence was still unfamiliar). 
We conclude that incidentally acquired temporal expectations that follow a sequential structure can have a robust facilitatory influence on visually-guided behavioural responses and that, like more explicit forms of temporal orienting, this effect is most pronounced for sequence elements that are expected at short inter-element intervals. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Evolutionary conservation of sequence and secondary structures in CRISPR repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
Heterogeneity of the Epstein-Barr Virus (EBV) Major Internal Repeat Reveals Evolutionary Mechanisms of EBV and a Functional Defect in the Prototype EBV Strain B95-8.

PubMed

Ba Abdullah, Mohammed M; Palermo, Richard D; Palser, Anne L; Grayson, Nicholas E; Kellam, Paul; Correia, Samantha; Szymula, Agnieszka; White, Robert E

2017-12-01

Epstein-Barr virus (EBV) is a ubiquitous pathogen of humans that can cause several types of lymphoma and carcinoma. Like other herpesviruses, EBV has diversified through both coevolution with its host and genetic exchange between virus strains. Sequence analysis of the EBV genome is unusually challenging because of the large number and lengths of repeat regions within the virus. Here we describe the sequence assembly and analysis of the large internal repeat 1 of EBV (IR1; also known as the BamW repeats) for more than 70 strains. The diversity of the latency protein EBV nuclear antigen leader protein (EBNA-LP) resides predominantly within the exons downstream of IR1. The integrity of the putative BWRF1 open reading frame (ORF) is retained in over 80% of strains, and deletions truncating IR1 always spare BWRF1. Conserved regions include the IR1 latency promoter (Wp) and one zone upstream of and two within BWRF1. IR1 is heterogeneous in 70% of strains, and this heterogeneity arises from sequence exchange between strains as well as from spontaneous mutation, with interstrain recombination being more common in tumor-derived viruses. This genetic exchange often incorporates regions of <1 kb, and allelic gene conversion changes the frequency of small regions within the repeat but not close to the flanks. These observations suggest that IR1 (and, by extension, EBV) diversifies through both recombination and breakpoint repair, while concerted evolution of IR1 is driven by gene conversion of small regions. Finally, the prototype EBV strain B95-8 contains four nonconsensus variants within a single IR1 repeat unit, including a stop codon in the EBNA-LP gene. Repairing IR1 improves EBNA-LP levels and the quality of transformation by the B95-8 bacterial artificial chromosome (BAC). IMPORTANCE Epstein-Barr virus (EBV) infects the majority of the world population but causes illness in only a small minority of people.
Nevertheless, over 1% of cancers worldwide are attributable to EBV. Recent sequencing projects investigating virus diversity to see if different strains have different disease impacts have excluded regions of repeating sequence, as they are more technically challenging. Here we analyze the sequence of the largest repeat in EBV (IR1). We first characterized the variations in protein sequences encoded across IR1. In studying variations within the repeat of each strain, we identified a mutation in the main laboratory strain of EBV that impairs virus function, and we suggest that tumor-associated viruses may be more likely to contain DNA mixed from two strains. The patterns of this mixing suggest that sequences can spread between strains (and also within the repeat) by copying sequence from another strain (or repeat unit) to repair DNA damage. Copyright © 2017 Ba abdullah et al.
Improved PCR-Based Detection of Soil Transmitted Helminth Infections Using a Next-Generation Sequencing Approach to Assay Design.

PubMed

Pilotte, Nils; Papaiakovou, Marina; Grant, Jessica R; Bierwert, Lou Ann; Llewellyn, Stacey; McCarthy, James S; Williams, Steven A

2016-03-01

The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Evolution Analysis of Simple Sequence Repeats in Plant Genome.

PubMed

Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

2015-01-01

Simple sequence repeats (SSRs) are widespread units in genome sequences and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genomes. First, in the twelve studied plant genomes, the main SSRs were those with repeat motifs of 1-3 nucleotides. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and as the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to that of AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T showed a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only followed a general pattern in the evolution of the plant kingdom but were also associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.
Translocation and gross deletion breakpoints in human inherited disease and cancer II: Potential involvement of repetitive sequence elements in secondary structure formation between DNA ends.

PubMed

Chuzhanova, Nadia; Abeysinghe, Shaun S; Krawczak, Michael; Cooper, David N

2003-09-01

Translocations and gross deletions are responsible for a significant proportion of both cancer and inherited disease. Although such gene rearrangements are nonuniformly distributed in the human genome, the underlying mutational mechanisms remain unclear. We have studied the potential involvement of various types of repetitive sequence elements in the formation of secondary structure intermediates between the single-stranded DNA ends that recombine during rearrangements. Complexity analysis was used to assess the potential of these ends to form secondary structures, the maximum decrease in complexity consequent to a gross rearrangement being used as an indicator of the type of repeat and the specific DNA ends involved. A total of 175 pairs of deletion/translocation breakpoint junction sequences available from the Gross Rearrangement Breakpoint Database [GRaBD; www.uwcm.ac.uk/uwcm/mg/grabd/grabd.html] were analyzed. Potential secondary structure was noted between the 5' flanking sequence of the first breakpoint and the 3' flanking sequence of the second breakpoint in 49% of rearrangements and between the 5' flanking sequence of the second breakpoint and the 3' flanking sequence of the first breakpoint in 36% of rearrangements. Inverted repeats, inversions of inverted repeats, and symmetric elements were found in association with gross rearrangements at approximately the same frequency. However, inverted repeats and inversions of inverted repeats accounted for the vast majority (83%) of deletions plus small insertions, symmetric elements for one-half of all antigen receptor-mediated translocations, while direct repeats appear only to be involved in mediating simple deletions. These findings extend our understanding of illegitimate recombination by highlighting the importance of secondary structure formation between single-stranded DNA ends at breakpoint junctions. Copyright 2003 Wiley-Liss, Inc.
Perceived empty duration between sounds of different lengths: Possible relation with repetition and rhythmic grouping.

PubMed

Kuroda, Tsuyoshi; Tomimatsu, Erika; Grondin, Simon; Miyazaki, Makoto

2016-11-01

We investigated how perceived duration of empty time intervals would be modulated by the length of sounds marking those intervals. Three sounds were successively presented in Experiment 1. Each sound was short (S) or long (L), and the temporal position of the middle sound's onset was varied. The lengthening of each sound resulted in delayed perception of the onset; thus, the middle sound's onset had to be presented earlier in the SLS than in the LSL sequence so that participants perceived the three sounds as presented at equal interonset intervals. In Experiment 2, a short sound and a long sound were alternated repeatedly, and the relative duration of the SL interval to the LS interval was varied. This repeated sequence was perceived as consisting of equal interonset intervals when the onsets of all sounds were aligned at physically equal intervals. If the same onset delay as in the preceding experiment had occurred, participants should have perceived equality between the interonset intervals in the repeated sequence when the SL interval was physically shortened relative to the LS interval. The effects of sound length seemed to be canceled out when the presentation of intervals was repeated. Finally, the perceived duration of the interonset intervals in the repeated sequence was not influenced by whether the participant's native language was French or Japanese, or by how the repeated sequence was perceptually segmented into rhythmic groups.
Genetic and DNA sequence analysis of the kanamycin resistance transposon Tn903.

PubMed Central

Grindley, N D; Joyce, C M

1980-01-01

The kanamycin resistance transposon Tn903 consists of a unique region of about 1000 base pairs bounded by a pair of 1050-base-pair inverted repeat sequences. Each repeat contains two Pvu II endonuclease cleavage sites separated by 520 base pairs. We have constructed derivatives of Tn903 in which this 520-base-pair fragment is deleted from one or both repeats. Those derivatives that lack both 520-base-pair fragments cannot transpose, whereas those that lack just one remain transposition proficient. One such transposable derivative, Tn903 delta 1, has been selected for further study. We have determined the sequence of the intact inverted repeat. The 18 base pairs at each end are identical and inverted relative to one another, a structure characteristic of insertion sequences. Additional experiments indicate that a single inverted repeat from Tn903 can, in fact, transpose; we propose that this element be called IS903. To correlate the DNA sequence with genetic activities, we have created mutations by inserting a 10-base-pair DNA fragment at several sites within the intact repeat of Tn903 delta 1, and we have examined the effect of such insertions on transposability. The results suggest that IS903 encodes a 307-amino-acid polypeptide (a "transposase") that is absolutely required for transposition of IS903 or Tn903. PMID:6261245
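The terminal structure described in the abstract above (18 base pairs at each end, identical and inverted relative to one another) can be expressed as a simple test: on a single strand, the first 18 bases should equal the reverse complement of the last 18. The sketch below is illustrative only. The function names are hypothetical and the sequence is an artificial toy, not the IS903 sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def has_terminal_inverted_repeat(element, length=18):
    """True if the first `length` bases equal the reverse complement of the last `length`."""
    element = element.upper()
    if len(element) < 2 * length:
        return False
    return element[:length] == reverse_complement(element[-length:])

# Artificial 18-bp end and a short filler standing in for the element's interior.
toy_end = "ACGTTGCAAGGCTTAGCA"
toy_element = toy_end + "TTTTTTTTTT" + reverse_complement(toy_end)

print(has_terminal_inverted_repeat(toy_element))                       # inverted ends
print(has_terminal_inverted_repeat(toy_end + "TTTTTTTTTT" + toy_end))  # direct, not inverted
```

The second call shows the distinction that matters here: two copies in the same orientation form a direct repeat and fail the test.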
Sunflower centromeres consist of a centromere-specific LINE and a chromosome-specific tandem repeat.

PubMed

Nagaki, Kiyotaka; Tanaka, Keisuke; Yamaji, Naoki; Kobayashi, Hisato; Murata, Minoru

2015-01-01

The kinetochore is a protein complex including kinetochore-specific proteins that plays a role in chromatid segregation during mitosis and meiosis. The complex associates with centromeric DNA sequences that are usually species-specific. In plant species, tandem repeats including satellite DNA sequences and retrotransposons have been reported as centromeric DNA sequences. In this study on sunflowers, a cDNA-encoding centromere-specific histone H3 (CENH3) was isolated from a cDNA pool from a seedling, and an antibody was raised against a peptide synthesized from the deduced cDNA. The antibody specifically recognized the sunflower CENH3 (HaCENH3) and showed centromeric signals by immunostaining and immunohistochemical staining analysis. The antibody was also applied in chromatin immunoprecipitation (ChIP)-Seq to isolate centromeric DNA sequences and two different types of repetitive DNA sequences were identified. One was a long interspersed nuclear element (LINE)-like sequence, which showed centromere-specific signals on almost all chromosomes in sunflowers. This is the first report of a centromeric LINE sequence, suggesting possible centromere targeting ability. Another type of identified repetitive DNA was a tandem repeat sequence with a 187-bp unit that was found only on a pair of chromosomes. The HaCENH3 content of the tandem repeats was estimated to be much higher than that of the LINE, which implies centromere evolution from LINE-based centromeres to more stable tandem-repeat-based centromeres. In addition, the epigenetic status of the sunflower centromeres was investigated by immunohistochemical staining and ChIP, and it was found that centromeres were heterochromatic.
Direct repeat sequences in the Streptomyces chitinase-63 promoter direct both glucose repression and chitin induction

PubMed Central

Ni, Xiangyang; Westpheling, Janet

1997-01-01

The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies.

PubMed

Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Bertozzi, Terry; Adelson, David L

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package.
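The single linkage clustering step mentioned above can be sketched as follows. This is a generic union-find illustration under the assumption that pairwise alignment hits are available as pairs of interval labels; it is not the CARP or krishna code, and all names and labels are hypothetical.

```python
def single_linkage_families(pairs):
    """Group IDs into families: any two IDs connected by a chain of
    pairwise hits end up in the same family (single linkage)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two families

    families = {}
    for x in list(parent):
        families.setdefault(find(x), set()).add(x)
    return sorted((sorted(f) for f in families.values()), key=lambda f: f[0])

# Hypothetical pairwise hits between genomic intervals (labels are invented).
hits = [
    ("chr1:100-400", "chr2:50-350"),
    ("chr2:50-350", "chr5:900-1200"),
    ("chr3:10-310", "chr4:700-1000"),
]
families = single_linkage_families(hits)
```

Because the interval labels are carried through the clustering, each family keeps the genomic coordinates of its members, which mirrors the advantage the abstract describes of retaining the original aligned intervals.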
Superior ab initio identification, annotation and characterisation of TEs and segmental duplications from genome assemblies

PubMed Central

Zeng, Lu; Kortschak, R. Daniel; Raison, Joy M.

2018-01-01

Transposable Elements (TEs) are mobile DNA sequences that make up significant fractions of amniote genomes. However, they are difficult to detect and annotate ab initio because of their variable features, lengths and clade-specific variants. We have addressed this problem by refining and developing a Comprehensive ab initio Repeat Pipeline (CARP) to identify and cluster TEs and other repetitive sequences in genome assemblies. The pipeline begins with a pairwise alignment using krishna, a custom aligner. Single linkage clustering is then carried out to produce families of repetitive elements. Consensus sequences are then filtered for protein coding genes and then annotated using Repbase and a custom library of retrovirus and reverse transcriptase sequences. This process yields three types of family: fully annotated, partially annotated and unannotated. Fully annotated families reflect recently diverged/young known TEs present in Repbase. The remaining two types of families contain a mixture of novel TEs and segmental duplications. These can be resolved by aligning these consensus sequences back to the genome to assess copy number vs. length distribution. Our pipeline has three significant advantages compared to other methods for ab initio repeat identification: 1) we generate not only consensus sequences, but keep the genomic intervals for the original aligned sequences, allowing straightforward analysis of evolutionary dynamics, 2) consensus sequences represent low-divergence, recently/currently active TE families, 3) segmental duplications are annotated as a useful by-product. We have compared our ab initio repeat annotations for 7 genome assemblies to other methods and demonstrate that CARP compares favourably with RepeatModeler, the most widely used repeat annotation package. PMID:29538441
Origin of the CMS gene locus in rapeseed cybrid mitochondria: active and inactive recombination produces the complex CMS gene region in the mitochondrial genomes of Brassicaceae.

PubMed

Oshima, Masao; Kikuchi, Rie; Imamura, Jun; Handa, Hirokazu

2010-01-01

CMS (cytoplasmic male sterile) rapeseed is produced by asymmetrical somatic cell fusion between the Brassica napus cv. Westar and the Raphanus sativus Kosena CMS line (Kosena radish). The CMS rapeseed contains a CMS gene, orf125, which is derived from Kosena radish. Our sequence analyses revealed that the orf125 region in CMS rapeseed originated from recombination between the orf125/orfB region and the nad1C/ccmFN1 region by way of a 63 bp repeat. A precise sequence comparison among the related sequences in CMS rapeseed, Kosena radish and normal rapeseed showed that the orf125 region in CMS rapeseed consisted of the Kosena orf125/orfB region and the rapeseed nad1C/ccmFN1 region, even though Kosena radish had both the orf125/orfB region and the nad1C/ccmFN1 region in its mitochondrial genome. We also identified three tandem repeat sequences in the regions surrounding orf125, including a 63 bp repeat, which were involved in several recombination events. Interestingly, differences in the recombination activity for each repeat sequence were observed, even though these sequences were located adjacent to each other in the mitochondrial genome. We report results indicating that recombination events within the mitochondrial genomes are regulated at the level of specific repeat sequences depending on the cellular environment.
Analysis of Two Cosmid Clones from Chromosome 4 of Drosophila melanogaster Reveals Two New Genes Amid an Unusual Arrangement of Repeated Sequences

PubMed Central

Locke, John; Podemski, Lynn; Roy, Ken; Pilgrim, David; Hodgetts, Ross

1999-01-01

Chromosome 4 from Drosophila melanogaster has several unusual features that distinguish it from the other chromosomes. These include a diffuse appearance in salivary gland polytene chromosomes, an absence of recombination, and the variegated expression of P-element transgenes. As part of a larger project to understand these properties, we are assembling a physical map of this chromosome. Here we report the sequence of two cosmids representing ∼5% of the polytenized region. Both cosmid clones contain numerous repeated DNA sequences, as identified by cross hybridization with labeled genomic DNA, BLAST searches, and dot matrix analysis, which are positioned between and within the transcribed sequences. The repetitive sequences include three copies of the mobile element Hoppel, one copy of the mobile element HB, and 18 DINE repeats. DINE is a novel, short repeated sequence dispersed throughout both cosmid sequences. One cosmid includes the previously described cubitus interruptus (ci) gene and two new genes: a gene with a predicted amino acid sequence similar to ribosomal protein S3a, which is consistent with the Minute(4)101 locus thought to be in the region, and a novel member of the protein family that includes plexin and met–hepatocyte growth factor receptor. The other cosmid contains only the two short 5′-most exons from the zinc-finger-homolog-2 (zfh-2) gene. This is the first extensive sequence analysis of noncoding DNA from chromosome 4. The distribution of the various repeats suggests its organization is similar to the β-heterochromatic regions near the base of the major chromosome arms. Such a pattern may account for the diffuse banding of the polytene chromosome 4 and the variegation of many P-element transgenes on the chromosome. PMID:10022978
Experimental definition of a clustered regularly interspaced short palindromic duplicon in Escherichia coli.

PubMed

Goren, Moran G; Yosef, Ido; Auster, Oren; Qimron, Udi

2012-10-12

We analyzed sequences of newly inserted repeats in an Escherichia coli CRISPR (clustered regularly interspaced short palindromic repeats) array in vivo and showed that a base previously thought to belong to the repeat is actually derived from a protospacer. Based on further experimental results, we propose to use the term "duplicon" for a repeated sequence in a CRISPR array that serves as a template for a new duplicon. Our findings suggest the possibility of redrawing the borders between repeats, spacers, and protospacer adjacent motifs.
Phylogeny and strain typing of Escherichia coli, inferred from variation at mononucleotide repeat loci.

PubMed

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M; Kashi, Yechezkel

2004-04-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria.
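To make the notion of a mononucleotide repeat (MNR) tract concrete, here is a minimal sketch that reports runs of a single base in a DNA string. The minimum run length of 6 and the toy sequence are assumptions made for illustration; the record does not state the threshold the authors used.

```python
import re

def mononucleotide_runs(seq, min_len=6):
    """Return (start, base, length) for each run of one base >= min_len long."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]

# Toy sequence invented for illustration: a 7-nt A tract and a 6-nt T tract.
toy = "ACGT" + "A" * 7 + "GC" + "T" * 6 + "CGA"
runs = mononucleotide_runs(toy, min_len=6)
```

Comparing tract lengths and flanking bases at the same locus across strains would then give the kind of per-locus variation the study uses for typing.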
Phylogeny and Strain Typing of Escherichia coli, Inferred from Variation at Mononucleotide Repeat Loci

PubMed Central

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M.; Kashi, Yechezkel

2004-01-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria. PMID:15066845
Characterization and assessment of an avian repetitive DNA sequence as an icterid phylogenetic marker.

PubMed

Quinn, J S; Guglich, E; Seutin, G; Lau, R; Marsolais, J; Parna, L; Boag, P T; White, B N

1992-02-01

The first tandemly repeated sequence examined in a passerine bird, a 431-bp PstI fragment named pMAT1, has been cloned from the genome of the brown-headed cowbird (Molothrus ater). The sequence represents about 5-10% of the genome (about 4 × 10^5 copies) and yields prominent ethidium bromide-stained bands when genomic DNA cut with a variety of restriction enzymes is electrophoresed in agarose gels. A particularly striking ladder of fragments is apparent when the DNA is cut with HinfI, indicative of a tandem arrangement of the monomer. The cloned PstI monomer has been sequenced, revealing no internal repeated structure. There are sequences that hybridize with pMAT1 found in related nine-primaried oscines but not in more distantly related oscines, suboscines, or nonpasserine species. Little sequence similarity to tandemly repeated PstI cut sequences from the merlin (Falco columbarius), sarus crane (Grus antigone), or Puerto Rican parrot (Amazona vittata) or to HinfI digested sequence from the Toulouse goose (Anser anser) was detected. The isolated sequence was used as a probe to examine DNA samples of eight members of the tribe Icterini. This examination revealed phylogenetically informative characters. The repeat contains cutting sites from a number of restriction enzymes, which, if sufficiently polymorphic, would provide new phylogenetic characters. Sequences like these, conserved within a species, but variable between closely related species, may be very useful for phylogenetic studies of closely related taxa.

“One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

PubMed Central

2014-01-01

Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a Perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
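For readers unfamiliar with the .out file discussed above, the sketch below reads its whitespace-separated annotation lines and counts raw hits per repeat name. It assumes the usual column layout of the RepeatMasker .out file (score, divergence, deletions, insertions, query, begin, end, left, strand, repeat name, class/family, then positions in the repeat and an ID). It is written in Python rather than Perl, is not the published tool, and deliberately does not attempt the harder step the abstract describes, namely merging several hits into one TE copy. The repeat names and coordinates in the toy input are invented.

```python
def parse_repeatmasker_out(lines):
    """Yield one dict per annotation line of a RepeatMasker .out file."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not a number).
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        yield {
            "score": int(fields[0]),
            "perc_div": float(fields[1]),
            "query": fields[4],
            "start": int(fields[5]),
            "end": int(fields[6]),
            "strand": "-" if fields[8] == "C" else "+",  # "C" marks complement
            "repeat": fields[9],
            "class_family": fields[10],
        }

def hits_per_repeat(records):
    """Count raw hits per repeat name (hits, not reconstructed copies)."""
    counts = {}
    for r in records:
        counts[r["repeat"]] = counts.get(r["repeat"], 0) + 1
    return counts

# Invented toy input in the assumed layout; TE_A and TE_B are made-up names.
toy_out = """\
   SW   perc perc perc  query     position in query     matching  repeat        position in repeat
score   div. del. ins.  sequence  begin  end   (left)   repeat    class/family  begin  end  (left)  ID

  1200  10.5  0.0  1.2  seq1   1000  1500  (8500)  +  TE_A  LTR/Gypsy   1    500  (4500)  1
   900  12.0  2.1  0.0  seq1   1600  2100  (7900)  +  TE_A  LTR/Gypsy   900  1400 (3600)  1
   450   5.0  0.0  0.0  seq1   5000  5200  (4800)  C  TE_B  DNA/hAT     (0)  200  1       2
""".splitlines()

records = list(parse_repeatmasker_out(toy_out))
counts = hits_per_repeat(records)
```

In the toy input the two TE_A hits share the same ID and lie close together, which is the kind of situation where a hit count (2) overstates the number of copies (plausibly 1).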
Effects of "D"-Amphetamine and Ethanol on Variable and Repetitive Key-Peck Sequences in Pigeons

ERIC Educational Resources Information Center

Ward, Ryan D.; Bailey, Ericka M.; Odum, Amy L.

2006-01-01

This experiment assessed the effects of "d"-Amphetamine and ethanol on reinforced variable and repetitive key-peck sequences in pigeons. Pigeons responded on two keys under a multiple schedule of Repeat and Vary components. In the Repeat component, completion of a target sequence of right, right, left, left resulted in food. In the Vary component,…
The mitochondrial genome of the legume Vigna radiata and the analysis of recombination across short mitochondrial repeats.

PubMed

Alverson, Andrew J; Zhuo, Shi; Rice, Danny W; Sloan, Daniel B; Palmer, Jeffrey D

2011-01-20

The mitochondrial genomes of seed plants are exceptionally fluid in size, structure, and sequence content, with the accumulation and activity of repetitive sequences underlying much of this variation. We report the first fully sequenced mitochondrial genome of a legume, Vigna radiata (mung bean), and show that despite its unexceptional size (401,262 nt), the genome is unusually depauperate in repetitive DNA and "promiscuous" sequences from the chloroplast and nuclear genomes. Although Vigna lacks the large, recombinationally active repeats typical of most other seed plants, a PCR survey of its modest repertoire of short (38-297 nt) repeats nevertheless revealed evidence for recombination across all of them. A set of novel control assays showed, however, that these results could instead reflect, in part or entirely, artifacts of PCR-mediated recombination. Consequently, we recommend that other methods, especially high-depth genome sequencing, be used instead of PCR to infer patterns of plant mitochondrial recombination. The average-sized but repeat- and feature-poor mitochondrial genome of Vigna makes it ever more difficult to generalize about the factors shaping the size and sequence content of plant mitochondrial genomes.
Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

USDA-ARS's Scientific Manuscript database

Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Genetic diversity in intraspecific hybrid populations of Eucommia ulmoides Oliver evaluated from ISSR and SRAP molecular marker analysis.

PubMed

Yu, J; Wang, Y; Ru, M; Peng, L; Liang, Z S

2015-07-03

Eucommia ulmoides Oliver, the only extant species of Eucommiaceae, is a second-category state-protected endangered plant in China. Evaluation of genetic diversity among some intraspecific hybrid populations of E. ulmoides Oliver is vital for breeding programs and further conservation of this rare species. We studied the genetic diversity of 130 accessions from 13 E. ulmoides intraspecific hybrid populations using inter-simple sequence repeat (ISSR) and sequence-related amplified polymorphism (SRAP) markers. Of the 100 ISSR primers and 100 SRAP primer combinations screened, eight ISSRs and eight SRAPs were used to evaluate the level of polymorphism and discriminating capacity. A total of 65 bands were amplified using eight ISSR primers, in which 50 bands (76.9%) were polymorphic, with an average of 8.1 polymorphic fragments per primer. Alternatively, another 244 bands were observed using eight SRAP primer combinations, and 163 (66.8%) of them were polymorphic, with an average of 30.5 polymorphic fragments per primer. The unweighted pair-group method (UPGMA) analysis showed that these 13 populations could be classified into three groups by the ISSR marker and two groups by the SRAP marker. Principal coordinate analysis using SRAP was completely identical to the UPGMA-based clustering, although this was partly confirmed by the results of UPGMA cluster analysis using the ISSR marker. This study provides insights into the genetic background of E. ulmoides intraspecific hybrids. The progenies of the variations "Huazhong-3", "big fruit", "Yanci", and "smooth bark" present high genetic diversity and offer great potential for E. ulmoides breeding and conservation.
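The polymorphism percentages quoted above follow directly from the band counts, as this short worked example shows. The helper names are hypothetical; the counts (65 and 50 ISSR bands, 244 and 163 SRAP bands, eight primers or primer combinations each) are taken from the abstract.

```python
def percent_polymorphic(polymorphic, total):
    """Share of scored bands that are polymorphic, in percent (1 decimal)."""
    return round(100.0 * polymorphic / total, 1)

def bands_per_primer(total, primers):
    """Mean number of scored bands per primer (1 decimal)."""
    return round(total / primers, 1)

issr_pct = percent_polymorphic(50, 65)     # ISSR: 50 of 65 bands
srap_pct = percent_polymorphic(163, 244)   # SRAP: 163 of 244 bands
issr_mean = bands_per_primer(65, 8)        # total ISSR bands over 8 primers
srap_mean = bands_per_primer(244, 8)       # total SRAP bands over 8 combinations
```

The per-primer averages reported in the abstract (8.1 and 30.5) coincide with the total band counts divided by eight.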
Discovery of Escherichia coli CRISPR sequences in an undergraduate laboratory.

PubMed

Militello, Kevin T; Lazatin, Justine C

2017-05-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) represent a novel type of adaptive immune system found in eubacteria and archaebacteria. CRISPRs have recently generated a lot of attention due to their unique ability to catalog foreign nucleic acids, their ability to destroy foreign nucleic acids in a mechanism that shares some similarity to RNA interference, and the ability to utilize reconstituted CRISPR systems for genome editing in numerous organisms. In order to introduce CRISPR biology into an undergraduate upper-level laboratory, a five-week set of exercises was designed to allow students to examine the CRISPR status of uncharacterized Escherichia coli strains and to allow the discovery of new repeats and spacers. Students started the project by isolating genomic DNA from E. coli and amplifying the iap CRISPR locus using the polymerase chain reaction (PCR). The PCR products were analyzed by Sanger DNA sequencing, and the sequences were examined for the presence of CRISPR repeat sequences. The regions between the repeats, the spacers, were extracted and analyzed with BLASTN searches. Overall, CRISPR loci were sequenced from several previously uncharacterized E. coli strains and one E. coli K-12 strain. Sanger DNA sequencing resulted in the discovery of 36 spacer sequences and their corresponding surrounding repeat sequences. Five of the spacers were homologous to foreign (non-E. coli) DNA. Assessment of the laboratory indicates that improvements were made in the ability of students to answer questions relating to the structure and function of CRISPRs. Future directions of the laboratory are presented and discussed. © 2016 by The International Union of Biochemistry and Molecular Biology, 45(3):262-269, 2017.
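The step of extracting the spacers that lie between repeats can be sketched in a few lines. This example assumes the repeat sequence is already known and matches exactly; the 8-nt "repeat" and both spacers are invented to keep the example short, and real CRISPR repeats are longer and may differ slightly between copies.

```python
def extract_spacers(seq, repeat):
    """Return the sequences lying between consecutive exact copies of `repeat`."""
    starts = []
    i = seq.find(repeat)
    while i != -1:
        starts.append(i)
        i = seq.find(repeat, i + len(repeat))  # continue after this copy
    return [seq[a + len(repeat):b] for a, b in zip(starts, starts[1:])]

# Invented toy array: three copies of a made-up repeat around two spacers.
repeat = "GTTCCGCA"
array = "AT" + repeat + "ACGTACGTAC" + repeat + "TTGGCCAATT" + repeat + "GG"
spacers = extract_spacers(array, repeat)
```

Each extracted spacer could then be submitted to a BLASTN search, as the students did, to look for matches to foreign DNA.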
Sequence investigation of 34 forensic autosomal STRs with massively parallel sequencing.

PubMed

Zhang, Suhua; Niu, Yong; Bian, Yingnan; Dong, Rixia; Liu, Xiling; Bao, Yun; Jin, Chao; Zheng, Hancheng; Li, Chengtao

2018-05-01

STRs vary not only in the length of the repeat units and the number of repeats but also in the region with which they conform to an incremental repeat pattern. Massively parallel sequencing (MPS) offers new possibilities in the analysis of STRs since it can simultaneously sequence multiple targets in a single reaction and capture potential internal sequence variations. Here, we sequenced 34 STRs applied in the forensic community of China with a custom-designed panel. MPS performance was evaluated by sequencing reads analysis, concordance study and sensitivity testing. High coverage sequencing data were obtained to determine the constitute ratios and heterozygous balance. No actual inconsistent genotypes were observed between capillary electrophoresis (CE) and MPS, demonstrating the reliability of the panel and the MPS technology. With the sequencing data from the 200 investigated individuals, 346 and 418 alleles were obtained via CE and MPS technologies, respectively, at the 34 STRs, indicating that MPS technology provides higher discrimination than CE detection. The whole study demonstrated that STR genotyping with the custom panel and MPS technology has the potential not only to reveal length and sequence variations but also to satisfy the demands of high throughput and high multiplexing with acceptable sensitivity.
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence

PubMed Central

2017-01-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. PMID:28223399
Repeating aftershocks of the great 2004 Sumatra and 2005 Nias earthquakes

NASA Astrophysics Data System (ADS)

Yu, Wen-che; Song, Teh-Ru Alex; Silver, Paul G.

2013-05-01

We investigate repeating aftershocks associated with the great 2004 Sumatra-Andaman (Mw 9.2) and 2005 Nias-Simeulue (Mw 8.6) earthquakes by cross-correlating waveforms recorded by the regional seismographic station PSI and teleseismic stations. We identify 10 and 18 correlated aftershock sequences associated with the great 2004 Sumatra and 2005 Nias earthquakes, respectively. The majority of the correlated aftershock sequences are located near the down-dip end of a large afterslip patch. We determine the precise relative locations of event pairs among these sequences and estimate the source rupture areas. The correlated event pairs identified are appropriately referred to as repeating aftershocks, in that the source rupture areas are comparable and significantly overlap within a sequence. We use the repeating aftershocks to estimate afterslip based on the slip-seismic moment scaling relationship and to infer the temporal decay rate of the recurrence interval. The estimated afterslip resembles that measured from the near-field geodetic data to the first order. The decay rate of repeating aftershocks as a function of lapse time t follows a power-law decay 1/t^p with the exponent p in the range 0.8-1.1. Both types of observations indicate that repeating aftershocks are governed by post-seismic afterslip.
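A power-law decay of the form 1/t^p appears as a straight line with slope -p on log-log axes, so the exponent can be estimated by a linear fit of log(rate) against log(time). The sketch below illustrates that idea on synthetic, noise-free values generated with p = 0.9; the numbers are not data from the study, and the authors' actual fitting procedure is not described in this record.

```python
import math

def fit_power_law_exponent(times, rates):
    """Least-squares slope of log(rate) vs log(time); returns p for rate ~ t**(-p)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic rates following 5 * t**(-0.9) (illustrative values only).
times = [1.0, 3.0, 10.0, 30.0, 100.0]
rates = [5.0 * t ** -0.9 for t in times]
p_hat = fit_power_law_exponent(times, rates)
```

With real, noisy recurrence data the estimate would carry an uncertainty, which is consistent with the abstract reporting p as a range rather than a single value.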
Genome-Wide Stochastic Adaptive DNA Amplification at Direct and Inverted DNA Repeats in the Parasite Leishmania

PubMed Central

Plourde, Marie; Gingras, Hélène; Roy, Gaétan; Lapointe, Andréanne; Leprohon, Philippe; Papadopoulou, Barbara; Corbeil, Jacques; Ouellette, Marc

2014-01-01

Gene amplification of specific loci has been described in all kingdoms of life. In the protozoan parasite Leishmania, the product of amplification is usually part of extrachromosomal circular or linear amplicons that are formed at the level of direct or inverted repeated sequences. A bioinformatics screen revealed that repeated sequences are widely distributed in the Leishmania genome and the repeats are chromosome-specific, conserved among species, and generally present in low copy number. Using sensitive PCR assays, we provide evidence that the Leishmania genome is continuously being rearranged at the level of these repeated sequences, which serve as a functional platform for constitutive and stochastic amplification (and deletion) of genomic segments in the population. This process is adaptive as the copy number of advantageous extrachromosomal circular or linear elements increases upon selective pressure and is reversible when selection is removed. We also provide mechanistic insights on the formation of circular and linear amplicons through RAD51 recombinase-dependent and -independent mechanisms, respectively. The whole genome of Leishmania is thus stochastically rearranged at the level of repeated sequences, and the selection of parasite subpopulations with changes in the copy number of specific loci is used as a strategy to respond to a changing environment. PMID:24844805
Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population.

PubMed

Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao

2018-05-01

Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we also narrowed the catalog down to 218 suitable STR loci with four basepair repeat units and 53 loci with five basepair repeat units, suitable for both short-read sequencing and PCR-based technologies, which would be candidates for actual forensic DNA typing in the Japanese population.
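To make the notion of STR loci with two to six basepair repeat units concrete, here is a minimal Python sketch that scans a single sequence for perfect tandem repeats using a regular expression. It is an illustrative toy, not the genotyping pipeline used in the study; the function name `find_strs`, the `min_copies` threshold, and the example sequence are assumptions.

```python
import re

def find_strs(seq, min_copies=3):
    """Return (start, unit, copies) for perfect tandem repeats with 2-6 bp units."""
    # A lazily matched 2-6 bp unit followed by at least (min_copies - 1) more copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAA, which are not 2-6 bp STRs.
            continue
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Hypothetical example: five copies of GATA and four copies of CA.
example = "GG" + "GATA" * 5 + "CC" + "CA" * 4 + "TT"
print(find_strs(example))
```

Real STR genotyping from short reads also has to handle interrupted repeats, flanking-sequence alignment, and reads that do not span the whole locus, none of which this sketch attempts.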
Comparative molecular cytogenetics of major repetitive sequence families of three Dendrobium species (Orchidaceae) from Bangladesh

PubMed Central

Begum, Rabeya; Alam, Sheikh Shamimul; Menzel, Gerhard; Schmidt, Thomas

2009-01-01

Background and Aims Dendrobium species show tremendous morphological diversity and have broad geographical distribution. As repetitive sequence analysis is a useful tool to investigate the evolution of chromosomes and genomes, the aim of the present study was the characterization of repetitive sequences from Dendrobium moschatum for comparative molecular and cytogenetic studies in the related species Dendrobium aphyllum, Dendrobium aggregatum and representatives from other orchid genera. Methods In order to isolate highly repetitive sequences, a c0t-1 DNA plasmid library was established. Repeats were sequenced and used as probes for Southern hybridization. Sequence divergence was analysed using bioinformatic tools. Repetitive sequences were localized along orchid chromosomes by fluorescence in situ hybridization (FISH). Key Results Characterization of the c0t-1 library resulted in the detection of repetitive sequences including the (GA)n dinucleotide DmoO11, numerous Arabidopsis-like telomeric repeats and the highly amplified dispersed repeat DmoF14. The DmoF14 repeat is conserved in six Dendrobium species but diversified in representative species of three other orchid genera. FISH analyses showed the genome-wide distribution of DmoF14 in D. moschatum, D. aphyllum and D. aggregatum. Hybridization with the telomeric repeats demonstrated Arabidopsis-like telomeres at the chromosome ends of Dendrobium species. However, FISH using the telomeric probe revealed two pairs of chromosomes with strong intercalary signals in D. aphyllum. FISH showed the terminal position of 5S and 18S–5·8S–25S rRNA genes and a characteristic number of rDNA sites in the three Dendrobium species. Conclusions The repeated sequences isolated from D. moschatum c0t-1 DNA constitute major DNA families of the D. moschatum, D. aphyllum and D. aggregatum genomes with DmoF14 representing an ancient component of orchid genomes. 
Large intercalary telomere-like arrays suggest chromosomal rearrangements in D. aphyllum while the number and localization of rRNA genes as well as the species-specific distribution pattern of an abundant microsatellite reflect the genomic diversity of the three Dendrobium species. PMID:19635741
Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae.

PubMed

Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R

2006-12-01

Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.
Correlation between fibroin amino acid sequence and physical silk properties.

PubMed

Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

2003-09-12

The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.
Evidence for Long-Timescale Patterns of Synaptic Inputs in CA1 of Awake Behaving Mice.

PubMed

Kolb, Ilya; Talei Franzesi, Giovanni; Wang, Michael; Kodandaramaiah, Suhasa B; Forest, Craig R; Boyden, Edward S; Singer, Annabelle C

2018-02-14

Repeated sequences of neural activity are a pervasive feature of neural networks in vivo and in vitro. In the hippocampus, sequential firing of many neurons over periods of 100-300 ms reoccurs during behavior and during periods of quiescence. However, it is not known whether the hippocampus produces longer sequences of activity or whether such sequences are restricted to specific network states. Furthermore, whether long repeated patterns of activity are transmitted to single cells downstream is unclear. To answer these questions, we recorded intracellularly from hippocampal CA1 of awake, behaving male mice to examine both subthreshold activity and spiking output in single neurons. In eight of nine recordings, we discovered long (900 ms) reoccurring subthreshold fluctuations or "repeats." Repeats generally were high-amplitude, nonoscillatory events reoccurring with 10 ms precision. Using statistical controls, we determined that repeats occurred more often than would be expected from unstructured network activity (e.g., by chance). Most spikes occurred during a repeat, and when a repeat contained a spike, the spike reoccurred with precision on the order of ≤20 ms, showing that long repeated patterns of subthreshold activity are strongly connected to spike output. Unexpectedly, we found that repeats occurred independently of classic hippocampal network states like theta oscillations or sharp-wave ripples. Together, these results reveal surprisingly long patterns of repeated activity in the hippocampal network that occur nonstochastically, are transmitted to single downstream neurons, and strongly shape their output. This suggests that the timescale of information transmission in the hippocampal network is much longer than previously thought. SIGNIFICANCE STATEMENT We found long (≥900 ms), repeated, subthreshold patterns of activity in CA1 of awake, behaving mice. These repeated patterns ("repeats") occurred more often than expected by chance and with 10 ms precision. 
Most spikes occurred within repeats and reoccurred with a precision on the order of 20 ms. Surprisingly, there was no correlation between repeat occurrence and classical network states such as theta oscillations and sharp-wave ripples. These results provide strong evidence that long patterns of activity are repeated and transmitted to downstream neurons, suggesting that the hippocampus can generate longer sequences of repeated activity than previously thought.
CRF: detection of CRISPR arrays using random forest.

PubMed

Wang, Kai; Liang, Chun

2017-01-01

CRISPRs (clustered regularly interspaced short palindromic repeats) are particular repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we developed a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, which enhances detection accuracy. In particular, triplet elements that combine both sequence content and structure information were extracted from CRISPR repeats for classifier training. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for robust data visualization that is not available among other CRISPR detection tools. After detection, the query sequence, CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be visualized for visual examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
Expanded complexity of unstable repeat diseases

PubMed Central

Polak, Urszula; McIvor, Elizabeth; Dent, Sharon Y.R.; Wells, Robert D.; Napierala, Marek

2015-01-01

Unstable Repeat Diseases (URDs) share a common mutational phenomenon of changes in the copy number of short, tandemly repeated DNA sequences. More than 20 human neurological diseases are caused by instability, predominantly expansion, of microsatellite sequences. Changes in the repeat size initiate a cascade of pathological processes, frequently characteristic of a unique disease or a small subgroup of the URDs. Understanding of both the mechanism of repeat instability and molecular consequences of the repeat expansions is critical to developing successful therapies for these diseases. Recent technological breakthroughs in whole genome, transcriptome and proteome analyses will almost certainly lead to new discoveries regarding the mechanisms of repeat instability, the pathogenesis of URDs, and will facilitate development of novel therapeutic approaches. The aim of this review is to give a general overview of unstable repeats diseases, highlight the complexities of these diseases, and feature the emerging discoveries in the field. PMID:23233240

Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
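The figures quoted in this abstract allow a quick back-of-the-envelope check. The short Python sketch below derives the genome size implied by "17.38 megabases ... about 17% of the genome" and the average gene density on chromosome 4. The variable names are chosen for this example, and because the 17% figure is approximate in the source, the implied genome size is only a rough estimate.

```python
# Values taken from the abstract; 0.17 is the approximate fraction "about 17%".
unique_sequence_mb = 17.38
genome_fraction = 0.17
protein_coding_genes = 3744

# Rough genome size implied by the two figures above.
implied_genome_mb = unique_sequence_mb / genome_fraction

# Average gene density and average spacing on the sequenced portion.
genes_per_mb = protein_coding_genes / unique_sequence_mb
kb_per_gene = unique_sequence_mb * 1000 / protein_coding_genes

print(f"implied genome size: about {implied_genome_mb:.0f} Mb")
print(f"gene density: about {genes_per_mb:.0f} genes per Mb")
print(f"average spacing: about {kb_per_gene:.2f} kb per gene")
```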
Characterization of species-specific repeated DNA sequences from B. nigra.

PubMed

Gupta, V; Lakshmisita, G; Shaila, M S; Jagannathan, V; Lakshmikumaran, M S

1992-07-01

The construction and characterization of two genome-specific recombinant DNA clones from B. nigra are described. Southern analysis showed that the two clones belong to a dispersed repeat family. They differ from each other in their length, distribution and sequence, though the average GC content is nearly the same (45%). These B genome-specific repeats have been used to analyse the phylogenetic relationships between cultivated and wild species of the family Brassicaceae.
[Convergent origin of repeats in genes coding for globular proteins. An analysis of the factors determining the presence of inverted and symmetrical repeats].

PubMed

Solov'ev, V V; Kel', A E; Kolchanov, N A

1989-01-01

The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. The analysis of symmetrical repeats revealed an interesting property of the genetic code: pairs of symmetrical codons correspond to pairs of amino acids with mostly similar physicochemical parameters. This property may explain why symmetrical repeats and palindromes are present only in genes coding for beta-structural proteins (polypeptides), where amino acids with similar physicochemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences was used to analyse inverted repeats. The modelling demonstrated that a constraint on the sequences alone (uneven codon usage frequencies) is enough for nonrandom inverted repeats to arise in genes.
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain

PubMed Central

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-01-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. PMID:24792163
ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

PubMed Central

Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

2004-01-01

We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
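The abstract describes a hash-based search for short repeats across multiple species. The Python sketch below shows the general idea with an ordinary dictionary of fixed-length substrings (k-mers). It is an assumption-laden illustration of hash-based indexing in general, not the ACMES hash function or its information retrieval component; the function names and toy sequences are hypothetical.

```python
from collections import defaultdict

def index_kmers(genomes, k):
    """Map every k-mer to the list of (species, position) where it occurs."""
    index = defaultdict(list)
    for species, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((species, i))
    return index

def repeated_kmers(index, min_hits=2):
    """Keep only k-mers seen at least min_hits times, within or across species."""
    return {kmer: hits for kmer, hits in index.items() if len(hits) >= min_hits}

# Hypothetical toy "genomes" for two species.
genomes = {"sp1": "ACGTACGT", "sp2": "TTACGA"}
repeats = repeated_kmers(index_kmers(genomes, k=4))
for kmer, hits in repeats.items():
    print(kmer, hits)
```

A dictionary lookup is constant time on average, which is why hashing suits exact repeat searches; approximate repeats would need a different approach.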
Genetic characterization of UCS region of Pneumocystis jirovecii and construction of allelic profiles of Indian isolates based on sequence typing at three regions.

PubMed

Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Kumar, Lalit; Luthra, Kalpana; Agarwal, Sanjay Kumar; Sreenivas, Vishnubhatla

2013-01-01

Pneumocystis jirovecii is an opportunistic pathogen that causes severe pneumonia in immunocompromised patients. To study the genetic diversity of P. jirovecii in India the upstream conserved sequence (UCS) region of Pneumocystis genome was amplified, sequenced and genotyped from a set of respiratory specimens obtained from 50 patients with a positive result for nested mitochondrial large subunit ribosomal RNA (mtLSU rRNA) PCR during the years 2005-2008. Of these 50 cases, 45 showed a positive PCR for UCS region. Variations in the tandem repeats in UCS region were characterized by sequencing all the positive cases. Of the 45 cases, one case showed five repeats, 11 cases showed four repeats, 29 cases showed three repeats and four cases showed two repeats. By running amplified DNA from all these cases on a high-resolution gel, mixed infection was observed in 12 cases (26.7%, 12/45). Forty three of 45 cases included in this study had previously been typed at mtLSU rRNA and internal transcribed spacer (ITS) region by our group. In the present study, the genotypes at those two regions were combined with UCS repeat patterns to construct allelic profiles of 43 cases. A total of 36 allelic profiles were observed in 43 isolates indicating high genetic variability. A statistically significant association was observed between mtLSU rRNA genotype 1, ITS type Ea and UCS repeat pattern 4.
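The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers stated above; the variable names are chosen for this example.

```python
# Number of cases by UCS tandem repeat count, as stated in the abstract.
cases_by_repeat_count = {5: 1, 4: 11, 3: 29, 2: 4}
ucs_positive = sum(cases_by_repeat_count.values())

# Mixed infections as a percentage of UCS-positive cases.
mixed_infections = 12
mixed_pct = round(100 * mixed_infections / ucs_positive, 1)

# Distinct allelic profiles as a percentage of the typed isolates.
allelic_profiles = 36
typed_isolates = 43
distinct_profile_pct = round(100 * allelic_profiles / typed_isolates, 1)

print(ucs_positive, mixed_pct, distinct_profile_pct)
```

The repeat-count categories sum to the 45 UCS-positive cases, and 12 of 45 reproduces the reported 26.7%.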
Evolution and selection of Rhg1, a copy-number variant nematode-resistance locus

PubMed Central

Lee, Tong Geon; Kumar, Indrajit; Diers, Brian W; Hudson, Matthew E

2015-01-01

The soybean cyst nematode (SCN) resistance locus Rhg1 is a tandem repeat of a 31.2 kb unit of the soybean genome. Each 31.2-kb unit contains four genes. One allele of Rhg1, Rhg1-b, is responsible for protecting most US soybean production from SCN. Whole-genome sequencing was performed, and PCR assays were developed to investigate allelic variation in sequence and copy number of the Rhg1 locus across a population of soybean germplasm accessions. Four distinct sequences of the 31.2-kb repeat unit were identified, and some Rhg1 alleles carry up to three different types of repeat unit. The total number of copies of the repeat varies from 1 to 10 per haploid genome. Both copy number and sequence of the repeat correlate with the resistance phenotype, and the Rhg1 locus shows strong signatures of selection. Significant linkage disequilibrium in the genome outside the boundaries of the repeat allowed the Rhg1 genotype to be inferred using high-density single nucleotide polymorphism genotyping of 15 996 accessions. Over 860 germplasm accessions were found likely to possess Rhg1 alleles. The regions surrounding the repeat show indications of non-neutral evolution and high genetic variability in populations from different geographic locations, but without evidence of fixation of the resistant genotype. A compelling explanation of these results is that balancing selection is in operation at Rhg1. PMID:25735447
SSR allelic variation in almond (Prunus dulcis Mill.).

PubMed

Xie, Hua; Sui, Yi; Chang, Feng-Qi; Xu, Yong; Ma, Rong-Cai

2006-01-01

Sixteen SSR markers including eight EST-SSR and eight genomic SSRs were used for genetic diversity analysis of 23 Chinese and 15 international almond cultivars. EST- and genomic SSR markers previously reported in species of Prunus, mainly peach, proved to be useful for almond genetic analysis. DNA sequences of 117 alleles of six of the 16 SSR loci were analysed to reveal sequence variation among the 38 almond accessions. For the four SSR loci with AG/CT repeats, no insertions or deletions were observed in the flanking regions of the 98 alleles sequenced. Allelic size variation of these loci resulted exclusively from differences in the structures of repeat motifs, which involved interruptions or occurrences of new motif repeats in addition to varying number of AG/CT repeats. Some alleles had a high number of uninterrupted repeat motifs, indicating that SSR mutational patterns differ among alleles at a given SSR locus within the almond species. Allelic homoplasy was observed in the SSR loci because of base substitutions, interruptions or compound repeat motifs. Substitutions in the repeat regions were found at two SSR loci, suggesting that point mutations operate on SSRs and hinder the further SSR expansion by introducing repeat interruptions to stabilize SSR loci. Furthermore, it was shown that some potential point mutations in the flanking regions are linked with new SSR repeat motif variation in almond and peach.
The 28S–18S rDNA intergenic spacer from Crithidia fasciculata: repeated sequences, length heterogeneity, putative processing sites and potential interactions between U3 small nucleolar RNA and the ribosomal RNA precursor

PubMed Central

Schnare, Murray N.; Collings, James C.; Spencer, David F.; Gray, Michael W.

2000-01-01

In Crithidia fasciculata, the ribosomal RNA (rRNA) gene repeats range in size from ∼11 to 12 kb. This length heterogeneity is localized to a region of the intergenic spacer (IGS) that contains tandemly repeated copies of a 19mer sequence. The IGS also contains four copies of an ∼55 nt repeat that has an internal inverted repeat and is also present in the IGS of Leishmania species. We have mapped the C. fasciculata transcription initiation site as well as two other reverse transcriptase stop sites that may be analogous to the A0 and A′ pre-rRNA processing sites within the 5′ external transcribed spacer (ETS) of other eukaryotes. Features that could influence processing at these sites include two stretches of conserved primary sequence and three secondary structure elements present in the 5′ ETS. We also characterized the C. fasciculata U3 snoRNA, which has the potential for base-pairing with pre-rRNA sequences. Finally, we demonstrate that biosynthesis of large subunit rRNA in both C. fasciculata and Trypanosoma brucei involves 3′-terminal addition of three A residues that are not present in the corresponding DNA sequences. PMID:10982863
Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC).

PubMed

Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M

2012-02-01

A novel clustered regularly-interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] was identified in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by a non-repetitive unique spacer region of similar length (26-31 bp) and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, all six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by a non-repetitive unique spacer region occurred in the six isolates, and the nucleotide sequences of those repeats showed approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.
Comparative Chloroplast Genomics of Gossypium Species: Insights Into Repeat Sequence Variations and Phylogeny

PubMed Central

Wu, Ying; Liu, Fang; Yang, Dai-Gang; Li, Wei; Zhou, Xiao-Jian; Pei, Xiao-Yu; Liu, Yan-Gai; He, Kun-Lun; Zhang, Wen-Sheng; Ren, Zhong-Ying; Zhou, Ke-Hai; Ma, Xiong-Feng; Li, Zhong-Hu

2018-01-01

Cotton is one of the most economically important fiber crop plants worldwide. The genus Gossypium contains a single allotetraploid group (AD) and eight diploid genome groups (A–G and K). However, the evolution of repeat sequences in the chloroplast genomes and the phylogenetic relationships of Gossypium species are unclear. Thus, we determined the variations in the repeat sequences and the evolutionary relationships of 40 cotton chloroplast genomes, which represented the most diverse in the genus, including five newly sequenced diploid species, i.e., G. nandewarense (C1-n), G. armourianum (D2-1), G. lobatum (D7), G. trilobum (D8), and G. schwendimanii (D11), and an important semi-wild race of upland cotton, G. hirsutum race latifolium (AD1). The genome structure, gene order, and GC content of cotton species were similar to those of other higher plant plastid genomes. In total, 2860 long sequence repeats (>10 bp in length) were identified, where the F-genome species had the largest number of repeats (G. longicalyx F1: 108) and E-genome species had the lowest (G. stocksii E1: 53). Large-scale repeat sequences possibly enrich the genetic information and maintain genome stability in cotton species. We also identified 10 divergence hotspot regions, i.e., rpl33-rps18, psbZ-trnG (GCC), rps4-trnT (UGU), trnL (UAG)-rpl32, trnE (UUC)-trnT (GGU), atpE, ndhI, rps2, ycf1, and ndhF, which could be useful molecular genetic markers for future population genetics and phylogenetic studies. Site-specific selection analysis showed that some of the coding sites of 10 chloroplast genes (atpB, atpE, rps2, rps3, petB, petD, ccsA, cemA, ycf1, and rbcL) were under protein sequence evolution. Phylogenetic analysis based on the whole plastomes suggested that the Gossypium species grouped into six previously identified genetic clades. Interestingly, all 13 D-genome species clustered into a strong monophyletic clade. 
Unexpectedly, the cotton species with C, G, and K-genomes were admixed and nested in a large clade, which could have been due to their recent radiation, incomplete lineage sorting, and introgression hybridization among different cotton lineages. In conclusion, the results of this study provide new insights into the evolution of repeat sequences in chloroplast genomes and interspecific relationships in the genus Gossypium. PMID:29619041
[Molecular cloning and characterization of a novel Clonorchis sinensis antigenic protein containing tandem repeat sequences].

PubMed

Liu, Qian; Xu, Xue-Nian; Zhou, Yan; Cheng, Na; Dong, Yu-Ting; Zheng, Hua-Jun; Zhu, Yong-Qiang; Zhu, Yong-Qiang

2013-08-01

To find and clone new antigen genes from the lambda-ZAP cDNA expression library of adult Clonorchis sinensis, and determine the immunological characteristics of the recombinant proteins. The cDNA expression library of adult C. sinensis was screened by pooled sera of clonorchiasis patients. The sequences of the positive phage clones were compared with the sequences in EST database, and the full-length sequence of the gene (Cs22 gene) was obtained by RT-PCR. cDNA fragments containing 2 and 3 times tandem repeat sequences were generated by jumping PCR. The sequence encoding the mature peptide or the tandem repeat sequence was respectively cloned into the prokaryotic expression vector pET28a (+), and then transformed into E. coli Rosetta DE3 cells for expression. The recombinant proteins (rCs22-2r, rCs22-3r, rCs22M-2r, and rCs22M-3r) were purified by His-bind-resin (Ni-NTA) affinity chromatography. The immunogenicity of rCs22-2r and rCs22-3r was identified by ELISA. To evaluate the immunological diagnostic value of rCs22-2r and rCs22-3r, serum samples from 35 clonorchiasis patients, 31 healthy individuals, 15 schistosomiasis patients, 15 paragonimiasis westermani patients and 13 cysticercosis patients were examined by ELISA. To locate antigenic determinants, the pooled sera of clonorchiasis patients and healthy persons were analyzed for specific antibodies by ELISA with recombinant protein rCs22M-2r and rCs22M-3r containing the tandem repeat sequences. The full-length sequence of Cs22 antigen gene of C. sinensis was obtained. It contained 13 times tandem repeat sequences of EQQDGDEEGMGGDGGRGKEKGKVEGEDGAGEQKEQA. Bioinformatics analysis indicated that the protein (Cs22) belonged to GPI-anchored proteins family. The recombinant proteins rCs22-2r and rCs22-3r showed a certain level of immunogenicity. 
With ELISA plates coated with purified PrCs22-2r or PrCs22-3r, the positive rate was 45.7% (16/35) for sera of clonorchiasis patients in both cases, and 3.2% (1/31) for sera of healthy persons. There was no cross reaction with sera of schistosomiasis and cysticercosis patients. The cross reaction with sera of paragonimiasis westermani patients was 1/15. The recombinant proteins rCs22M-2r and rCs22M-3r, which contained only tandem repeats, were specifically recognized by pooled sera of clonorchiasis patients. The Cs22 antigen gene of Clonorchis sinensis was obtained, and the recombinant proteins have some diagnostic value. The antigenic determinant is located in the tandem repeat sequences.
Centromere location in Arabidopsis is unaltered by extreme divergence in CENH3 protein sequence.

PubMed

Maheshwari, Shamoni; Ishii, Takayoshi; Brown, C Titus; Houben, Andreas; Comai, Luca

2017-03-01

During cell division, spindle fibers attach to chromosomes at centromeres. The DNA sequence at regional centromeres is fast evolving with no conserved genetic signature for centromere identity. Instead CENH3, a centromere-specific histone H3 variant, is the epigenetic signature that specifies centromere location across both plant and animal kingdoms. Paradoxically, CENH3 is also adaptively evolving. An ongoing question is whether CENH3 evolution is driven by a functional relationship with the underlying DNA sequence. Here, we demonstrate that despite extensive protein sequence divergence, CENH3 histones from distant species assemble centromeres on the same underlying DNA sequence. We first characterized the organization and diversity of centromere repeats in wild-type Arabidopsis thaliana. We show that A. thaliana CENH3-containing nucleosomes exhibit a strong preference for a unique subset of centromeric repeats. These sequences are largely missing from the genome assemblies and represent the youngest and most homogeneous class of repeats. Next, we tested the evolutionary specificity of this interaction in a background in which the native A. thaliana CENH3 is replaced with CENH3s from distant species. Strikingly, we find that CENH3 from Lepidium oleraceum and Zea mays, although specifying epigenetically weaker centromeres that result in genome elimination upon outcrossing, show a binding pattern on A. thaliana centromere repeats that is indistinguishable from the native CENH3. Our results demonstrate positional stability of a highly diverged CENH3 on independently evolved repeats, suggesting that the sequence specificity of centromeres is determined by a mechanism independent of CENH3. © 2017 Maheshwari et al.; Published by Cold Spring Harbor Laboratory Press.
Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

PubMed

Lakshmikumaran, M; Negi, M S

1994-03-01

Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.
The yeast DNA ligase gene CDC9 is controlled by six orientation specific upstream activating sequences that respond to cellular proliferation but which alone cannot mediate cell cycle regulation.

PubMed Central

White, J H; Johnson, A L; Lowndes, N F; Johnston, L H

1991-01-01

By fusing the CDC9 structural gene to the PGK upstream sequences and the CDC9 upstream to lacZ, we showed that the cell cycle expression of CDC9 is largely due to transcriptional regulation. To investigate the role of six ATGATT upstream repeats in CDC9 regulation, synthetic copies of the sequence were attached to a heterologous gene. The repeats stimulated transcription strongly and additively, but, unlike conventional yeast UAS elements, only when present in one orientation. Transcription driven by the repeats declines in cells held at START of the cell cycle or in stationary phase, as occurs with CDC9. However, the repeats by themselves cannot impart cell cycle regulation to a heterologous gene. CDC9 may therefore be controlled by an activating system operating through the repeats that is sensitive to cellular proliferation and a separate mechanism that governs the periodic expression in the cell cycle. PMID:1901644
In Vitro Expansion of CAG, CAA, and Mixed CAG/CAA Repeats.

PubMed

Figura, Grzegorz; Koscianska, Edyta; Krzyzosiak, Wlodzimierz J

2015-08-11

Polyglutamine diseases, including Huntington's disease and a number of spinocerebellar ataxias, are caused by expanded CAG repeats that are located in translated sequences of individual, functionally-unrelated genes. Only mutant proteins containing polyglutamine expansions have long been thought to be pathogenic, but recent evidence has implicated mutant transcripts containing long CAG repeats in pathogenic processes. The presence of two pathogenic factors prompted us to attempt to distinguish the effects triggered by mutant protein from those caused by mutant RNA in cellular models of polyglutamine diseases. We used the SLIP (Synthesis of Long Iterative Polynucleotide) method to generate plasmids expressing long CAG repeats (forming a hairpin structure), CAA-interrupted CAG repeats (forming multiple unstable hairpins) or pure CAA repeats (not forming any secondary structure). We successfully modified the original SLIP protocol to generate repeats of desired length starting from constructs containing short repeat tracts. We demonstrated that the SLIP method is a time- and cost-effective approach to manipulate the lengths of expanded repeat sequences.
Deep landscape update of dispersed and tandem repeats in the genome model of the red jungle fowl, Gallus gallus, using a series of de novo investigating tools.

PubMed

Guizard, Sébastien; Piégu, Benoît; Arensburger, Peter; Guillou, Florian; Bigot, Yves

2016-08-19

The program RepeatMasker and the database Repbase-ISB are part of the most widely used strategy for annotating repeats in animal genomes. They have been used to show that avian genomes have a lower repeat content (8-12 %) than the sequenced genomes of many vertebrate species (30-55 %). However, the efficiency of such library-based strategies depends on the quality and completeness of the sequences in the database that is used. An alternative to these library-based methods is to identify repeats de novo. These alternative methods have existed for at least a decade and may be more powerful than the library-based methods. We have used an annotation strategy involving several complementary de novo tools to determine the repeat content of the model genome galGal4 (1.04 Gbp), including identifying simple sequence repeats (SSRs), tandem repeats and transposable elements (TEs). We annotated over one Gbp of the galGal4 genome and showed that it is composed of approximately 19 % SSR and TE repeats. Furthermore, we estimate that the actual genome of the red jungle fowl contains about 31-35 % repeats. We find that library-based methods tend to overestimate TE diversity. These results have a major impact on the current understanding of repeat distributions throughout chromosomes in the red jungle fowl. Our results are a proof of concept of the reliability of using de novo tools to annotate repeats in large animal genomes. They have also revealed issues that will need to be resolved in order to develop gold-standard methodologies for annotating repeats in eukaryote genomes.
Rational design of alpha-helical tandem repeat proteins with closed architectures

PubMed Central

Doyle, Lindsey; Hallinan, Jazmine; Bolduc, Jill; Parmeggiani, Fabio; Baker, David; Stoddard, Barry L.; Bradley, Philip

2015-01-01

Tandem repeat proteins, which are formed by repetition of modular units of protein sequence and structure, play important biological roles as macromolecular binding and scaffolding domains, enzymes, and building blocks for the assembly of fibrous materials [1,2]. The modular nature of repeat proteins enables the rapid construction and diversification of extended binding surfaces by duplication and recombination of simple building blocks [3,4]. The overall architecture of tandem repeat protein structures – which is dictated by the internal geometry and local packing of the repeat building blocks – is highly diverse, ranging from extended, super-helical folds that bind peptide, DNA, and RNA partners [5–9], to closed and compact conformations with internal cavities suitable for small molecule binding and catalysis [10]. Here we report the development and validation of computational methods for de novo design of tandem repeat protein architectures driven purely by geometric criteria defining the inter-repeat geometry, without reference to the sequences and structures of existing repeat protein families. We have applied these methods to design a series of closed alpha-solenoid [11] repeat structures (alpha-toroids) in which the inter-repeat packing geometry is constrained so as to juxtapose the N- and C-termini; several of these designed structures have been validated by X-ray crystallography. Unlike previous approaches to tandem repeat protein engineering [12–20], our design procedure does not rely on template sequence or structural information taken from natural repeat proteins and hence can produce structures unlike those seen in nature. As an example, we have successfully designed and validated closed alpha-solenoid repeats with a left-handed helical architecture that – to our knowledge – is not yet present in the protein structure database [21]. PMID:26675735
The complete chloroplast genome sequences of Lychnis wilfordii and Silene capitata and comparative analyses with other Caryophyllaceae genomes.

PubMed

Kang, Jong-Soo; Lee, Byoung Yoon; Kwak, Myounghai

2017-01-01

The complete chloroplast genomes of Lychnis wilfordii and Silene capitata were determined and compared with ten previously reported Caryophyllaceae chloroplast genomes. The chloroplast genome sequences of L. wilfordii and S. capitata contain 152,320 bp and 150,224 bp, respectively. The gene contents and orders among 12 Caryophyllaceae species are consistent, but several microstructural changes have occurred. Expansion of the inverted repeat (IR) regions at the large single copy (LSC)/IRb and small single copy (SSC)/IR boundaries led to partial or entire gene duplications. Additionally, rearrangements of the LSC region were caused by gene inversions and/or transpositions. The 18 kb inversions, which occurred three times in different lineages of tribe Sileneae, were thought to be facilitated by the intermolecular duplicated sequences. Sequence analyses of the L. wilfordii and S. capitata genomes revealed 39 and 43 repeats, respectively, including forward, palindromic, and reverse repeats. In addition, a total of 67 and 56 simple sequence repeats were discovered in the L. wilfordii and S. capitata chloroplast genomes, respectively. Finally, we constructed phylogenetic trees of the 12 Caryophyllaceae species and two Amaranthaceae species based on 73 protein-coding genes using both maximum parsimony and likelihood methods.
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
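The percentages and the gene-flow estimate in this record can be checked with a few lines of arithmetic. The sketch below assumes the widely used relation Nm = 0.5(1 − GST)/GST; the abstract does not state which formula or software the authors used, so the function name and the formula are illustrative assumptions rather than the authors' method.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def gene_flow_from_gst(gst):
    """Estimate Nm from GST, assuming Nm = 0.5 * (1 - GST) / GST.

    This relation is an assumption made for illustration; the abstract
    reports Nm and GST but not how Nm was derived.
    """
    return 0.5 * (1.0 - gst) / gst


# Counts taken from the abstract.
ssr_est_rate = percent(133, 5548)    # microsatellite-containing ESTs
hexa_rate = percent(56, 133)         # hexanucleotide motifs
tri_rate = percent(50, 133)          # trinucleotide motifs
polymorphic_rate = percent(7, 24)    # polymorphic primer pairs

# With the rounded GST of 0.2500 the relation gives 1.5, close to the
# reported Nm of 1.4996 (the small gap is consistent with rounding of GST).
nm_estimate = gene_flow_from_gst(0.2500)
```

Under this assumed relation, an Nm above 1 is conventionally read as enough gene flow to counteract drift, which matches the abstract's interpretation.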

Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes

PubMed Central

Huang, Yongjie; Mrázek, Jan

2014-01-01

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches. PMID:24408877
Simple sequence repeat markers that identify Claviceps species and strains

USDA-ARS's Scientific Manuscript database

Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop Simple Sequence Repeat (SSRs) markers for rapid identification of C. purpurea. SSRs were desi...
Biological sequence compression algorithms.

PubMed

Matsumoto, T; Sadakane, K; Imai, H

2000-01-01

Today, more and more DNA sequences are becoming available. Information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences; they only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known: one is palindromes, or reverse complements, and the other is approximate repeats. Several algorithms specific to DNA sequences that use these structures can compress them to less than two bits per symbol. In this paper, we improve CTW so that it can exploit the characteristic structures of DNA sequences. Before encoding the next symbol, the algorithm searches for an approximate repeat or palindrome using hashing and dynamic programming. If there is a palindrome or an approximate repeat of sufficient length, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
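The "two bits per symbol" figure is the cost of a fixed-length code over the four-letter DNA alphabet, and the "palindromes" in this record are reverse-complement matches. The following is a minimal sketch of both ideas only; it is not the CTW-based algorithm of the paper, and all function names are hypothetical.

```python
# Fixed two-bit code for the four nucleotides (the baseline a DNA
# compressor has to beat).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pack_2bit(seq):
    """Pack a DNA string into an integer, two bits per base.

    Returns (packed_value, number_of_bits).
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value, 2 * len(seq)


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_reverse_palindrome(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


packed, bits = pack_2bit("ACGT")          # 0b00011011, 8 bits
rc = reverse_complement("AACG")           # "CGTT"
ecori_site = is_reverse_palindrome("GAATTC")
```

A compressor that replaces a long reverse-complement match by a (length, distance) pair, as the abstract describes, spends fewer than two bits per base on that stretch.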
Alu repeats: A source for the genesis of primate microsatellites

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arcot, S.S.; Batzer, M.A.; Wang, Zhenyuan

1995-09-01

As a result of their abundance, relatively uniform distribution, and high degree of polymorphism, microsatellites and minisatellites have become valuable tools in genetic mapping, forensic identity testing, and population studies. In recent years, a number of microsatellite repeats have been found to be associated with Alu interspersed repeated DNA elements. The association of an Alu element with a microsatellite repeat could result from the integration of an Alu element within a preexisting microsatellite repeat. Alternatively, Alu elements could have a direct role in the origin of microsatellite repeats. Errors introduced during reverse transcription of the primary transcript derived from an Alu "master" gene or the accumulation of random mutations in the middle A-rich regions and oligo(dA)-rich tails of Alu elements after insertion and subsequent expansion and contraction of these sequences could result in the genesis of a microsatellite repeat. We have tested these hypotheses by a direct evolutionary comparison of the sequences of some recent Alu elements that are found only in humans and are absent from nonhuman primates, as well as some older Alu elements that are present at orthologous positions in a number of nonhuman primates. The origin of "young" Alu insertions, absence of sequences that resemble microsatellite repeats at the orthologous loci in chimpanzees, and the gradual expansion of microsatellite repeats in some old Alu repeats at orthologous positions within the genomes of a number of nonhuman primates suggest that Alu elements are a source for the genesis of primate microsatellite repeats. 48 refs., 5 figs., 3 tabs.
Short-Sequence DNA Repeats in Prokaryotic Genomes

PubMed Central

van Belkum, Alex; Scherer, Stewart; van Alphen, Loek; Verbrugh, Henri

1998-01-01

Short-sequence DNA repeat (SSR) loci can be identified in all eukaryotic and many prokaryotic genomes. These loci harbor short or long stretches of repeated nucleotide sequence motifs. DNA sequence motifs in a single locus can be identical and/or heterogeneous. SSRs are encountered in many different branches of the prokaryote kingdom. They are found in genes encoding products as diverse as microbial surface components recognizing adhesive matrix molecules and specific bacterial virulence factors such as lipopolysaccharide-modifying enzymes or adhesins. SSRs enable genetic and consequently phenotypic flexibility. SSRs function at various levels of gene expression regulation. Variations in the number of repeat units per locus or changes in the nature of the individual repeat sequences may result from recombination processes or polymerase inadequacy such as slipped-strand mispairing (SSM), either alone or in combination with DNA repair deficiencies. These rather complex phenomena can occur with relative ease, with SSM approaching a frequency of 10⁻⁴ per bacterial cell division and allowing high-frequency genetic switching. Bacteria use this random strategy to adapt their genetic repertoire in response to selective environmental pressure. SSR-mediated variation has important implications for bacterial pathogenesis and evolutionary fitness. Molecular analysis of changes in SSRs allows epidemiological studies on the spread of pathogenic bacteria. The occurrence, evolution and function of SSRs, and the molecular methods used to analyze them are discussed in the context of responsiveness to environmental factors, bacterial pathogenicity, epidemiology, and the availability of full-genome sequences for increasing numbers of microorganisms, especially those that are medically relevant. PMID:9618442
GATA simple sequence repeats function as enhancer blocker boundaries.

PubMed

Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K

2013-01-01

Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance still remains unclear. One of the prominent SSRs, the GATA tetranucleotide repeat, has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate their role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in the transgenic as well as the native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.
CRISPRDetect: A flexible algorithm to define CRISPR arrays.

PubMed

Biswas, Ambarish; Staals, Raymond H J; Morales, Sergio E; Fineran, Peter C; Brown, Chris M

2016-05-17

CRISPR (clustered regularly interspaced short palindromic repeats) RNAs provide the specificity for noncoding RNA-guided adaptive immune defence systems in prokaryotes. CRISPR arrays consist of repeat sequences separated by specific spacer sequences. CRISPR arrays have previously been identified in a large proportion of prokaryotic genomes. However, currently available detection algorithms do not utilise recently discovered features regarding CRISPR loci. We have developed a new approach to automatically detect, predict and interactively refine CRISPR arrays. It is available as a web program and command line from bioanalysis.otago.ac.nz/CRISPRDetect. CRISPRDetect discovers putative arrays, extends the array by detecting additional variant repeats, corrects the direction of arrays, refines the repeat/spacer boundaries, and annotates different types of sequence variations (e.g. insertion/deletion) in near-identical repeats. Due to these features, CRISPRDetect has significant advantages when compared to existing identification tools. As well as further support for small, medium and large repeats, CRISPRDetect identified a class of arrays with 'extra-large' repeats in bacteria (repeats 44-50 nt). The CRISPRDetect output is integrated with other analysis tools. Notably, the predicted spacers can be directly utilised by CRISPRTarget to predict targets. CRISPRDetect enables more accurate detection of arrays and spacers and its gff output is suitable for inclusion in genome annotation pipelines and visualisation. It has been used to analyse all complete bacterial and archaeal reference genomes.
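The array structure described in this record (identical or near-identical repeats separated by spacers) can be illustrated with a deliberately naive exact-match search. This is a toy sketch and not the CRISPRDetect algorithm: it ignores variant repeats, array direction and boundary refinement, and the function name, parameters and toy-scale defaults are all hypothetical.

```python
from collections import defaultdict


def find_candidate_arrays(seq, k=8, min_copies=3, min_spacer=4, max_spacer=12):
    """Find exact k-mers that recur with regularly sized gaps.

    Returns a list of (repeat, start_positions, spacers). Real CRISPR
    repeats and spacers are longer than these toy defaults.
    """
    # Index every k-mer by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    arrays = []
    for kmer, starts in positions.items():
        if len(starts) < min_copies:
            continue
        # Gap between the end of one copy and the start of the next.
        gaps = [b - a - k for a, b in zip(starts, starts[1:])]
        if all(min_spacer <= g <= max_spacer for g in gaps):
            spacers = [seq[a + k:b] for a, b in zip(starts, starts[1:])]
            arrays.append((kmer, starts, spacers))
    return arrays


# Toy sequence: three copies of one repeat separated by two spacers.
REPEAT = "GTTCCAAT"
toy = "TT" + REPEAT + "ACGTAC" + REPEAT + "TGCATG" + REPEAT + "AA"
found = find_candidate_arrays(toy)
```

On the toy sequence the search reports one candidate whose spacers are the two inserted six-base segments; tolerating mismatches in the repeat is exactly the kind of refinement the abstract attributes to the dedicated tool.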
Divergence in centromere structure distinguishes related genomes in Coix lacryma-jobi and its wild relative.

PubMed

Han, Yonghua; Wang, Guixiang; Liu, Zhao; Liu, Jinhua; Yue, Wei; Song, Rentao; Zhang, Xueyong; Jin, Weiwei

2010-02-01

Knowledge about the composition and structure of centromeres is critical for understanding how centromeres perform their functional roles. Here, we report the sequences of one centromere-associated bacterial artificial chromosome clone from a Coix lacryma-jobi library. Two Ty3/gypsy-class retrotransposons, centromeric retrotransposon of C. lacryma-jobi (CRC) and peri-centromeric retrotransposon of C. lacryma-jobi, and a (peri)centromere-specific tandem repeat with a unit length of 153 bp were identified. The CRC is highly homologous to centromere-specific retrotransposons reported in grass species. An 80-bp DNA region in the 153-bp satellite repeat was found to be conserved to centromeric satellite repeats from maize, rice, and pearl millet. Fluorescence in situ hybridization showed that the three repetitive sequences were located in (peri-)centromeric regions of both C. lacryma-jobi and Coix aquatica. However, the 153-bp satellite repeat was only detected on 20 out of the 30 chromosomes in C. aquatica. Immunostaining with an antibody against rice CENH3 indicates that the 153-bp satellite repeat and CRC might be both the major components for functional centromeres, but not all the 153-bp satellite repeats or CRC sequences are associated with CENH3. The evolution of centromeric repeats of C. lacryma-jobi during the polyploidization was discussed.
CRISPR Detection From Short Reads Using Partial Overlap Graphs.

PubMed

Ben-Bassat, Ilan; Chor, Benny

2016-06-01

Clustered regularly interspaced short palindromic repeats (CRISPR) are structured regions in bacterial and archaeal genomes, which are part of an adaptive immune system against phages. CRISPRs are important for many microbial studies and are playing an essential role in current gene editing techniques. As such, they attract substantial research interest. The exponential growth in the amount of bacterial sequence data in recent years enables the exploration of CRISPR loci in more and more species. Most of the automated tools that detect CRISPR loci rely on fully assembled genomes. However, many assemblers do not handle repetitive regions successfully. The first tool to work directly on raw sequence data is Crass, which requires reads that are long enough to contain two copies of the same repeat. We present a method to identify CRISPR repeats from raw sequence data of short reads. The algorithm is based on an observation differentiating CRISPR repeats from other types of repeats, and it involves a series of partial constructions of the overlap graph. This enables us to avoid many of the difficulties that assemblers face, as we merely aim to identify the repeats that belong to CRISPR loci. A preliminary implementation of the algorithm shows good results and detects CRISPR repeats in cases where other existing tools fail to do so.
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants

PubMed Central

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants.

PubMed

Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

2014-01-01

Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1-6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, a relational database developed using SQL server 2008 and accessed through ASP.NET. It provides information on all three types of SSRs (perfect, imperfect and compound). At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ © The Author(s) 2014. Published by Oxford University Press.
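The definition used in this record (motifs of 1-6 nucleotides repeated in tandem) translates directly into a regular expression for perfect SSRs. The sketch below is only an illustration of that definition; it is not the mining pipeline behind ChloroSSRdb, the single `min_repeats` threshold is an assumption (published tools usually set a separate threshold per motif length), and imperfect and compound SSRs are not handled.

```python
import re


def find_perfect_ssrs(seq, min_repeats=4):
    """Return (start, motif, copies) for perfect SSRs with 1-6 nt motifs.

    The lazy quantifier makes the regex try the shortest motif first, so
    a run such as ATATATAT is reported with motif AT rather than ATAT.
    """
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Toy sequence with one dinucleotide and one trinucleotide SSR.
toy = "GGC" + "AT" * 5 + "CCT" + "AAG" * 4 + "T"
ssrs = find_perfect_ssrs(toy)
```

Positions are 0-based. Because scanning is left to right, a motif is reported in the phase in which it first matches, so the same tract can appear as AAG or GAA depending on the flanking base.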
Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

PubMed

Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

2012-01-01

The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb in the cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry, accounting for 35% of all SSRs, and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer-pairs using contig sequence data containing perfect SSR repeats, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSR repeats in the cranberry sequence for marker development.
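The density and allele figures quoted in the cranberry abstract can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative, and the mean number of alleles per locus is a derived value that the abstract itself does not report.

```python
# Figures taken from the abstract above.
ssr_loci = 107_244       # SSR loci detected
assembled_mb = 266.3     # assembled sequence in Mb
total_alleles = 323      # alleles observed across polymorphic loci
polymorphic_loci = 48    # polymorphic SSR loci

# SSR density over the assembled sequence (SSR per Mb).
density = ssr_loci / assembled_mb

# Mean alleles per polymorphic locus (derived here, not stated in the abstract).
mean_alleles = total_alleles / polymorphic_loci

print(f"{density:.1f} SSR/Mb, {mean_alleles:.1f} alleles per locus")
```

The computed density rounds to the 403 SSR/Mb given in the abstract, which confirms that the figure is expressed per megabase of assembled sequence (266.3 Mb) and not per megabase of the estimated 540-590 Mb genome.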
Target Site Recognition by a Diversity-Generating Retroelement

PubMed Central

Guo, Huatao; Tse, Longping V.; Nieh, Angela W.; Czornyj, Elizabeth; Williams, Steven; Oukil, Sabrina; Liu, Vincent B.; Miller, Jeff F.

2011-01-01

Diversity-generating retroelements (DGRs) are in vivo sequence diversification machines that are widely distributed in bacterial, phage, and plasmid genomes. They function to introduce vast amounts of targeted diversity into protein-encoding DNA sequences via mutagenic homing. Adenine residues are converted to random nucleotides in a retrotransposition process from a donor template repeat (TR) to a recipient variable repeat (VR). Using the Bordetella bacteriophage BPP-1 element as a prototype, we have characterized requirements for DGR target site function. Although sequences upstream of VR are dispensable, a 24 bp sequence immediately downstream of VR, which contains short inverted repeats, is required for efficient retrohoming. The inverted repeats form a hairpin or cruciform structure and mutational analysis demonstrated that, while the structure of the stem is important, its sequence can vary. In contrast, the loop has a sequence-dependent function. Structure-specific nuclease digestion confirmed the existence of a DNA hairpin/cruciform, and marker coconversion assays demonstrated that it influences the efficiency, but not the site of cDNA integration. Comparisons with other phage DGRs suggested that similar structures are a conserved feature of target sequences. Using a kanamycin resistance determinant as a reporter, we found that transplantation of the IMH and hairpin/cruciform-forming region was sufficient to target the DGR diversification machinery to a heterologous gene. In addition to furthering our understanding of DGR retrohoming, our results suggest that DGRs may provide unique tools for directed protein evolution via in vivo DNA diversification. PMID:22194701
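The core rule of mutagenic homing described above (adenines in the template repeat become random nucleotides in the variable repeat, while other positions are copied faithfully) can be written as a toy simulation. This is a minimal sketch under stated assumptions: substitution is drawn uniformly from the four bases, every adenine is treated as mutable, and the function name `mutagenic_homing` and the example sequence are hypothetical. The abstract does not specify the substitution spectrum.

```python
import random


def mutagenic_homing(template_repeat, rng=None):
    """Return a simulated variable repeat (VR) derived from a template repeat (TR).

    Each adenine in the TR is replaced by a base drawn uniformly from A, C, G, T
    (an assumption); all other positions are copied unchanged.
    """
    rng = rng or random.Random()
    return "".join(
        rng.choice("ACGT") if base == "A" else base
        for base in template_repeat.upper()
    )


if __name__ == "__main__":
    tr = "GACATTAGCAAC"  # hypothetical template repeat
    for seed in range(3):
        print(mutagenic_homing(tr, random.Random(seed)))
```

With k adenines in the template, this model allows up to 4^k distinct variable repeats, which gives a sense of how a short TR can generate the "vast amounts of targeted diversity" mentioned in the abstract.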
Centromere and telomere sequence alterations reflect the rapid genome evolution within the carnivorous plant genus Genlisea.

PubMed

Tran, Trung D; Cao, Hieu X; Jovtchev, Gabriele; Neumann, Pavel; Novák, Petr; Fojtová, Miloslava; Vu, Giang T H; Macas, Jiří; Fajkus, Jiří; Schubert, Ingo; Fuchs, Joerg

2015-12-01

Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

PubMed Central

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

PubMed

Oggioni, M R; Claverys, J P

1999-10-01

A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.
Analysis of SINE and LINE repeat content of Y chromosomes in the platypus, Ornithorhynchus anatinus.

PubMed

Kortschak, R Daniel; Tsend-Ayush, Enkhjargal; Grützner, Frank

2009-01-01

Monotremes feature an extraordinary sex-chromosome system that consists of five X and five Y chromosomes in males. These sex chromosomes share homology with bird sex chromosomes but no homology with the therian X. The genome of a female platypus was recently completed, providing unique insights into sequence and gene content of autosomes and X chromosomes, but no Y-specific sequence has so far been analysed. Here we report the isolation, sequencing and analysis of approximately 700 kb of sequence of the non-recombining regions of Y2, Y3 and Y5, which revealed differences in base composition and repeat content between autosomes and sex chromosomes, and within the sex chromosomes themselves. This provides the first insights into repeat content of Y chromosomes in platypus, which overall show similar patterns of repeat composition to Y chromosomes in other species. Interestingly, we also observed differences between the various Y chromosomes, and in combination with timing and activity patterns we provide an approach that can be used to examine the evolutionary history of the platypus sex-chromosome chain.
Efficient production of artificially designed gelatins with a Bacillus brevis system.

PubMed

Kajino, T; Takahashi, H; Hirai, M; Yamada, Y

2000-01-01

Artificially designed gelatins comprising tandemly repeated 30-amino-acid peptide units derived from human alphaI collagen were successfully produced with a Bacillus brevis system. The DNA encoding the peptide unit was synthesized by taking into consideration the codon usage of the host cells, but no clones having a tandemly repeated gene were obtained through the above-mentioned strategy. Minirepeat genes could be selected in vivo from a mixture of every possible sequence encoding an artificial gelatin by randomly ligating the mixed sequence unit and transforming it into Escherichia coli. Larger repeat genes constructed by connecting minirepeat genes obtained by in vivo selection were also stable in the expression host cells. Gelatins derived from the eight-unit and six-unit repeat genes were extracellularly produced at the level of 0.5 g/liter and easily purified by ammonium sulfate fractionation and anion-exchange chromatography. The purified artificial gelatins had the predicted N-terminal sequences and amino acid compositions and a sol-gel property similar to that of the native gelatin. These results suggest that the selection of a repeat unit sequence stable in an expression host is a shortcut for the efficient production of repetitive proteins and that it can conveniently be achieved by the in vivo selection method. This study revealed the possible industrial application of artificially designed repetitive proteins.
Programmable DNA-binding proteins from Burkholderia provide a fresh perspective on the TALE-like repeat domain.

PubMed

de Lange, Orlando; Wolf, Christina; Dietze, Jörn; Elsaesser, Janett; Morbitzer, Robert; Lahaye, Thomas

2014-06-01

The tandem repeats of transcription activator like effectors (TALEs) mediate sequence-specific DNA binding using a simple code. Naturally, TALEs are injected by Xanthomonas bacteria into plant cells to manipulate the host transcriptome. In the laboratory TALE DNA binding domains are reprogrammed and used to target a fused functional domain to a genomic locus of choice. Research into the natural diversity of TALE-like proteins may provide resources for the further improvement of current TALE technology. Here we describe TALE-like proteins from the endosymbiotic bacterium Burkholderia rhizoxinica, termed Bat proteins. Bat repeat domains mediate sequence-specific DNA binding with the same code as TALEs, despite less than 40% sequence identity. We show that Bat proteins can be adapted for use as transcription factors and nucleases and that sequence preferences can be reprogrammed. Unlike TALEs, the core repeats of each Bat protein are highly polymorphic. This feature allowed us to explore alternative strategies for the design of custom Bat repeat arrays, providing novel insights into the functional relevance of non-RVD residues. The Bat proteins offer fertile grounds for research into the creation of improved programmable DNA-binding proteins and comparative insights into TALE-like evolution. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization of genetic sequence variation of 58 STR loci in four major population groups.

PubMed

Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

2016-11-01

Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house Excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty-six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy-six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b, had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
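The gap between alleles counted by length and alleles counted by sequence, reported in the STR abstract above, arises because two alleles can have the same nominal length yet differ in their internal sequence. A minimal Python sketch of that distinction follows. The allele sequences, the 4-nt motif length, and the function name `group_by_nominal_length` are hypothetical illustrations and are not taken from the study or from STRait Razor.

```python
from collections import defaultdict


def group_by_nominal_length(allele_sequences, motif_len=4):
    """Group distinct allele sequences under their length-based allele call.

    The nominal allele is approximated as sequence length divided by motif
    length, which is roughly what a length-based (electrophoresis) call reflects.
    """
    groups = defaultdict(set)
    for seq in allele_sequences:
        groups[len(seq) // motif_len].add(seq)
    return dict(groups)


if __name__ == "__main__":
    observed = [
        "TCTA" * 10,                       # allele 10, uninterrupted repeat
        "TCTA" * 4 + "TCTG" + "TCTA" * 5,  # allele 10, one variant repeat unit
        "TCTA" * 11,                       # allele 11
    ]
    groups = group_by_nominal_length(observed)
    print(len(groups), "alleles by length;",
          sum(len(v) for v in groups.values()), "alleles by sequence")
```

In this toy case, length-based typing sees two alleles while sequence-based typing sees three, which is the same effect behind the higher sequence-based allele counts in the abstract.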

Construction of a small Mus musculus repetitive DNA library: identification of a new satellite sequence in Mus musculus.

PubMed Central

Pietras, D F; Bennett, K L; Siracusa, L D; Woodworth-Gutai, M; Chapman, V M; Gross, K W; Kane-Haas, C; Hastie, N D

1983-01-01

We report the construction of a small library of recombinant plasmids containing Mus musculus repetitive DNA inserts. The repetitive cloned fraction was derived from denatured genomic DNA by reassociation to a Cot value at which repetitive, but not unique, sequences have reannealed followed by exhaustive S1 nuclease treatment to degrade single-stranded DNA. Initial characterizations of this library by colony filter hybridizations have led to the identification of a previously undetected M. musculus minor satellite as well as to clones containing M. musculus major satellite sequences. This new satellite is repeated 10-20 times less than the major satellite in the M. musculus genome. It has a repeat length of 130 nucleotides compared with the M. musculus major satellite with a repeat length of 234 nucleotides. Sequence analysis of the minor satellite has shown that it has a 29 base pair region with extensive homology to one of the major satellite repeating subunits. We also show by in situ hybridization that this minor satellite sequence is located at the centromeres and possibly the arms of at least half the M. musculus chromosomes. Sequences related to the minor satellite have been found in the DNA of a related Mus species, Mus spretus, and may represent the major satellite of that species. PMID:6314268
Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

PubMed Central

Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

2015-01-01

We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. The nucleotide and amino acid sequences of the coding regions are highly conserved, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.

PubMed

Šatović, Eva; Plohl, Miroslav

2017-10-01

Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity, but lack any nucleotide sequence similarity to each other. In a search for related elements in databases, blast search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity only to DTC M1 in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
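The terminal inverted repeats (TIRs) that characterize MITEs such as DTC M1 and DTC M2 mean that one end of the element is the reverse complement of the other. The following Python sketch measures the length of a perfect TIR; the function names and the example element are hypothetical, and real TIRs often tolerate mismatches, which this sketch does not handle.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def perfect_tir_length(element):
    """Length of the perfect terminal inverted repeat shared by both ends.

    The 5' end is compared base by base with the reverse complement of the
    3' end; counting stops at the first mismatch or at the element midpoint.
    """
    element = element.upper()
    rc = reverse_complement(element)
    n = 0
    while n < len(element) // 2 and element[n] == rc[n]:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical element: 7-nt TIRs flanking an internal T-rich region.
    toy_mite = "GGCCAGT" + "TTTTTTTT" + "ACTGGCC"
    print(perfect_tir_length(toy_mite))
```

The same reverse-complement test underlies the "long imperfect palindrome" mentioned for the central parts of these elements, although detecting imperfect palindromes would require an alignment step with mismatch tolerance.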
Short intronic repeat sequences facilitate circular RNA production

PubMed Central

Liang, Dongming

2014-01-01

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery “backsplices” and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3′ end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. PMID:25281217
Basis of altered RNA-binding specificity by PUF proteins revealed by crystal structures of yeast Puf4p

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, Matthew T.; Higgin, Joshua J.; Hall, Traci M. Tanaka

2008-06-06

Pumilio/FBF (PUF) family proteins are found in eukaryotic organisms and regulate gene expression post-transcriptionally by binding to sequences in the 3' untranslated region of target transcripts. PUF proteins contain an RNA binding domain that typically comprises eight α-helical repeats, each of which recognizes one RNA base. Some PUF proteins, including yeast Puf4p, have altered RNA binding specificity and use their eight repeats to bind to RNA sequences with nine or ten bases. Here we report the crystal structures of Puf4p alone and in complex with a 9-nucleotide (nt) target RNA sequence, revealing that Puf4p accommodates an 'extra' nucleotide by modest adaptations allowing one base to be turned away from the RNA binding surface. Using structural information and sequence comparisons, we created a mutant Puf4p protein that preferentially binds to an 8-nt target RNA sequence over a 9-nt sequence and restores binding of each protein repeat to one RNA base.
Inverted repeats in the promoter as an autoregulatory sequence for TcrX in Mycobacterium tuberculosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhattacharya, Monolekha; Das, Amit Kumar, E-mail: amitk@hijli.iitkgp.ernet.in

Highlights: The regulatory sequences recognized by TcrX have been identified. The regulatory region comprises inverted repeats separated by a 30 bp region. The mode of binding of TcrX with the regulatory sequence is unique. The in silico TcrX-DNA docked model binds one of the inverted repeats. Both phosphorylated and unphosphorylated TcrX bind the regulatory sequence in vitro. Abstract: TcrY, a histidine kinase, and TcrX, a response regulator, constitute a two-component system in Mycobacterium tuberculosis. tcrX, which is expressed during iron scarcity, is instrumental in the survival of iron-dependent M. tuberculosis. However, the regulator of tcrX/Y has not been fully characterized. Crosslinking studies of TcrX reveal that it can form oligomers in vitro. Electrophoretic mobility shift assays (EMSAs) show that TcrX recognizes two regions in the promoter that are comprised of inverted repeats separated by ~30 bp. The dimeric in silico model of TcrX predicts binding to one of these inverted repeat regions. Site-directed mutagenesis and radioactive phosphorylation indicate that D54 of TcrX is phosphorylated by H256 of TcrY. However, phosphorylated and unphosphorylated TcrX bind the regulatory sequence with equal efficiency, which was shown with an EMSA using the D54A TcrX mutant.
Structural features of the rice chromosome 4 centromere.

PubMed

Zhang, Yu; Huang, Yuchen; Zhang, Lei; Li, Ying; Lu, Tingting; Lu, Yiqi; Feng, Qi; Zhao, Qiang; Cheng, Zhukuan; Xue, Yongbiao; Wing, Rod A; Han, Bin

2004-01-01

A complete sequence of a chromosome centromere is necessary for fully understanding centromere function. We reported the sequence structures of the first complete rice chromosome centromere through sequencing a large insert bacterial artificial chromosome clone-based contig, which covered the rice chromosome 4 centromere. Complete sequencing of the 124-kb rice chromosome 4 centromere revealed that it consisted of 18 tracts of 379 tandemly arrayed repeats known as CentO and a total of 19 centromeric retroelements (CRs) but no unique sequences were detected. Four tracts, composed of 65 CentO repeats, were located in the opposite orientation, and 18 CentO tracts were flanked by 19 retroelements. The CRs were classified into four types, and the type I retroelements appeared to be more specific to rice centromeres. The preferential insertion of the CRs among CentO repeats indicated that the centromere-specific retroelements may contribute to centromere expansion during evolution. The presence of three intact retrotransposons in the centromere suggests that they may be responsible for functional centromere initiation through a transcription-mediated mechanism.
Chromosome rearrangements via template switching between diverged repeated sequences

PubMed Central

Anand, Ranjith P.; Tsaponina, Olga; Greenwell, Patricia W.; Lee, Cheng-Sheng; Du, Wei; Petes, Thomas D.

2014-01-01

Recent high-resolution genome analyses of cancer and other diseases have revealed the occurrence of microhomology-mediated chromosome rearrangements and copy number changes. Although some of these rearrangements appear to involve nonhomologous end-joining, many must have involved mechanisms requiring new DNA synthesis. Models such as microhomology-mediated break-induced replication (MM-BIR) have been invoked to explain these rearrangements. We examined BIR and template switching between highly diverged sequences in Saccharomyces cerevisiae, induced during repair of a site-specific double-strand break (DSB). Our data show that such template switches are robust mechanisms that give rise to complex rearrangements. Template switches between highly divergent sequences appear to be mechanistically distinct from the initial strand invasions that establish BIR. In particular, such jumps are less constrained by sequence divergence and exhibit a different pattern of microhomology junctions. BIR traversing repeated DNA sequences frequently results in complex translocations analogous to those seen in mammalian cells. These results suggest that template switching among repeated genes is a potent driver of genome instability and evolution. PMID:25367035
Simple sequence repeat marker loci discovery using SSR primer.

PubMed

Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David

2004-06-12

Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are parsed and passed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
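The two-stage workflow described for SSR Primer (screen FASTA sequences for SSRs, then hand each locus to a primer-design step) can be sketched in Python. This is not the authors' PERL implementation and it calls neither SPUTNIK nor Primer3: a simple regular expression stands in for the SSR finder, and the output is just the flanking sequence that a primer-design tool would need. The repeat threshold, flank size, and function names are assumptions.

```python
import re

# Stand-in SSR detector: any 2-6 nt motif repeated at least 5 times in tandem.
# The threshold is an assumption and is not SPUTNIK's scoring scheme.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def parse_fasta(text):
    """Parse FASTA text into a {record_name: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None and line:
            records[name] += line.upper()
    return records


def primer_targets(fasta_text, flank=20):
    """Return (record, motif, left_flank, right_flank) for each SSR found.

    The flanks are the regions in which locus-specific primers would be designed.
    """
    targets = []
    for name, seq in parse_fasta(fasta_text).items():
        for m in SSR_PATTERN.finditer(seq):
            left = seq[max(0, m.start() - flank):m.start()]
            right = seq[m.end():m.end() + flank]
            targets.append((name, m.group(1), left, right))
    return targets


if __name__ == "__main__":
    demo = ">s1\nCCGTACGGTT\nAGAGAGAGAGAG\nTTGCACCGTA\n>s2\nACGTACGTAC\n"
    for target in primer_targets(demo):
        print(target)
```

In a real pipeline, the flanking sequences and the SSR coordinates would be written in the input format expected by the primer-design program, with the SSR marked as the region that the amplicon must span.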
Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

PubMed

Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.
The central domain of bovine submaxillary mucin consists of over 50 tandem repeats of 329 amino acids. Chromosomal localization of the BSM1 gene and relations to ovine and porcine counterparts.

PubMed

Jiang, W; Gupta, D; Gallagher, D; Davis, S; Bhavanandan, V P

2000-04-01

We previously elucidated five distinct protein domains (I-V) for bovine submaxillary mucin, which is encoded by two genes, BSM1 and BSM2. Using Southern blot analysis, genomic cloning and sequencing of the BSM1 gene, we now show that the central domain (V) consists of approximately 55 tandem repeats of 329 amino acids and that domains III-V are encoded by a 58.4-kb exon, the largest exon known for all genes to date. The BSM1 gene was mapped by fluorescence in situ hybridization to the proximal half of chromosome 5 at bands q2.2-q2.3. The amino-acid sequences of six tandem repeats (two full and four partial) were found to have only 92-94% identities. We propose that the variability in the amino-acid sequences of the mucin tandem repeat is important for generating the combinatorial library of saccharides that are necessary for the protective function of mucins. The deduced peptide sequences of the central domain match those determined from the purified bovine submaxillary mucin and also show 68-94% identity to published peptide sequences of ovine submaxillary mucin. This indicates that the core protein of ovine submaxillary mucin is closely related to that of bovine submaxillary mucin and contains similar tandem repeats in the central domain. In contrast, the central domain of porcine submaxillary mucin is reported to consist of 81-amino-acid tandem repeats. However, both bovine submaxillary mucin and porcine submaxillary mucin contain similar N-terminal and C-terminal domains and the corresponding genes are in the conserved linkage regions of the respective genomes.
Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

PubMed Central

Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

2014-01-01

Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
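The SSR identification step summarized in the abstract above can be illustrated with a minimal sketch. The abstract does not name the search tool or its settings, so the function names (`find_ssrs`, `motif_class_shares`) and the minimum repeat counts below are assumptions chosen only for illustration; the sketch reports perfect di- to hexanucleotide repeats and the share of each motif class.

```python
import re
from collections import Counter

# Assumed minimum copy numbers per motif length; NOT taken from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, copies in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least copies-1 identical units.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AAA", "ATAT").
            if any(motif == motif[:d] * (size // d)
                   for d in range(1, size) if size % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


def motif_class_shares(hits):
    """Percentage of SSRs in each motif-length class (2 = di-, 3 = tri-, ...)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {size: 100.0 * n / total for size, n in counts.items()}


if __name__ == "__main__":
    # Toy sequence, not real EST data.
    est = "CC" + "AAG" * 6 + "CCGG" + "AT" * 7 + "GC"
    found = find_ssrs(est)
    print(found)
    print(motif_class_shares(found))
```

Applied to a full EST set, the class shares computed this way correspond to the kind of breakdown reported in the abstract (trinucleotide, hexanucleotide, dinucleotide, and so on), although the exact percentages depend on the thresholds chosen.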
PstI repeat: a family of short interspersed nucleotide element (SINE)-like sequences in the genomes of cattle, goat, and buffalo.

PubMed

Sheikh, Faruk G; Mukhopadhyay, Sudit S; Gupta, Prabhakar

2002-02-01

The PstI family of elements are short, highly repetitive DNA sequences interspersed throughout the genome of the Bovidae. We have cloned and sequenced some members of the PstI family from cattle, goat, and buffalo. These elements are approximately 500 bp, have a copy number of 2 x 10(5) - 4 x 10(5), and comprise about 4% of the haploid genome. Studies of nucleotide sequence homology indicate that the buffalo and goat PstI repeats (type II) are similar types of short interspersed nucleotide element (SINE) sequences, but the cattle PstI repeat (type I) is considerably more divergent. Additionally, the goat PstI sequence showed significant sequence homology with bovine serine tRNA, and is therefore likely derived from serine tRNA. Interestingly, Southern hybridization suggests that both types of SINEs (I and II) are present in all the species of Bovidae. Dendrogram analysis indicates that cattle PstI SINE is similar to bovine Alu-like SINEs. Goat and buffalo SINEs formed a separate cluster, suggesting that these two types of SINEs evolved separately in the genome of the Bovidae.
Repeat-aware modeling and correction of short read errors.

PubMed

Yang, Xiao; Aluru, Srinivas; Dorman, Karin S

2011-02-15

High-throughput short read sequencing is revolutionizing genomics and systems biology research by enabling cost-effective deep coverage sequencing of genomes and transcriptomes. Error detection and correction are crucial to many short read sequencing applications including de novo genome sequencing, genome resequencing, and digital gene expression analysis. Short read error detection is typically carried out by counting the observed frequencies of kmers in reads and validating those with frequencies exceeding a threshold. In case of genomes with high repeat content, an erroneous kmer may be frequently observed if it has few nucleotide differences with valid kmers with multiple occurrences in the genome. Error detection and correction were mostly applied to genomes with low repeat content and this remains a challenging problem for genomes with high repeat content. We develop a statistical model and a computational method for error detection and correction in the presence of genomic repeats. We propose a method to infer genomic frequencies of kmers from their observed frequencies by analyzing the misread relationships among observed kmers. We also propose a method to estimate the threshold useful for validating kmers whose estimated genomic frequency exceeds the threshold. We demonstrate that superior error detection is achieved using these methods. Furthermore, we break away from the common assumption of uniformly distributed errors within a read, and provide a framework to model position-dependent error occurrence frequencies common to many short read platforms. Lastly, we achieve better error correction in genomes with high repeat content. The software is implemented in C++ and is freely available under GNU GPL3 license and Boost Software V1.0 license at "http://aluru-sun.ece.iastate.edu/doku.php?id = redeem". 
We introduce a statistical framework to model sequencing errors in next-generation reads, which led to promising results in detecting and correcting errors for genomes with high repeat content.
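The conventional baseline that the abstract above describes, counting k-mer frequencies in reads and validating those whose frequency exceeds a threshold, can be sketched as follows. This is only the simple counting approach, not the repeat-aware statistical model the authors propose; the function names and the toy reads are assumptions for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) observed across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def split_by_threshold(counts, threshold):
    """Treat k-mers seen more than `threshold` times as valid, the rest as suspect."""
    trusted = {kmer for kmer, n in counts.items() if n > threshold}
    suspect = set(counts) - trusted
    return trusted, suspect


if __name__ == "__main__":
    # Three identical toy reads plus one read carrying a single substitution.
    reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGAAC"]
    trusted, suspect = split_by_threshold(count_kmers(reads, k=4), threshold=1)
    print(sorted(trusted))
    print(sorted(suspect))
```

In a repeat-rich genome an erroneous k-mer that differs by a base or two from a high-copy k-mer can itself pass such a fixed threshold, which is the weakness the abstract says the proposed method addresses.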
Computational prediction of CRISPR cassettes in gut metagenome samples from Chinese type-2 diabetic patients and healthy controls.

PubMed

Mangericao, Tatiana C; Peng, Zhanhao; Zhang, Xuegong

2016-01-11

CRISPR has become a hot topic as a powerful technique for genome editing in humans and other higher organisms. The original CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats coupled with CRISPR-associated proteins) is an important adaptive defence system for prokaryotes that provides resistance against invading elements such as viruses and plasmids. A CRISPR cassette contains short nucleotide sequences called spacers. These unique regions retain a history of the interactions between prokaryotes and their invaders in individual strains and ecosystems. One important ecosystem in the human body is the human gut, a rich habitat populated by a great diversity of microorganisms. Gut microbiomes are important for human physiology and health. Metagenome sequencing has been widely applied for studying the gut microbiomes. Most efforts in metagenome studies have focused on profiling taxa compositions and gene catalogues and identifying their associations with human health. Less attention has been paid to the analysis of the ecosystems of microbiomes themselves, especially their CRISPR composition. We conducted a preliminary analysis of CRISPR sequences in a human gut metagenomic data set from Chinese type-2 diabetes patients and healthy controls. Applying an available CRISPR-identification algorithm, PILER-CR, we identified 3169 CRISPR cassettes in the data, from which we constructed a set of 1302 unique repeat sequences and 36,709 spacers. A more extensive analysis was made for the CRISPR repeats: these repeats were submitted to a more comprehensive clustering and classification using the web server tool CRISPRmap. All repeats were compared with known CRISPRs in the database CRISPRdb. A total of 784 repeats had matches in the database, and the remaining 518 repeats from our set are potentially novel ones. The computational analysis of CRISPR composition based on contigs of metagenome sequencing data is feasible. 
It provides an efficient approach for finding potential novel CRISPR arrays and for analysing the ecosystem and history of human microbiomes.
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

PubMed Central

Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974
Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

PubMed

Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

2016-01-01

A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.
Molecular characterization of long direct repeat (LDR) sequences expressing a stable mRNA encoding for a 35-amino-acid cell-killing peptide and a cis-encoded small antisense RNA in Escherichia coli.

PubMed

Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada

2002-07-01

Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.
Evolutional dynamics of 45S and 5S ribosomal DNA in ancient allohexaploid Atropa belladonna.

PubMed

Volkov, Roman A; Panchuk, Irina I; Borisjuk, Nikolai V; Hosiawa-Baranska, Marta; Maluszynska, Jolanta; Hemleben, Vera

2017-01-23

Polyploid hybrids represent a rich natural resource to study molecular evolution of plant genes and genomes. Here, we applied a combination of karyological and molecular methods to investigate chromosomal structure, molecular organization and evolution of ribosomal DNA (rDNA) in nightshade, Atropa belladonna (fam. Solanaceae), one of the oldest known allohexaploids among flowering plants. Because of their abundance and specific molecular organization (evolutionarily conserved coding regions linked to variable intergenic spacers, IGS), 45S and 5S rDNA are widely used in plant taxonomic and evolutionary studies. Molecular cloning and nucleotide sequencing of A. belladonna 45S rDNA repeats revealed a general structure characteristic of other Solanaceae species, and a very high sequence similarity of two length variants, with the only difference in number of short IGS subrepeats. These results, combined with the detection of three pairs of 45S rDNA loci on separate chromosomes, presumably inherited from both tetraploid and diploid ancestor species, exemplify intensive sequence homogenization that led to substitution/elimination of rDNA repeats of one parent. Chromosome silver-staining revealed that only four out of six 45S rDNA sites are frequently transcriptionally active, demonstrating nucleolar dominance. For 5S rDNA, three size variants of repeats were detected, with the major class represented by repeats containing all functional IGS elements required for transcription, the intermediate size repeats containing partially deleted IGS sequences, and the short 5S repeats containing severe defects both in the IGS and coding sequences. While shorter variants demonstrate an increased rate of base substitution, probably in their transition into pseudogenes, the functional 5S rDNA variants are nearly identical at the sequence level, pointing to their origin from a single parental species. 
Localization of the 5S rDNA genes on two chromosome pairs further supports uniparental inheritance from the tetraploid progenitor. The obtained molecular, cytogenetic and phylogenetic data demonstrate complex evolutionary dynamics of rDNA loci in allohexaploid species of Atropa belladonna. The high level of sequence unification revealed in 45S and 5S rDNA loci of this ancient hybrid species have been seemingly achieved by different molecular mechanisms.
Identification of presumed ancestral DNA sequences of phaseolin in Phaseolus vulgaris.

PubMed Central

Kami, J; Velásquez, V B; Debouck, D G; Gepts, P

1995-01-01

Common bean (Phaseolus vulgaris) consists of two major geographic gene pools, one distributed in Mexico, Central America, and Colombia and the other in the southern Andes (southern Peru, Bolivia, and Argentina). Amplification and sequencing of members of the multigene family coding for phaseolin, the major seed storage protein of the common bean, provide evidence for accumulation of tandem direct repeats in both introns and exons during evolution of the multigene family in this species. The presumed ancestral phaseolin sequences, without tandem repeats, were found in recently discovered but nearly extinct wild common bean populations of Ecuador and northern Peru that are intermediate between the two major gene pools of the species based on geographical and molecular arguments. Our results illustrate the usefulness of tandem direct repeats in establishing the polarity of DNA sequence divergence and therefore in proposing phylogenies. PMID:7862642

Functional centromeres in Astragalus sinicus include a compact centromere-specific histone H3 and a 20-bp tandem repeat.

PubMed

Tek, Ahmet L; Kashihara, Kazunari; Murata, Minoru; Nagaki, Kiyotaka

2011-11-01

The centromere plays an essential role in proper chromosome segregation during cell division and usually harbors long arrays of tandemly repeated satellite DNA sequences. Although this function is conserved among eukaryotes, the sequences of centromeric DNA repeats are variable. Most of our understanding of functional centromeres, which are defined by localization of a centromere-specific histone H3 (CENH3) protein, comes from model organisms. The components of the functional centromere in legumes are poorly known. The genus Astragalus is a member of the legumes and bears the largest number of species among angiosperms. Therefore, we studied the components of centromeres in Astragalus sinicus. We identified the CenH3 homolog of A. sinicus, AsCenH3, which is the most compact in size among higher eukaryotes. A CENH3-based assay revealed the functional centromeric DNA sequences from A. sinicus, called CentAs. The CentAs repeat is localized in A. sinicus centromeres, and comprises an AT-rich tandem repeat with a monomer size of 20 nucleotides.
Identification and characterization of dinucleotide repeat (CA)n markers for genetic mapping in dog

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ostrander, E.A.; Sprague, G.F. Jr.; Rine, J.

1993-04-01

A large block of simple sequence repeat (SSR) polymorphisms for the dog genome has been isolated and characterized. Screening of primary libraries by conventional hybridization methods as well as by screening of enriched marker-selected libraries led to the isolation of a large number of genomic clones that contained (CA)n repeats. The sequences of 101 clones showed that the size and complexity of (CA)n repeats in the dog genome were similar to those reported for these markers in the human genome. Detailed analysis of a representative subset of these markers revealed that most markers were moderately to highly polymorphic, with PIC values exceeding 0.70 for 33% of the markers tested. An association between higher PIC values and markers containing longer (CA)n repeats was observed in these studies, as previously noted for similar markers in the human genome. A list of primer sequences that tag each characterized marker is provided, and a comprehensive system of nomenclature for the dog genome is suggested. 28 refs., 4 figs., 2 tabs.
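The PIC (polymorphism information content) values mentioned above are computed from allele frequencies at a marker. The abstract does not state which formula was used, so the sketch below assumes the commonly used definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, alongside expected heterozygosity, H = 1 - sum(p_i^2). The function names and the example frequencies are illustrative, not data from the study.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """H = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 (assumed definition)."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Hypothetical markers: two equally frequent alleles vs. four.
    print(pic([0.5, 0.5]), expected_heterozygosity([0.5, 0.5]))
    print(pic([0.25] * 4), expected_heterozygosity([0.25] * 4))
```

Under this definition a marker needs several alleles at comparable frequencies to pass 0.70: four equally frequent alleles give a PIC just above that value, while two equally frequent alleles give only 0.375.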
Horseradish peroxidase-labeled oligonucleotides and fluorescent tyramides for rapid detection of chromosome-specific repeat sequences.

PubMed

van Gijlswijk, R P; Wiegant, J; Vervenne, R; Lasan, R; Tanke, H J; Raap, A K

1996-01-01

We present a sensitive and rapid fluorescence in situ hybridization (FISH) strategy for detecting chromosome-specific repeat sequences. It uses horseradish peroxidase (HRP)-labeled oligonucleotide sequences in combination with fluorescent tyramide-based detection. After in situ hybridization, the HRP conjugated to the oligonucleotide probe is used to deposit fluorescently labeled tyramide molecules at the site of hybridization. The method features full chemical synthesis of probes, strong FISH signals, and short processing periods, as well as multicolor capabilities.
Cultivar identification, pedigree verification, and diversity analysis among Peach (Prunus persica L. Batsch) Cultivars based on Simple Sequence Repeat markers

USDA-ARS's Scientific Manuscript database

The genetic relationships and pedigree inferences among peach (Prunus persica (L.) Batsch) accessions and breeding lines used in genetic improvement were evaluated using 15 simple sequence repeat (SSR) markers. A total of 80 alleles were detected among the 37 peach accessions with an average of 5.53...
THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

EPA Science Inventory

We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...
Cross-species transferability and mapping of genomic and cDNA SSRs in pines

Treesearch

D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion

2004-01-01

Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri-, and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, of which only 12.8% overlapped between the two sets. The positions of the SSRs within the coding sequence were predicted...
An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

USDA-ARS's Scientific Manuscript database

Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
A Repeat Look at Repeating Patterns

ERIC Educational Resources Information Center

Markworth, Kimberly A.

2016-01-01

A "repeating pattern" is a cyclical repetition of an identifiable core. Children in the primary grades usually begin pattern work with fairly simple patterns, such as AB, ABC, or ABB patterns. The unique letters represent unique elements, whereas the sequence of letters represents the core that is repeated. Based on color, shape,…
Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.

PubMed

Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang

2016-01-01

Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.
[Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].

PubMed

Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou

2002-01-01

To select polymorphic short tandem repeat markers within the X-linked nuclear protein (XNP) gene, genomic clones containing the XNP gene were identified by homology analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from these non-exonic sequences using the BCM Search Launcher. Polymorphisms of the short tandem repeats in a Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified in the XNP gene, two of which were polymorphic. Four and 11 alleles were observed in the Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 are located within the 3' end and intron 10, respectively. The two polymorphic short tandem repeats identified within the XNP gene will be useful for linkage analysis and gene diagnosis of the XNP gene.
Use of the LUS in sequence allele designations to facilitate probabilistic genotyping of NGS-based STR typing results.

PubMed

Just, Rebecca S; Irwin, Jodi A

2018-05-01

Some of the expected advantages of next generation sequencing (NGS) for short tandem repeat (STR) typing include enhanced mixture detection and genotype resolution via sequence variation among non-homologous alleles of the same length. However, at the same time that NGS methods for forensic DNA typing have advanced in recent years, many caseworking laboratories have implemented or are transitioning to probabilistic genotyping to assist the interpretation of complex autosomal STR typing results. Current probabilistic software programs are designed for length-based data, and were not intended to accommodate sequence strings as the product input. Yet to leverage the benefits of NGS for enhanced genotyping and mixture deconvolution, the sequence variation among same-length products must be utilized in some form. Here, we propose use of the longest uninterrupted stretch (LUS) in allele designations as a simple method to represent sequence variation within the STR repeat regions and facilitate - in the near term - probabilistic interpretation of NGS-based typing results. An examination of published population data indicated that a reference LUS region is straightforward to define for most autosomal STR loci, and that using repeat unit plus LUS length as the allele designator can represent greater than 80% of the alleles detected by sequencing. A proof of concept study performed using a freely available probabilistic software demonstrated that the LUS length can be used in allele designations when a program does not require alleles to be integers, and that utilizing sequence information improves interpretation of both single-source and mixed contributor STR typing results as compared to using repeat unit information alone. 
The LUS concept for allele designation maintains the repeat-based allele nomenclature that will permit backward compatibility to extant STR databases, and the LUS lengths themselves will be concordant regardless of the NGS assay or analysis tools employed. Further, these biologically based, easy-to-derive designations uphold clear relationships between parent alleles and their stutter products, enabling analysis in fully continuous probabilistic programs that model stutter while avoiding the algorithmic complexities that come with string based searches. Though using repeat unit plus LUS length as the allele designator does not capture variation that occurs outside of the core repeat regions, this straightforward approach would permit the large majority of known STR sequence variation to be used for mixture deconvolution and, in turn, result in more informative mixture statistics in the near term. Ultimately, the method could bridge the gap from current length-based probabilistic systems to facilitate broader adoption of NGS by forensic DNA testing laboratories. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
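The idea of pairing a repeat-unit count with the longest uninterrupted stretch (LUS) can be sketched in a few lines. The sequences, the `lus_length` and `designation` helpers, and the "repeats_LUS" string format below are assumptions made for illustration; the abstract does not specify the exact notation, and the sketch handles only whole-number repeat units within the repeat region.

```python
def lus_length(sequence, motif):
    """Largest number of consecutive, uninterrupted copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    sequence, motif = sequence.upper(), motif.upper()
    step, best = len(motif), 0
    # Try every start position so runs in any reading frame are considered.
    for start in range(len(sequence)):
        copies = 0
        while sequence.startswith(motif, start + copies * step):
            copies += 1
        best = max(best, copies)
    return best


def designation(repeat_region, motif):
    """Hypothetical 'repeat units_LUS' label for a repeat region."""
    units = len(repeat_region) // len(motif)
    return "%d_%d" % (units, lus_length(repeat_region, motif))


if __name__ == "__main__":
    # Two made-up alleles of equal length (8 units) with different structure.
    allele_a = "TCTA" * 2 + "TCTG" + "TCTA" * 5
    allele_b = "TCTA" * 3 + "TCTG" + "TCTA" * 4
    print(designation(allele_a, "TCTA"), designation(allele_b, "TCTA"))
```

The two toy alleles would be indistinguishable by length alone, but they receive different labels once the LUS is appended, which is the kind of same-length sequence variation the abstract proposes to expose to length-based probabilistic software.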
Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

PubMed Central

Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

2013-01-01

Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
Sequence of retrovirus provirus resembles that of bacterial transposable elements

NASA Astrophysics Data System (ADS)

Shimotohno, Kunitada; Mizutani, Satoshi; Temin, Howard M.

1980-06-01

The nucleotide sequences of the terminal regions of an infectious integrated retrovirus cloned in the modified λ phage cloning vector Charon 4A have been elucidated. There is a 569-base pair direct repeat at both ends of the viral DNA. The cell-virus junctions at each end consist of a 5-base pair direct repeat of cell DNA next to a 3-base pair inverted repeat of viral DNA. This structure resembles that of a transposable element and is consistent with the protovirus hypothesis that retroviruses evolved from the cell genome.
Evolutionary force of AT-rich repeats to trap genomic and episomal DNAs into the rice genome: lessons from endogenous pararetrovirus.

PubMed

Liu, Ruifang; Koyanagi, Kanako O; Chen, Sunlu; Kishima, Yuji

2012-12-01

In plant genomes, the incorporation of DNA segments is not a common method of artificial gene transfer. Nevertheless, various segments of pararetroviruses have been found in plant genomes in recent decades. The rice genome contains a number of segments of endogenous rice tungro bacilliform virus-like sequences (ERTBVs), many of which are present between AT dinucleotide repeats (ATrs). Comparison of genomic sequences between two closely related rice subspecies, japonica and indica, allowed us to verify the preferential insertion of ERTBVs into ATrs. In addition to ERTBVs, the comparative analyses showed that ATrs occasionally incorporate repeat sequences including transposable elements, and a wide range of other sequences. Besides the known genomic sequences, the insertion sequences also represented DNAs of unclear origins together with ERTBVs, suggesting that ATrs have integrated episomal DNAs that would have been suspended in the nucleus. Such insertion DNAs might be trapped by ATrs in the genome in a host-dependent manner. Conversely, other simple mono- and dinucleotide sequence repeats (SSR) were less frequently involved in insertion events relative to ATrs. Therefore, ATrs could be regarded as hot spots of double-strand breaks that induce non-homologous end joining. The insertions within ATrs occasionally generated new gene-related sequences or involved structural modifications of existing genes. Likewise, in a comparison between Arabidopsis thaliana and Arabidopsis lyrata, the insertions preferred ATrs to other SSRs. Therefore ATrs in plant genomes could be considered as genomic dumping sites that have trapped various DNA molecules and may have exerted a powerful evolutionary force. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Human telomeres that contain (CTAGGG)n repeats show replication dependent instability in somatic cells and the male germline

PubMed Central

Mendez-Bermudez, Aaron; Hills, Mark; Pickett, Hilda A.; Phan, Anh Tuân; Mergny, Jean-Louis; Riou, Jean-François; Royle, Nicola J.

2009-01-01

A number of different processes that impact on telomere length dynamics have been identified but factors that affect the turnover of repeats located proximally within the telomeric DNA are poorly defined. We have identified a particular repeat type (CTAGGG) that is associated with an extraordinarily high mutation rate (20% per gamete) in the male germline. The mutation rate is affected by the length and sequence homogeneity of the (CTAGGG)n array. This level of instability was not seen with other sequence-variant repeats, including the TCAGGG repeat type that has the same composition. Telomeres carrying a (CTAGGG)n array are also highly unstable in somatic cells with the mutation process resulting in small gains or losses of repeats that also occasionally result in the deletion of the whole (CTAGGG)n array. These sequences are prone to quadruplex formation in vitro but adopt a different topology from (TTAGGG)n (see accompanying article). Interestingly, short (CTAGGG)2 oligonucleotides induce a DNA damage response (γH2AX foci) as efficiently as (TTAGGG)2 oligos in normal fibroblast cells, suggesting they recruit POT1 from the telomere. Moreover, in vitro assays show that (CTAGGG)n repeats bind POT1 more efficiently than (TTAGGG)n or (TCAGGG)n. We estimate that 7% of human telomeres contain (CTAGGG)n repeats and when present, they create additional problems that probably arise during telomere replication. PMID:19656953
TRDistiller: a rapid filter for enrichment of sequence datasets with proteins containing tandem repeats.

PubMed

Richard, François D; Kajava, Andrey V

2014-06-01

The dramatic growth of sequencing data evokes an urgent need to improve bioinformatics tools for large-scale proteome analysis. Over the last two decades, the foremost efforts of computer scientists were devoted to proteins with aperiodic sequences having globular 3D structures. However, a large portion of proteins contain periodic sequences representing arrays of repeats that are directly adjacent to each other (so called tandem repeats or TRs). These proteins frequently fold into elongated fibrous structures carrying different fundamental functions. Algorithms specific to the analysis of these regions are urgently required since the conventional approaches developed for globular domains have had limited success when applied to the TR regions. The protein TRs are frequently not perfect, containing a number of mutations, and some of them cannot be easily identified. To detect such "hidden" repeats several algorithms have been developed. However, the most sensitive among them are time-consuming and, therefore, inappropriate for large scale proteome analysis. To speed up the TR detection we developed a rapid filter that is based on the comparison of composition and order of short strings in the adjacent sequence motifs. Tests show that our filter discards up to 22.5% of proteins which are known to be without TRs while keeping almost all (99.2%) TR-containing sequences. Thus, we are able to decrease the size of the initial sequence dataset enriching it with TR-containing proteins which allows a faster subsequent TR detection by other methods. The program is available upon request. Copyright © 2014 Elsevier Inc. All rights reserved.
Repeats of base oligomers as the primordial coding sequences of the primeval earth and their vestiges in modern genes.

PubMed

Ohno, S

1984-01-01

Three outstanding properties uniquely qualify repeats of base oligomers as the primordial coding sequences of all polypeptide chains. First, when compared with randomly generated base sequences in general, they are more likely to have long open reading frames. Second, periodical polypeptide chains specified by such repeats are more likely to assume either alpha-helical or beta-sheet secondary structures than are polypeptide chains of random sequence. Third, provided that the number of bases in the oligomeric unit is not a multiple of 3, these internally repetitious coding sequences are impervious to randomly sustained base substitutions, deletions, and insertions. This is because the recurring periodicity of their polypeptide chains is given by three consecutive copies of the oligomeric unit translated in three different reading frames. Accordingly, when one reading frame is open, the other two are automatically open as well, all three being capable of coding for polypeptide chains of identical periodicity. Under this circumstance, a frame shift due to the deletion or insertion of a number of bases that is not a multiple of 3 fails to alter the down-stream amino acid sequence, and even a base change causing premature chain-termination can silence only one of the three potential coding units. Newly arisen coding sequences in modern organisms are oligomeric repeats, and most of the older genes retain various vestiges of their original internal repetitions. Some of the genes (e.g., oncogenes) have even inherited the property of being impervious to randomly sustained base changes.
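The third property in the abstract above can be checked directly. The following is a minimal sketch, assuming the standard genetic code and a hypothetical 4-base repeat unit (`GCAT`, chosen only because none of its frames contains a stop codon); it is an illustration, not code from the original paper. It translates the same repeat in all three reading frames and shows that each frame yields the same four-residue periodicity, merely rotated.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate dna from offset frame (0, 1 or 2); '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Hypothetical repeat unit of 4 bases (not a multiple of 3), repeated 9 times.
dna = "GCAT" * 9

for frame in range(3):
    # Every frame is open and shows the same 4-residue period, rotated.
    print(frame, translate(dna, frame))
```

Because three copies of the 4-base unit span exactly four codons, a one-base or two-base frameshift only moves translation into another frame with the same periodic peptide.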
Perturbation of the Akt/Gsk3-β signalling pathway is common to Drosophila expressing expanded untranslated CAG, CUG and AUUCU repeat RNAs.

PubMed

van Eyk, Clare L; O'Keefe, Louise V; Lawlor, Kynan T; Samaraweera, Saumya E; McLeod, Catherine J; Price, Gareth R; Venter, Deon J; Richards, Robert I

2011-07-15

Recent evidence supports a role for RNA as a common pathogenic agent in both the 'polyglutamine' and 'untranslated' dominant expanded repeat disorders. One feature of all repeat sequences currently associated with disease is their predicted ability to form a hairpin secondary structure at the RNA level. In order to investigate mechanisms by which hairpin-forming repeat RNAs could induce neurodegeneration, we have looked for alterations in gene transcript levels as hallmarks of the cellular response to toxic hairpin repeat RNAs. Three disease-associated repeat sequences--CAG, CUG and AUUCU--were specifically expressed in the neurons of Drosophila and resultant common transcriptional changes assessed by microarray analyses. Transcripts that encode several components of the Akt/Gsk3-β signalling pathway were altered as a consequence of expression of these repeat RNAs, indicating that this pathway is a component of the neuronal response to these pathogenic RNAs and may represent an important common therapeutic target in this class of diseases.
Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

PubMed Central

2011-01-01

Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequence. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome.
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
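As a quick consistency check on the figures reported in this abstract, the three per-category SNP counts can be summed and compared with the stated total of 497,118. The snippet below is only a worked arithmetic sketch; the dictionary keys are shorthand labels chosen here, not names used by the AGSNP package.

```python
# Putative SNP counts as reported in the abstract, by sequence category.
putative_snps = {
    "gene sequences": 195_631,
    "uncharacterized single-copy regions": 155_580,
    "repeat junctions": 145_907,
}

total = sum(putative_snps.values())
print(total)  # matches the 497,118 SNPs the abstract says are available

# Share of the total contributed by each category.
for category, count in putative_snps.items():
    print(f"{category}: {count / total:.1%}")
```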
REPPER—repeats and their periodicities in fibrous proteins

PubMed Central

Gruber, Markus; Söding, Johannes; Lupas, Andrei N.

2005-01-01

REPPER (REPeats and their PERiodicities) is an integrated server that detects and analyzes regions with short gapless repeats in protein sequences or alignments. It finds periodicities by Fourier Transform (FTwin) and internal similarity analysis (REPwin). FTwin assigns numerical values to amino acids that reflect certain properties, for instance hydrophobicity, and gives information on corresponding periodicities. REPwin uses self-alignments and displays repeats that reveal significant internal similarities. Both programs use a sliding window to ensure that different periodic regions within the same protein are detected independently. FTwin and REPwin are complemented by secondary structure prediction (PSIPRED) and coiled coil prediction (COILS), making the server a versatile analysis tool for sequences of fibrous proteins. REPPER is available at . PMID:15980460
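The Fourier-transform idea behind FTwin can be illustrated with a small sketch. It is not REPPER's implementation: it assumes a crude binary hydrophobicity scale (the `HYDROPHOBIC` set is a hypothetical choice), omits the sliding window, and uses a plain discrete Fourier transform. Each residue is mapped to a number, the profile is mean-centred, and the period with the largest spectral magnitude is reported.

```python
import cmath

# Assumed binary scale: 1.0 for these residues, 0.0 otherwise (not REPPER's scale).
HYDROPHOBIC = set("AVILMFWC")


def periodicity_spectrum(seq):
    """Return (period, magnitude) pairs from a DFT of the hydrophobicity profile."""
    profile = [1.0 if aa in HYDROPHOBIC else 0.0 for aa in seq]
    n = len(profile)
    mean = sum(profile) / n
    centred = [x - mean for x in profile]  # remove the zero-frequency component
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append((n / k, abs(coeff)))
    return spectrum


def dominant_period(seq):
    """Period (in residues) with the largest spectral magnitude."""
    return max(periodicity_spectrum(seq), key=lambda pm: pm[1])[0]


# Synthetic heptad repeat with hydrophobic residues at positions a and d.
heptad = "LKELEKK" * 4
print(dominant_period(heptad))
```

For this synthetic heptad pattern the strongest component falls at a period of 3.5 residues, the spacing expected when hydrophobic positions alternate three and four residues apart.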

An examination of the origin and evolution of additional tandem repeats in the mitochondrial DNA control region of Japanese sika deer (Cervus nippon).

PubMed

Ba, Hengxing; Wu, Lang; Liu, Zongyue; Li, Chunyi

2016-01-01

Tandem repeat units are only detected in the left domain of the mitochondrial DNA control region in sika deer. Previous studies showed that Japanese sika deer have more tandem repeat units than its cousins from the Asian continent and Taiwan, which often have only three repeat units. To determine the origin and evolution of these additional repeat units in Japanese sika deer, we obtained the sequence of repeat units from an expanded dataset of the control region from all sika deer lineages. The functional constraint is inferred to act on the first repeat unit because this repeat has the least sequence divergence in comparison to the other units. Based on slipped-strand mispairing mechanisms, the illegitimate elongation model could account for the addition or deletion of these additional repeat units in the Japanese sika deer population. We also report that these additional repeat units could be occurring in the internal positions of tandem repeat regions, possibly via coupling with a homogenization mechanism within and among these lineages. Moreover, the increased number of repeat units in the Japanese sika deer population could reflect a balance between mutation and selection, as well as genetic drift.
Myotonin protein-kinase [AGC]n trinucleotide repeat in seven nonhuman primates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Novelli, G.; Sineo, L.; Pontieri, E.

Myotonic dystrophy (DM) is due to genomic instability of a trinucleotide [AGC]n motif located in the 3′ UTR of a protein-kinase gene (myotonin protein kinase, MT-PK). The [AGC] repeat is meiotically and mitotically unstable, and it is directly related to the manifestations of the disorder. Although a gene dosage effect of the MT-PK has been demonstrated in DM muscle, the mechanism(s) by which the intragenic repeat expansion leads to disease is largely unknown. This non-standard mutational event could reflect an evolutionary mechanism widespread among animal genomes. We have isolated and sequenced the complete 3′ UTR of the MT-PK gene in seven primates (macaque, orangutan, gorilla, chimpanzee, gibbon, owl monkey, saimiri), and examined by comparative nucleotide sequence analysis the [AGC]n intragenic repeat and the surrounding nucleotides. The genomic organization, including the [AGC]n repeat structure, was conserved in all examined species except the gibbon (Hylobates agilis), in which the sequence upstream of [AGC]n (GGAA) is replaced by a GA dinucleotide. The number of [AGC] repeats in the examined species ranged from 7 (gorilla) to 13 (owl monkey), with a polymorphism information content (PIC) similar to that observed in humans. These results indicate that the 3′ UTR [AGC] repeat within the MT-PK gene is evolutionarily conserved, supporting the view that this region has important regulatory functions.
The complete chloroplast genome sequence of Cephalotaxus oliveri (Cephalotaxaceae): evolutionary comparison of cephalotaxus chloroplast DNAs and insights into the loss of inverted repeat copies in gymnosperms.

PubMed

Yi, Xuan; Gao, Lei; Wang, Bo; Su, Ying-Juan; Wang, Ting

2013-01-01

We have determined the complete chloroplast (cp) genome sequence of Cephalotaxus oliveri. The genome is 134,337 bp in length, encodes 113 genes, and lacks inverted repeat (IR) regions. Genome-wide mutational dynamics have been investigated through comparative analysis of the cp genomes of C. oliveri and C. wilsoniana. Gene order transformation analyses indicate that when distinct isomers are considered as alternative structures for the ancestral cp genome of cupressophyte and Pinaceae lineages, it is not possible to distinguish between hypotheses favoring retention of the same IR region in cupressophyte and Pinaceae cp genomes from a hypothesis proposing independent loss of IRA and IRB. Furthermore, in cupressophyte cp genomes, the highly reduced IRs are replaced by short repeats that have the potential to mediate homologous recombination, analogous to the situation in Pinaceae. The importance of repeats in the mutational dynamics of cupressophyte cp genomes is also illustrated by the accD reading frame, which has undergone extreme length expansion in cupressophytes. This has been caused by a large insertion comprising multiple repeat sequences. Overall, we find that the distribution of repeats, indels, and substitutions is significantly correlated in Cephalotaxus cp genomes, consistent with a hypothesis that repeats play a role in inducing substitutions and indels in conifer cp genomes.
Identification, variation and transcription of pneumococcal repeat sequences

PubMed Central

2011-01-01

Background Small interspersed repeats are commonly found in many bacterial chromosomes. Two families of repeats (BOX and RUP) have previously been identified in the genome of Streptococcus pneumoniae, a nasopharyngeal commensal and respiratory pathogen of humans. However, little is known about the role they play in pneumococcal genetics. Results Analysis of the genome of S. pneumoniae ATCC 700669 revealed the presence of a third repeat family, which we have named SPRITE. All three repeats are present at a reduced density in the genome of the closely related species S. mitis. However, they are almost entirely absent from all other streptococci, although a set of elements related to the pneumococcal BOX repeat was identified in the zoonotic pathogen S. suis. In conjunction with information regarding their distribution within the pneumococcal chromosome, this suggests that it is unlikely that these repeats are specialised sequences performing a particular role for the host, but rather that they constitute parasitic elements. However, comparing insertion sites between pneumococcal sequences indicates that they appear to transpose at a much lower rate than IS elements. Some large BOX elements in S. pneumoniae were found to encode open reading frames on both strands of the genome, whilst another was found to form a composite RNA structure with two T box riboswitches. In multiple cases, such BOX elements were demonstrated as being expressed using directional RNA-seq and RT-PCR. Conclusions BOX, RUP and SPRITE repeats appear to have proliferated extensively throughout the pneumococcal chromosome during the species' past, but novel insertions are currently occurring at a relatively slow rate. Through their extensive secondary structures, they seem likely to affect the expression of genes with which they are co-transcribed. Software for annotation of these repeats is freely available from ftp://ftp.sanger.ac.uk/pub/pathogens/strep_repeats/. PMID:21333003
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

PubMed

Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

2016-12-01

In higher eukaryotes, the minor rDNA family codes for 5S rRNA, is arranged in tandem arrays, and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by concerted evolution. However, some recent reports, including studies of teleost fishes, suggest that evolution of the 5S rDNA repeat does not fit the concerted evolution model and may instead be explained by a birth-and-death model. In order to study the mode of evolution of 5S rDNA repeats in perciform fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each sequence type (I, II, III and IV) did not show clear species-specific clustering. A stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the conclusion that 5S rDNA repeats in the genus Channa evolved under a birth-and-death process.
Drastic stability change of X-X mismatch in d(CXG) trinucleotide repeat disorders under molecular crowding condition.

PubMed

Teng, Ye; Pramanik, Smritimoy; Tateishi-Karimata, Hisae; Ohyama, Tatsuya; Sugimoto, Naoki

2018-02-05

The trinucleotide repeat d(CXG) (X = A, C, G or T) is the most common sequence causing repeat expansion disorders. The formation of non-canonical structures, such as hairpin structures with X-X mismatches, has been proposed to affect gene expression and regulation, which are important in pathological studies of these devastating neurological diseases. However, little information is available regarding the thermodynamics of the repeat sequence under crowded cellular conditions where many non-canonical structures such as G-quadruplexes are highly stabilized, while duplexes are destabilised. In this study, we investigated the different stabilities of X-X mismatches in the context of internal d(CXG) self-complementary sequences in an environment with a high concentration of cosolutes to mimic the crowding conditions in cells. The stabilities of full-matched duplexes and duplexes with A-A, G-G, and T-T mismatched base pairs under molecular crowding conditions were notably decreased compared to under dilute conditions. However, the stability of the DNA duplex with a C-C mismatch base pair was only slightly destabilised. Investigating different stabilities of X-X mismatches in d(CXG) sequences is important for improving our understanding of the formation and transition of multiple non-canonical structures in trinucleotide repeat diseases, and may provide insights for pathological studies and drug development. Copyright © 2018 Elsevier Inc. All rights reserved.
Effects of GABA[subscript A] Modulators on the Repeated Acquisition of Response Sequences in Squirrel Monkeys

ERIC Educational Resources Information Center

Campbell, Una C.; Winsauer, Peter J.; Stevenson, Michael W.; Moerschbaecher, Joseph M.

2004-01-01

The present study investigated the effects of positive and negative GABA[subscript A] modulators under three different baselines of repeated acquisition in squirrel monkeys in which the monkeys acquired a three-response sequence on three keys under a second-order fixed-ratio (FR) schedule of food reinforcement. In two of these baselines, the…
Short intronic repeat sequences facilitate circular RNA production.

PubMed

Liang, Dongming; Wilusz, Jeremy E

2014-10-15

Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼ 30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA. © 2014 Liang and Wilusz; Published by Cold Spring Harbor Laboratory Press.
Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

PubMed

Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

1993-02-01

An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls like cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all of the investigated 10 species of the family of Bovidae and giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used with which sequences differing in at least one single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions serve also for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.
High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events

PubMed Central

Wolfgruber, Thomas K.; Nakashima, Megan M.; Schneider, Kevin L.; Sharma, Anupma; Xie, Zidian; Albert, Patrice S.; Xu, Ronghui; Bilinski, Paul; Dawe, R. Kelly; Ross-Ibarra, Jeffrey; Birchler, James A.; Presting, Gernot G.

2016-01-01

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres. PMID:27047500
Rapid and highly efficient construction of TALE-based transcriptional regulators and nucleases for genome modification.

PubMed

Li, Lixin; Piatek, Marek J; Atef, Ahmed; Piatek, Agnieszka; Wibowo, Anjar; Fang, Xiaoyun; Sabir, J S M; Zhu, Jian-Kang; Mahfouz, Magdy M

2012-03-01

Transcription activator-like effectors (TALEs) can be used as DNA-targeting modules by engineering their repeat domains to dictate user-selected sequence specificity. TALEs have been shown to function as site-specific transcriptional activators in a variety of cell types and organisms. TALE nucleases (TALENs), generated by fusing the FokI cleavage domain to TALE, have been used to create genomic double-strand breaks. The identity of the TALE repeat variable di-residues, their number, and their order dictate the DNA sequence specificity. Because TALE repeats are nearly identical, their assembly by cloning or even by synthesis is challenging and time consuming. Here, we report the development and use of a rapid and straightforward approach for the construction of designer TALE (dTALE) activators and nucleases with user-selected DNA target specificity. Using our plasmid set of 100 repeat modules, researchers can assemble repeat domains for any 14-nucleotide target sequence in one sequential restriction-ligation cloning step and in only 24 h. We generated several custom dTALEs and dTALENs with new target sequence specificities and validated their function by transient expression in tobacco leaves and in vitro DNA cleavage assays, respectively. Moreover, we developed a web tool, called idTALE, to facilitate the design of dTALENs and the identification of their genomic targets and potential off-targets in the genomes of several model species. Our dTALE repeat assembly approach along with the web tool idTALE will expedite genome-engineering applications in a variety of cell types and organisms including plants.
Inter-plate aseismic slip on the subducting plate boundaries estimated from repeating earthquakes

NASA Astrophysics Data System (ADS)

Igarashi, T.

2015-12-01

Sequences of repeating earthquakes are caused by repeated slip of small patches surrounded by aseismic slip areas in plate boundary zones. Recently, they have been detected in many regions. In this study, I detected repeating earthquakes in Japan and around the world using seismograms recorded by the Japanese seismic network, and investigated the space-time characteristics of inter-plate aseismic slip on subducting plate boundaries. To extract repeating earthquakes, I calculated cross-correlation coefficients of band-pass-filtered seismograms at each station, following Igarashi [2010]. I used two data sets: one based on the USGS catalog covering about 25 years from May 1990, and one based on the JMA catalog covering about 13 years from January 2002. As a result, I found many sequences of repeating earthquakes on the subducting plate boundaries of the Andaman-Sumatra-Java and Japan-Kuril-Kamchatka-Aleutian subduction zones. Applying the scaling relations among seismic moment, recurrence interval and slip proposed by Nadeau and Johnson [1998] to these sequences reveals the space-time changes of inter-plate aseismic slip. The pair of repeating earthquakes with the longest time interval occurred in the Solomon Islands area, with a recurrence interval of about 18.5 years. The estimated slip rate is about 46 mm/year, which corresponds to about half of the relative plate motion in this area. Several sequences with fast slip rates correspond to post-seismic slip after the 2004 Sumatra-Andaman earthquake (M9.0), the 2006 Kuril earthquake (M8.3), the 2007 southern Sumatra earthquake (M8.5), and the 2011 Tohoku-oki earthquake (M9.0). The database of global repeating earthquakes enables comparison of inter-plate aseismic slip across the world's plate boundary zones. I expect that extending the analysis period will reveal more sequences in areas where none were found in this analysis.
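The waveform-similarity step described in this abstract can be sketched as follows. This is a minimal illustration, assuming two already band-pass-filtered, time-aligned traces of equal length; the function names and the 0.95 threshold are hypothetical choices made here, not values taken from the abstract. It computes the zero-lag normalized cross-correlation coefficient and flags a pair as candidate repeaters when the coefficient exceeds the threshold.

```python
import math


def correlation_coefficient(a, b):
    """Zero-lag normalized cross-correlation of two equal-length waveforms."""
    if len(a) != len(b) or not a:
        raise ValueError("waveforms must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if den == 0.0:
        raise ValueError("a flat waveform has no defined correlation")
    return num / den


def is_repeating_pair(a, b, threshold=0.95):
    """Hypothetical criterion: waveforms nearly identical at one station."""
    return correlation_coefficient(a, b) >= threshold


# Synthetic decaying wavelet and an amplitude-scaled copy of it.
wave = [math.sin(0.3 * t) * math.exp(-0.02 * t) for t in range(200)]
scaled = [2.5 * x for x in wave]
print(is_repeating_pair(wave, scaled))
```

The coefficient is insensitive to amplitude scaling, which is why events of slightly different size that rupture the same patch can still be matched. A practical detector would also search over time lags and combine results across stations.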
Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

USGS Publications Warehouse

Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

1992-01-01

The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
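The numbers in this abstract are internally consistent, as the short worked check below shows. It assumes the 1671-nucleotide figure counts sense codons only (the stop codon is not included), which is what the arithmetic implies; the variable names are chosen here for illustration.

```python
orf_nucleotides = 1671
precursor_residues = orf_nucleotides // 3       # one codon per amino acid
signal_peptide_residues = 26

mature_residues = precursor_residues - signal_peptide_residues
first_mature_residue = signal_peptide_residues + 1
microsequenced_residues = 61 - first_mature_residue + 1  # residues 27 to 61

# Average mass per residue implied by the two calculated Mr values.
mass_removed = 57190 - 54505
mean_signal_residue_mass = mass_removed / signal_peptide_residues

print(precursor_residues, mature_residues, microsequenced_residues)
print(round(mean_signal_residue_mass, 1))
```

The open reading frame divides evenly into 557 codons, cleavage after residue 26 leaves a 531-residue mature protein, and residues 27 to 61 account for the 35 residues obtained by microsequencing.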
Direct repeat sequences are essential for function of the cis-acting locus of transfer (clt) of Streptomyces phaeochromogenes plasmid pJV1.

PubMed

Franco, Bernardo; González-Cerón, Gabriela; Servín-González, Luis

2003-11-01

The functionality of direct and inverted repeat sequences inside the cis acting locus of transfer (clt) of the Streptomyces plasmid pJV1 was determined by testing the effect of different deletions on plasmid transfer. The results show that the single most important element for pJV1 clt function is a series of evenly spaced 9 bp long direct repeats which match the consensus CCGCACA(C/G)(C/G), since their deletion caused a dramatic reduction in plasmid transfer. The presence of these repeats in the absence of any other clt sequences allowed plasmid transfer to occur at a frequency that was at least two orders of magnitude higher than that obtained in the complete absence of clt. A database search revealed regions with a similar organization, and in the same position, in Streptomyces plasmids pSN22 and pSLS, which have transfer proteins homologous to those of pJV1.
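A search for the 9 bp consensus quoted above, CCGCACA(C/G)(C/G), can be written as a regular expression. The sketch below is only illustrative: the test sequence is synthetic, the function names are hypothetical, and only the given strand is scanned. It reports each match position and the spacing between successive repeats, which is how evenly spaced direct repeats would show up.

```python
import re

# Consensus from the abstract: CCGCACA followed by two positions that are C or G.
CONSENSUS = re.compile(r"CCGCACA[CG][CG]")


def find_direct_repeats(seq):
    """Return (start, matched_sequence) for each consensus match on this strand."""
    return [(m.start(), m.group()) for m in CONSENSUS.finditer(seq.upper())]


def repeat_spacings(hits):
    """Distances between the start positions of successive matches."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


# Synthetic example: three consensus variants separated by 4 bp spacers.
synthetic = "AT" + "CCGCACACG" + "TTTT" + "CCGCACAGC" + "TTTT" + "CCGCACAGG"
hits = find_direct_repeats(synthetic)
print(hits)
print(repeat_spacings(hits))
```

Equal values in the spacing list indicate evenly spaced repeats. Scanning the reverse complement as well would be needed to distinguish direct from inverted copies.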
BAC end sequencing of Pacific white shrimp Litopenaeus vannamei: a glimpse into the genome of Penaeid shrimp

NASA Astrophysics Data System (ADS)

Zhao, Cui; Zhang, Xiaojun; Liu, Chengzhang; Huan, Pin; Li, Fuhua; Xiang, Jianhai; Huang, Chao

2012-05-01

Little is known about the genome of Pacific white shrimp (Litopenaeus vannamei). To address this, we conducted BAC (bacterial artificial chromosome) end sequencing of L. vannamei. We selected and sequenced 7 812 BAC clones from the BAC library LvHE from the two ends of the inserts by Sanger sequencing. After trimming and quality filtering, 11 279 BAC end sequences (BESs) including 4 609 paired-end BESs were obtained. The total length of the BESs was 4 340 753 bp, representing 0.18% of the L. vannamei haploid genome. The lengths of the BESs ranged from 100 bp to 660 bp with an average length of 385 bp. Analysis of the BESs indicated that the L. vannamei genome is AT-rich and that the primary repeat patterns were simple sequence repeats (SSRs) and low complexity sequences. Dinucleotide and hexanucleotide repeats were the most common SSR types in the BESs. The most abundant transposable element was gypsy, which may contribute to the generation of the large genome size of L. vannamei. We successfully annotated 4 519 BESs by BLAST searching, including genes involved in immunity and sex determination. Our results provide an important resource for functional gene studies, map construction and integration, and complete genome assembly for this species.
Transposon-like properties of the major, long repetitive sequence family in the genome of Physarum polycephalum

PubMed Central

Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman

1985-01-01

A family of long, highly-repetitive sequences, referred to previously as `HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the `HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small `cruciforms' or `Z-DNA' structures which might be recognised during the recombination process. PMID:16453652
Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca).

PubMed

Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun

2010-01-01

A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species, including giant panda, human, dog, cat and mouse, which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Highly sensitive detection of individual HEAT and ARM repeats with HHpred and COACH.

PubMed

Kippert, Fred; Gerloff, Dietlind L

2009-09-24

HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries. A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. 
It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains.
Highly Sensitive Detection of Individual HEAT and ARM Repeats with HHpred and COACH

PubMed Central

Kippert, Fred; Gerloff, Dietlind L.

2009-01-01

Background: HEAT and ARM repeats occur in a large number of eukaryotic proteins. As these repeats are often highly diverged, the prediction of HEAT or ARM domains can be challenging. Except for the most clear-cut cases, identification at the individual repeat level is indispensable, in particular for determining domain boundaries. However, methods using single sequence queries do not have the sensitivity required to deal with more divergent repeats and, when applied to proteins with known structures, in some cases failed to detect a single repeat. Methodology and Principal Findings: Testing algorithms which use multiple sequence alignments as queries, we found two of them, HHpred and COACH, to detect HEAT and ARM repeats with greatly enhanced sensitivity. Calibration against experimentally determined structures suggests the use of three score classes with increasing confidence in the prediction, and prediction thresholds for each method. When we applied a new protocol using both HHpred and COACH to these structures, it detected 82% of HEAT repeats and 90% of ARM repeats, with the minimum for a given protein of 57% for HEAT repeats and 60% for ARM repeats. Application to bona fide HEAT and ARM proteins or domains indicated that similar numbers can be expected for the full complement of HEAT/ARM proteins. A systematic screen of the Protein Data Bank for false positive hits revealed their number to be low, in particular for ARM repeats. Double false positive hits for a given protein were rare for HEAT and not at all observed for ARM repeats. In combination with fold prediction and consistency checking (multiple sequence alignments, secondary structure prediction, and position analysis), repeat prediction with the new HHpred/COACH protocol dramatically improves prediction in the twilight zone of fold prediction methods, as well as the delineation of HEAT/ARM domain boundaries.
Significance: A protocol is presented for the identification of individual HEAT or ARM repeats which is straightforward to implement. It provides high sensitivity at a low false positive rate and will therefore greatly enhance the accuracy of predictions of HEAT and ARM domains. PMID:19777061
Genetic characterization of the UCS and Kex1 loci of Pneumocystis jirovecii.

PubMed

Esteves, F; Tavares, A; Costa, M C; Gaspar, J; Antunes, F; Matos, O

2009-02-01

Nucleotide variation in the Pneumocystis jirovecii upstream conserved sequence (UCS) and kexin-like serine protease (Kex1) loci was studied in pulmonary specimens from Portuguese HIV-positive patients. DNA was extracted and used for specific molecular sequence analysis. The number of UCS tandem repeats detected in 13 successfully sequenced isolates ranged from three (9 isolates, 69%) to four (4 isolates, 31%). A novel tandem repeat pattern and two novel polymorphisms were detected in the UCS region. For the Kex1 gene, the wild-type (24 isolates, 86%) was the most frequent sequence detected among the 28 sequenced isolates. Nevertheless, a nonsynonymous (1 isolate, 3%) and three synonymous (3 isolates, 11%) polymorphisms were detected and are described here for the first time.

APE1 incision activity at abasic sites in tandem repeat sequences.

PubMed

Li, Mengxia; Völker, Jens; Breslauer, Kenneth J; Wilson, David M

2014-05-29

Repetitive DNA sequences, such as those present in microsatellites and minisatellites, telomeres, and trinucleotide repeats (linked to fragile X syndrome, Huntington disease, etc.), account for nearly 30% of the human genome. These domains exhibit enhanced susceptibility to oxidative attack to yield base modifications, strand breaks, and abasic sites; have a propensity to adopt non-canonical DNA forms modulated by the positions of the lesions; and, when not properly processed, can contribute to genome instability that underlies aging and disease development. Knowledge on the repair efficiencies of DNA damage within such repetitive sequences is therefore crucial for understanding the impact of such domains on genomic integrity. In the present study, using strategically designed oligonucleotide substrates, we determined the ability of human apurinic/apyrimidinic endonuclease 1 (APE1) to cleave at apurinic/apyrimidinic (AP) sites in a collection of tandem DNA repeat landscapes involving telomeric and CAG/CTG repeat sequences. Our studies reveal the differential influence of domain sequence, conformation, and AP site location/relative positioning on the efficiency of APE1 binding and strand incision. Intriguingly, our data demonstrate that APE1 endonuclease efficiency correlates with the thermodynamic stability of the DNA substrate. We discuss how these results have both predictive and mechanistic consequences for understanding the success and failure of repair protein activity associated with such oxidatively sensitive, conformationally plastic/dynamic repetitive DNA domains. Published by Elsevier Ltd.
The complete chloroplast genome of Cinnamomum camphora and its comparison with related Lauraceae species.

PubMed

Chen, Caihui; Zheng, Yongjie; Liu, Sian; Zhong, Yongda; Wu, Yanfang; Li, Jiang; Xu, Li-An; Xu, Meng

2017-01-01

Cinnamomum camphora, a member of the Lauraceae family, is a valuable aromatic and timber tree that is indigenous to the south of China and Japan. All parts of Cinnamomum camphora have secretory cells containing different volatile chemical compounds that are utilized as herbal medicines and essential oils. Here, we report the complete sequencing of the chloroplast genome of Cinnamomum camphora using Illumina technology. The chloroplast genome of Cinnamomum camphora is 152,570 bp in length and characterized by a relatively conserved quadripartite structure containing a large single copy region of 93,705 bp, a small single copy region of 19,093 bp and two inverted repeat (IR) regions of 19,886 bp each. Overall, the genome contained 123 coding regions, of which 15 were repeated in the IR regions. An analysis of chloroplast sequence divergence revealed that the small single copy region was highly variable among the different genera in the Lauraceae family. A total of 40 repeat structures and 83 simple sequence repeats were detected in both the coding and non-coding regions. A phylogenetic analysis indicated that Calycanthus is most closely related to Lauraceae, both being members of Laurales, which forms a sister group to Magnoliids. The complete sequence of the chloroplast of Cinnamomum camphora will aid in in-depth taxonomical studies of the Lauraceae family in the future. The genetic sequence information will also have valuable applications for chloroplast genetic engineering.
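The region lengths quoted in the abstract above can be checked against the reported genome size with simple arithmetic, since a quadripartite chloroplast genome is one large single copy region, one small single copy region, and two copies of the inverted repeat. The short Python sketch below does only that; the variable and function names are hypothetical.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 93_705            # large single copy region
SSC = 19_093            # small single copy region
IR = 19_886             # one inverted repeat copy
REPORTED_TOTAL = 152_570

def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)  # 152570 True
```

The sum matches the stated 152,570 bp only when each IR copy is 19,886 bp, which is how the abstract's figure is read here.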
[Detection of CRISPR and its relationship to drug resistance in Shigella].

PubMed

Wang, Linlin; Wang, Yingfang; Duan, Guangcai; Xue, Zerun; Guo, Xiangjiao; Wang, Pengfei; Xi, Yuanlin; Yang, Haiyan

2015-04-04

To detect clustered regularly interspaced short palindromic repeats (CRISPR) in Shigella, and to analyze its relationship to drug resistance. Four pairs of primers were used for the detection of convincing CRISPR structures CRISPR-S2 and CRISPR-S4, and questionable CRISPR structures CRISPR-S1 and CRISPR-S3, in 60 Shigella strains. All primers were designed using sequences in the CRISPR database. CRISPR Finder was used to analyze CRISPR, and susceptibilities of Shigella strains were tested by the agar diffusion method. Furthermore, we analyzed the relationship between drug resistance and CRISPR-S4. The positive rate of convincing CRISPR structures was 95%. The four CRISPR loci formed 12 spectral patterns (A-L), all of which contained convincing CRISPR structures except type K. We found one new repeat and 12 new spacers. The multi-drug resistance rate was 53.33%. We found no significant association between CRISPR-S4 and drug resistance. However, the repeat sequence of CRISPR-S4 in multi- or TE-resistant strains was mainly R4.1 with AC deletions in the 3' end, and the spacer sequences of CRISPR-S4 in multi-drug resistant strains were mainly Sp5.1, Sp6.1 and Sp7. CRISPR was common in Shigella. Variations of repeat sequences and diversities of spacer sequences might be related to drug resistance in Shigella.
The structure of TON1937 from archaeon Thermococcus onnurineus NA1 reveals a eukaryotic HEAT-like architecture.

PubMed

Jeong, Jae-Hee; Kim, Yi-Seul; Rojviriya, Catleya; Cha, Hyung Jin; Ha, Sung-Chul; Kim, Yeon-Gil

2013-10-01

The members of the ARM/HEAT repeat-containing protein superfamily in eukaryotes have been known to mediate protein-protein interactions by using their concave surface. However, little is known about the ARM/HEAT repeat proteins in prokaryotes. Here we report the crystal structure of TON1937, a hypothetical protein from the hyperthermophilic archaeon Thermococcus onnurineus NA1. The structure reveals a crescent-shaped molecule composed of a double layer of α-helices with seven anti-parallel α-helical repeats. A structure-based sequence alignment of the α-helical repeats identified a conserved pattern of hydrophobic or aliphatic residues reminiscent of the consensus sequence of eukaryotic HEAT repeats. The individual repeats of TON1937 also share high structural similarity with the canonical eukaryotic HEAT repeats. In addition, the concave surface of TON1937 is proposed to be its potential binding interface based on this structural comparison and its surface properties. These observations lead us to speculate that the archaeal HEAT-like repeats of TON1937 have evolved to engage in protein-protein interactions in the same manner as eukaryotic HEAT repeats. Copyright © 2013 Elsevier B.V. All rights reserved.
Mutation at a distance caused by homopolymeric guanine repeats in Saccharomyces cerevisiae

PubMed Central

McDonald, Michael J.; Yu, Yen-Hsin; Guo, Jheng-Fen; Chong, Shin Yen; Kao, Cheng-Fu; Leu, Jun-Yi

2016-01-01

Mutation provides the raw material from which natural selection shapes adaptations. The rate at which new mutations arise is therefore a key factor that determines the tempo and mode of evolution. However, an accurate assessment of the mutation rate of a given organism is difficult because mutation rate varies on a fine scale within a genome. A central challenge of evolutionary genetics is to determine the underlying causes of this variation. In earlier work, we had shown that repeat sequences not only are prone to a high rate of expansion and contraction but also can cause an increase in mutation rate (on the order of kilobases) of the sequence surrounding the repeat. We perform experiments that show that simple guanine repeats 13 bp (base pairs) in length or longer (G13+) increase the substitution rate 4- to 18-fold in the downstream DNA sequence, and this correlates with DNA replication timing (R = 0.89). We show that G13+ mutagenicity results from the interplay of both error-prone translesion synthesis and homologous recombination repair pathways. The mutagenic repeats that we study have the potential to be exploited for the artificial elevation of mutation rate in systems biology and synthetic biology applications. PMID:27386516
From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

PubMed

Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

2016-04-01

The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
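The screening idea described in the abstract above (flagging exact k-mers that recur, with a focus on 17-mers occurring more than three times) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name and toy sequence are hypothetical, only the forward strand is counted, and nothing here reproduces the authors' actual pipeline.

```python
from collections import Counter

def repeated_kmers(seq, k=17, min_count=4):
    """Return {kmer: count} for exact k-mers seen at least min_count times.

    min_count=4 encodes "more than three times"; only the given strand is scanned.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}

# Made-up toy sequence: one 17-mer planted four times between different spacers.
unit = "ACGTTGCAAGCTTAGGC"
toy = unit + "T" + unit + "GA" + unit + "CTC" + unit
print(repeated_kmers(toy))
```

For a genome of about 100 kb the Counter holds at most about 100,000 keys, so this brute-force count is cheap; whether to also count reverse complements is a design choice the abstract does not specify.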
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip

PubMed Central

Nelson, Gregory M.; Huffman, Holly; Smith, David F.

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function. PMID:14627198
Comparison of the carboxy-terminal DP-repeat region in the co-chaperones Hop and Hip.

PubMed

Nelson, Gregory M; Huffman, Holly; Smith, David F

2003-01-01

Functional steroid receptor complexes are assembled and maintained by an ordered pathway of interactions involving multiple components of the cellular chaperone machinery. Two of these components, Hop and Hip, serve as co-chaperones to the major heat shock proteins (Hsps), Hsp70 and Hsp90, and participate in intermediate stages of receptor assembly. In an effort to better understand the functions of Hop and Hip in the assembly process, we focused on a region of similarity located near the C-terminus of each co-chaperone. Contained within this region is a repeated sequence motif we have termed the DP repeat. Earlier mutagenesis studies implicated the DP repeat of either Hop or Hip in Hsp70 binding and in normal assembly of the co-chaperones with progesterone receptor (PR) complexes. We report here that the DP repeat lies within a protease-resistant domain that extends to or is near the C-terminus of both co-chaperones. Point mutations in the DP repeats render the C-terminal regions hypersensitive to proteolysis. In addition, a Hop DP mutant displays altered proteolytic digestion patterns, which suggest that the DP-repeat region influences the folding of other Hop domains. Although the respective DP regions of Hop and Hip share sequence and structural similarities, they are not functionally interchangeable. Moreover, a double-point mutation within the second DP-repeat unit of Hop that converts this to the sequence found in Hip disrupts Hop function; however, the corresponding mutation in Hip does not alter its function. We conclude that the DP repeats are important structural elements within a C-terminal domain, which is important for Hop and Hip function.
Selfish DNA in protein-coding genes of Rickettsia.

PubMed

Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M

2000-10-13

Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.
Complex structure of knob DNA on maize chromosome 9. Retrotransposon invasion into heterochromatin.

PubMed Central

Ananiev, E V; Phillips, R L; Rines, H W

1998-01-01

The recovery of maize (Zea mays L.) chromosome addition lines of oat (Avena sativa L.) from oat × maize crosses enables us to analyze the structure and composition of specific regions, such as knobs, of individual maize chromosomes. A DNA hybridization blot panel of eight individual maize chromosome addition lines revealed that 180-bp repeats found in knobs are present in each of these maize chromosomes, but the copy number varies from approximately 100 to 25,000. Cosmid clones with knob DNA segments were isolated from a genomic library of an oat-maize chromosome 9 addition line with the help of the 180-bp knob-associated repeated DNA sequence used as a probe. Cloned knob DNA segments revealed a complex organization in which blocks of tandemly arranged 180-bp repeating units are interrupted by insertions of other repeated DNA sequences, mostly represented by individual full size copies of retrotransposable elements. There is an obvious preference for the integration of retrotransposable elements into certain sites (hot spots) of the 180-bp repeat. Sequence microheterogeneity including point mutations and duplications was found in copies of 180-bp repeats. The 180-bp repeats within an array all had the same polarity. Restriction maps constructed for 23 cloned knob DNA fragments revealed the positions of polymorphic sites and sites of integration of insertion elements. Discovery of the interspersion of retrotransposable elements among blocks of tandem repeats in maize and some other organisms suggests that this pattern may be basic to heterochromatin organization for eukaryotes. PMID:9691055
Concerted evolution of the tandem array encoding primate U2 snRNA occurs in situ, without changing the cytological context of the RNU2 locus.

PubMed Central

Pavelitz, T; Rusché, L; Matera, A G; Scharf, J M; Weiner, A M

1995-01-01

In primates, the tandemly repeated genes encoding U2 small nuclear RNA evolve concertedly, i.e. the sequence of the U2 repeat unit is essentially homogeneous within each species but differs somewhat between species. Using chromosome painting and the NGFR gene as an outside marker, we show that the U2 tandem array (RNU2) has remained at the same chromosomal locus (equivalent to human 17q21) through multiple speciation events over >35 million years leading to the Old World monkey and hominoid lineages. The data suggest that the U2 tandem repeat, once established in the primate lineage, contained sequence elements favoring perpetuation and concerted evolution of the array in situ, despite a pericentric inversion in chimpanzee, a reciprocal translocation in gorilla and a paracentric inversion in orang utan. Comparison of the 11 kb U2 repeat unit found in baboon and other Old World monkeys with the 6 kb U2 repeat unit in humans and other hominids revealed that an ancestral U2 repeat unit was expanded by insertion of a 5 kb retrovirus bearing 1 kb long terminal repeats (LTRs). Subsequent excision of the provirus by homologous recombination between the LTRs generated a 6 kb U2 repeat unit containing a solo LTR. Remarkably, both junctions between the human U2 tandem array and flanking chromosomal DNA at 17q21 fall within the solo LTR sequence, suggesting a role for the LTR in the origin or maintenance of the primate U2 array. PMID:7828589
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

PubMed

Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

2007-07-01

Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
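To make the array layout described above concrete (conserved direct repeats, DR, alternating with unique spacers), the following Python sketch splits a sequence on exact copies of a known DR and returns what lies between them. It is a deliberately simplified illustration and does not reproduce CRISPRFinder's detection method: the DR is assumed to be known in advance and to match exactly, and the function name, DR, spacers and flanks are all made up.

```python
def extract_spacers(array_seq, dr):
    """Split a CRISPR-like array on exact copies of the direct repeat (DR).

    Sequence before the first DR and after the last DR is treated as
    flanking sequence (for example the leader), not as a spacer.
    """
    parts = array_seq.upper().split(dr.upper())
    return [p for p in parts[1:-1] if p]

# Made-up toy array: leader + DR + spacer1 + DR + spacer2 + DR + trailer.
DR = "ACCTGAGGCTTAACCGTAGGCTA"        # 23 bp, hypothetical
SP1 = "TTGACCATGGTCAAGTCCGATTGCA"     # hypothetical spacer
SP2 = "CGATTTACGGCATCCAGTTGACAGT"     # hypothetical spacer
toy_array = "ATATTTAATA" + DR + SP1 + DR + SP2 + DR + "GGCC"

print(extract_spacers(toy_array, DR))
```

Real direct repeats can carry point differences between copies, so a practical tool would need approximate matching; exact string splitting is used here only to keep the structure visible.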
Tactile Ranschburg effects: facilitation and inhibitory repetition effects analogous to verbal memory.

PubMed

Roe, Daisy; Miles, Christopher; Johnson, Andrew J

2017-07-01

The present paper examines the effect of within-sequence item repetitions in tactile order memory. Employing an immediate serial recall procedure, participants reconstructed a six-item sequence tapped upon their fingers by moving those fingers in the order of original stimulation. In Experiment 1a, within-sequence repetition of an item separated by two intervening items resulted in a significant reduction in recall accuracy for that repeated item (i.e., the Ranschburg effect). In Experiment 1b, within-sequence repetition of an adjacent item resulted in significant recall facilitation for that repeated item. These effects mirror those reported for verbal stimuli (e.g., Henson, 1998a. Item repetition in short-term memory: Ranschburg repeated. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24(5), 1162-1181. doi:10.1037/0278-7393.24.5.1162). These data are the first to demonstrate the Ranschburg effect with non-verbal stimuli and suggest further cross-modal similarities in order memory.
Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

PubMed

Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

2010-02-01

Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.
A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

USDA-ARS's Scientific Manuscript database

A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...
Fast and Cost-Effective Mining of Microsatellite Markers Using NGS Technology: An Example of a Korean Water Deer Hydropotes inermis argyropus

PubMed Central

Yu, Jeong-Nam; Won, Changman; Jun, Jumin; Lim, YoungWoon; Kwak, Myounghai

2011-01-01

Background Microsatellites, a special class of repetitive DNA sequence, have become one of the most popular genetic markers for population/conservation genetic studies. However, their application to endangered species has been impeded by high development costs, a lack of available sequences, and technical difficulties. The water deer Hydropotes inermis is the sole existing endangered species of the subfamily Capreolinae. Although population genetics studies are urgently required for conservation management, no species-specific microsatellite marker has been reported. Methods We adopted next-generation sequencing (NGS) to elucidate the microsatellite markers of Korean water deer and overcome these impediments to marker development. We performed genotyping to determine the efficiency of this method as applied to population genetics. Results We obtained 98 Mbp of nucleotide information from 260,467 sequence reads. A total of 20,101 di-/tri-nucleotide repeat motifs were identified; di-repeats were 5.9-fold more common than tri-repeats. [CA]n and [AAC]n/[AAT]n repeats were the most frequent di- and tri-repeats, respectively. Of the 17,206 di-repeats, 12,471 microsatellite primer pairs were derived. PCR amplification of 400 primer pairs yielded 106 amplicons and 79 polymorphic markers from 20 individual Korean water deer. Polymorphic rates of the 79 new microsatellites varied from 2 to 11 alleles per locus (He: 0.050–0.880; Ho: 0.000–1.000), while those of known microsatellite markers transferred from cattle to Chinese water deer ranged from 4 to 6 alleles per locus (He: 0.279–0.714; Ho: 0.300–0.400). Conclusions Polymorphic microsatellite markers from Korean water deer were successfully identified using NGS without any prior sequence information and deposited into the public database. Thus, the methods described herein represent a rapid and low-cost way to investigate the population genetics of endangered/non-model species. PMID:22069476
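For readers unfamiliar with the He and Ho statistics quoted above, the sketch below shows one common way to compute them for a single diploid locus. It assumes the simple uncorrected estimator He = 1 - sum(p_i^2); the abstract does not say which estimator the authors used, and the genotypes are invented for illustration.

```python
from collections import Counter

def heterozygosity(genotypes):
    """genotypes: list of (allele_a, allele_b) pairs for one diploid locus.

    Returns (Ho, He, number_of_alleles).
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Allele frequencies over all 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    # Expected heterozygosity under random mating (uncorrected for sample size).
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return ho, he, len(counts)

# Hypothetical allele sizes (bp) scored in four individuals.
sample = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(heterozygosity(sample))
```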
Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development.

PubMed

Sun, Cheng; Wyngaard, Grace; Walton, D Brian; Wichman, Holly A; Mueller, Rachel Lockridge

2014-03-11

Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution--some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15 - 75 Gb, 12-74 Gb of which are lost from pre-somatic cell lineages at germline--soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms.
Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development

PubMed Central

2014-01-01

Background Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution — some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15 – 75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline – soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Results Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Conclusions Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. 
Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent between tissue types within single organisms. PMID:24618421
Diversity and evolution of centromere repeats in the maize genome.

PubMed

Bilinski, Paul; Distor, Kevin; Gutierrez-Lopez, Jose; Mendoza, Gabriela Mendoza; Shi, Jinghua; Dawe, R Kelly; Ross-Ibarra, Jeffrey

2015-03-01

Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
Medium-sized tandem repeats represent an abundant component of the Drosophila virilis genome.

PubMed

Abdurashitov, Murat A; Gonchar, Danila A; Chernukhin, Valery A; Tomilov, Victor N; Tomilova, Julia E; Schostak, Natalia G; Zatsepina, Olga G; Zelentsova, Elena S; Evgen'ev, Michael B; Degtyarev, Sergey K H

2013-11-09

Previously, we developed a simple method for carrying out a restriction enzyme analysis of eukaryotic DNA in silico, based on the known DNA sequences of the genomes. This method allows the user to calculate the lengths of all DNA fragments that are formed after a whole genome is digested at the theoretical recognition sites of a given restriction enzyme. A comparison of the observed peaks in distribution diagrams with the results from DNA cleavage using several restriction enzymes performed in vitro has shown good correspondence between the theoretical and experimental data in several cases. Here, we applied this approach to the annotated genome of Drosophila virilis, which is extremely rich in various repeats, and explored a combined approach to the restriction analysis of D. virilis DNA. This approach revealed three abundant medium-sized tandem repeats within the D. virilis genome. While the 225 bp repeats were revealed previously in intergenic non-transcribed spacers between ribosomal genes of D. virilis, two other families composed of 154 bp and 172 bp repeats had not been described. A Tandem Repeats Finder search demonstrated that the 154 bp and 172 bp units are organized in multiple clusters in the genome of D. virilis. Characteristically, only the 154 bp repeats derived from a Helitron transposon are transcribed. Using in silico digestion in combination with conventional restriction analysis and sequencing of repeated DNA fragments enabled us to isolate and characterize three highly abundant families of medium-sized repeats present in the D. virilis genome. These repeats comprise a significant portion of the genome and may have important roles in genome function and structural integrity. Therefore, we demonstrated an approach that makes it possible to investigate in detail the gross arrangement and expression of medium-sized repeats based on sequencing data, even in the case of incompletely assembled and/or annotated genomes.
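The idea of an in silico digest can be illustrated with a short sketch: cut a sequence at every recognition site, then look for peaks in the fragment-length distribution, which is where a tandem repeat carrying one site per unit shows up. The function name, the 20 bp toy repeat unit, and the flanking bases are assumptions for illustration and are unrelated to the authors' software or to the D. virilis repeats.

```python
import re
from collections import Counter

def digest(seq, site, cut_offset=0):
    """Cut a linear sequence at every occurrence of `site`; return fragment lengths."""
    seq, site = seq.upper(), site.upper()
    # A lookahead finds every site, including overlapping ones.
    cuts = [m.start() + cut_offset
            for m in re.finditer("(?=%s)" % re.escape(site), seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Toy tandem array: five copies of a 20 bp unit that carries one EcoRI site (G^AATTC).
unit = "GAATTC" + "ACGTACGTTAGCCA"
genome = "TTTT" + unit * 5 + "CCCC"
fragments = digest(genome, "GAATTC", cut_offset=1)
# The most frequent fragment length corresponds to the repeat unit length.
print(Counter(fragments).most_common(1))
```

On a real genome the same length histogram would be computed per enzyme and compared with the bands seen after digestion in vitro.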

Targeting of Repeated Sequences Unique to a Gene Results in Significant Increases in Antisense Oligonucleotide Potency

PubMed Central

Vickers, Timothy A.; Freier, Susan M.; Bui, Huynh-Hoa; Watt, Andrew; Crooke, Stanley T.

2014-01-01

A new strategy for identifying potent RNase H-dependent antisense oligonucleotides (ASOs) is presented. Our analysis of the human transcriptome revealed that a significant proportion of genes contain unique repeated sequences of 16 or more nucleotides in length. Activities of ASOs targeting these repeated sites in several representative genes were compared to those of ASOs targeting unique single sites in the same transcript. Antisense activity at repeated sites was also evaluated in a highly controlled minigene system. Targeting both native and minigene repeat sites resulted in significant increases in potency as compared to targeting of non-repeated sites. The increased potency at these sites is a result of increased frequency of ASO/RNA interactions which, in turn, increases the probability of a productive interaction between the ASO/RNA heteroduplex and human RNase H1 in the cell. These results suggest a new, highly efficient strategy for rapid identification of highly potent ASOs. PMID:25334092
Comparative molecular cytogenetic analyses of a major tandemly repeated DNA family and retrotransposon sequences in cultivated jute Corchorus species (Malvaceae).

PubMed

Begum, Rabeya; Zakrzewski, Falk; Menzel, Gerhard; Weber, Beatrice; Alam, Sheikh Shamimul; Schmidt, Thomas

2013-07-01

The cultivated jute species Corchorus olitorius and Corchorus capsularis are important fibre crops. The analysis of repetitive DNA sequences, comprising a major part of plant genomes, has not been carried out in jute but is useful to investigate the long-range organization of chromosomes. The aim of this study was the identification of repetitive DNA sequences to facilitate comparative molecular and cytogenetic studies of two jute cultivars and to develop a fluorescent in situ hybridization (FISH) karyotype for chromosome identification. A plasmid library was generated from C. olitorius and C. capsularis with genomic restriction fragments of 100-500 bp, which was complemented by targeted cloning of satellite DNA by PCR. The diversity of the repetitive DNA families was analysed comparatively. The genomic abundance and chromosomal localization of different repeat classes were investigated by Southern analysis and FISH, respectively. The cytosine methylation of satellite arrays was studied by immunolabelling. Major satellite repeats and retrotransposons have been identified from C. olitorius and C. capsularis. The satellite family CoSat I forms two undermethylated species-specific subfamilies, while the long terminal repeat (LTR) retrotransposons CoRetro I and CoRetro II show similarity to the Metaviridae of plant retroelements. FISH karyotypes were developed by multicolour FISH using these repetitive DNA sequences in combination with 5S and 18S-5·8S-25S rRNA genes which enable the unequivocal chromosome discrimination in both jute species. The analysis of the structure and diversity of the repeated DNA is crucial for genome sequence annotation. The reference karyotypes will be useful for breeding of jute and provide the basis for karyotyping homeologous chromosomes of wild jute species to reveal the genetic and evolutionary relationship between cultivated and wild Corchorus species.
A Method for WD40 Repeat Detection and Secondary Structure Prediction

PubMed Central

Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

2013-01-01

WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program, WDSP (WD40 repeat protein Structure Predictor), has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. The disease-related protein LRRK2 was used as a representative example to demonstrate the structure prediction. PMID:23776530
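The Q3 figure quoted above is a per-residue accuracy over three secondary-structure states. A minimal sketch of the calculation follows; it assumes the usual H (helix), E (strand), C (coil) alphabet, and the two strings are invented rather than taken from WDSP output.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)

# Hypothetical 10-residue example with one mismatch.
print(q3_accuracy("CCEEEECCHH", "CCEEEECCCH"))
```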
Memory for sequences of events impaired in typical aging.

PubMed

Allen, Timothy A; Morris, Andrea M; Stark, Shauna M; Fortin, Norbert J; Stark, Craig E L

2015-03-01

Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18-22 yr) and older adults (62-86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented "in sequence" or "out of sequence." Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence ("Repeats"; e.g., AB A: DEF), (ii) skipping ahead in the sequence ("Skips"; e.g., AB D: DEF), and (iii) inserting an item from a different sequence into the same ordinal position ("Ordinal Transfers"; e.g., AB 3: DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. 
Collectively, these findings provide unambiguous evidence that the capacity to remember sequences of events is fundamentally affected by typical aging. © 2015 Allen et al.; Published by Cold Spring Harbor Laboratory Press.
Construction of a self-cloning sake yeast that overexpresses alcohol acetyltransferase gene by a two-step gene replacement protocol.

PubMed

Hirosawa, I; Aritomi, K; Hoshida, H; Kashiwagi, S; Nishizawa, Y; Akada, R

2004-07-01

The commercial application of genetically modified industrial microorganisms has been problematic due to public concerns. We constructed a "self-cloning" sake yeast strain that overexpresses the ATF1 gene encoding alcohol acetyltransferase, to improve the flavor profile of Japanese sake. A constitutive yeast overexpression promoter, TDH3p, derived from the glyceraldehyde-3-phosphate dehydrogenase gene from sake yeast was fused to ATF1; and the 5' upstream non-coding sequence of ATF1 was further fused to TDH3p-ATF1. The fragment was placed on a binary vector, pGG119, containing a drug-resistance marker for transformation and a counter-selection marker for excision of unwanted DNA. The plasmid was integrated into the ATF1 locus of a sake yeast strain. This integration constructed tandem repeats of ATF1 and TDH3p-ATF1 sequences, between which the plasmid was inserted. Loss of the plasmid, which occurs through homologous recombination between either the TDH3p downstream ATF1 repeats or the TDH3p upstream repeat sequences, was selected by growing transformants on counter-selective medium. Recombination between the downstream repeats led to reversion to a wild type strain, but that between the upstream repeats resulted in a strain that possessed TDH3p-ATF1 without the extraneous DNA sequences. The self-cloning TDH3p-ATF1 yeast strain produced a higher amount of isoamyl acetate. This is the first expression-controlled self-cloning industrial yeast.
The Complete Chloroplast Genome Sequence of a Relict Conifer Glyptostrobus pensilis: Comparative Analysis and Insights into Dynamics of Chloroplast Genome Rearrangement in Cupressophytes and Pinaceae

PubMed Central

Zheng, Renhua; Xu, Haibin; Zhou, Yanwei; Li, Meiping; Lu, Fengjuan; Dong, Yini; Liu, Xin; Chen, Jinhui; Shi, Jisen

2016-01-01

Glyptostrobus pensilis, belonging to the monotypic genus Glyptostrobus (Family: Cupressaceae), is an ancient conifer that is naturally distributed in low-lying wet areas. Here, we report the complete chloroplast (cp) genome sequence (132,239 bp) of G. pensilis. The G. pensilis cp genome is similar in gene content, organization and genome structure to the sequenced cp genomes from other cupressophytes, especially with respect to the loss of the inverted repeat region A (IRA). Through phylogenetic analysis, we demonstrated that the genus Glyptostrobus is closely related to the genus Cryptomeria, supporting previous findings based on physiological characteristics. Since IRs play an important role in stabilizing the cp genome, and conifer cp genomes lost different IR regions after splitting into two clades (cupressophytes and Pinaceae), we performed cp genome rearrangement analysis and found more extensive cp genome rearrangements among the species of cupressophytes relative to Pinaceae. Additional repeat analysis indicated that cupressophyte cp genomes contained fewer potentially functional repeats, especially in Cupressaceae, compared with Pinaceae. These results suggested that the dynamics of cp genome rearrangement in conifers differed since the two clades, Pinaceae and cupressophytes, lost IR copies independently and developed different repeats to complement the residual IRs. In addition, we identified 170 perfect simple sequence repeats that will be useful in future research focusing on the evolution of genetic diversity and conservation of genetic variation for this endangered species in the wild. PMID:27560965
Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

PubMed

Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

2017-07-01

PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.
Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

PubMed

Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

2002-01-01

Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy were due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region are similar to those of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.
Length and sequence heterogeneity in 5S rDNA of Populus deltoides.

PubMed

Negi, Madan S; Rajagopal, Jyothi; Chauhan, Neeti; Cronn, Richard; Lakshmikumaran, Malathi

2002-12-01

The 5S rRNA genes and their associated non-transcribed spacer (NTS) regions are present as repeat units arranged in tandem arrays in plant genomes. Length heterogeneity in 5S rDNA repeats was previously identified in Populus deltoides and was also observed in the present study. Primers were designed to amplify the 5S rDNA NTS variants from the P. deltoides genome. The PCR-amplified products from the two accessions of P. deltoides (G3 and G48) suggested the presence of length heterogeneity of 5S rDNA units within and among accessions, and the size of the spacers ranged from 385 to 434 bp. Sequence analysis of the NTS revealed two distinct classes of 5S rDNA within both accessions: class 1, which contained GAA trinucleotide microsatellite repeats, and class 2, which lacked the repeats. The class 1 spacer shows length variation owing to the microsatellite, with two clones exhibiting 10 GAA repeat units and one clone exhibiting 16 such repeat units. However, distance analysis shows that class 1 spacer sequences are highly similar inter se, yielding nucleotide diversity (pi) estimates that are less than 15% of those obtained for class 2 spacers (pi = 0.0183 vs. 0.1433, respectively). The presence of a microsatellite in the NTS region leading to variation in spacer length is reported and discussed for the first time in P. deltoides.
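Nucleotide diversity (pi), as quoted in this abstract, is the average number of differences per site between pairs of aligned sequences. The sketch below is a minimal version that assumes equal-length, gap-free alignments and treats every mismatch as one difference; the toy sequences are invented and the authors' distance method may differ.

```python
from itertools import combinations

def nucleotide_diversity(aligned):
    """Mean pairwise differences per site across all pairs of aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(aligned, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Three hypothetical 10 bp spacer clones; the third differs at one site.
clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
print(nucleotide_diversity(clones))
```

For scale, the two values reported in the abstract, 0.0183 and 0.1433, differ by a factor of roughly eight (0.0183 / 0.1433 is about 0.13).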
Cytogenetic Analysis of Populus trichocarpa - Ribosomal DNA, Telomere Repeat Sequence, and Marker-selected BACs

Treesearch

M.N. Islam-Faridi; C.D. Nelson; S.P. DiFazio; L.E. Gunter; G.A. Tuskan

2009-01-01

The 18S-28S rDNA and 5S rDNA loci in Populus trichocarpa were localized using fluorescent in situ hybridization (FISH). Two 18S-28S rDNA sites and one 5S rDNA site were identified and located at the ends of 3 different chromosomes. FISH signals from the Arabidopsis-type telomere repeat sequence were observed at the distal ends of each chromosome. Six BAC clones...
Non-RVD mutations that enhance the dynamics of the TAL repeat array along the superhelical axis improve TALEN genome editing efficacy

PubMed Central

Tochio, Naoya; Umehara, Kohei; Uewaki, Jun-ichi; Flechsig, Holger; Kondo, Masaharu; Dewa, Takehisa; Sakuma, Tetsushi; Yamamoto, Takashi; Saitoh, Takashi; Togashi, Yuichi; Tate, Shin-ichi

2016-01-01

Transcription activator-like effector (TALE) nuclease (TALEN) is widely used as a tool in genome editing. The DNA binding part of TALEN consists of a tandem array of TAL-repeats that form a right-handed superhelix. Each TAL-repeat recognises a specific base by the repeat variable diresidue (RVD) at positions 12 and 13. TALEN comprising the TAL-repeats with periodic mutations to residues at positions 4 and 32 (non-RVD sites) in each repeat (VT-TALE) exhibits increased efficacy in genome editing compared with a counterpart without the mutations (CT-TALE). The molecular basis for the elevated efficacy is unknown. In this report, comparison of the physicochemical properties between CT- and VT-TALEs revealed that VT-TALE has a larger amplitude motion along the superhelical axis (superhelical motion) compared with CT-TALE. The greater superhelical motion in VT-TALE enabled more TAL-repeats to engage in the target sequence recognition compared with CT-TALE. The extended sequence recognition by the TAL-repeats improves site specificity by limiting the spatial distribution of FokI domains to facilitate their dimerization at the desired site. Molecular dynamics simulations revealed that the non-RVD mutations alter inter-repeat hydrogen bonding to amplify the superhelical motion of VT-TALE. The TALEN activity is associated with the inter-repeat hydrogen bonding among the TAL repeats. PMID:27883072
Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: Noctuidae)

USDA-ARS's Scientific Manuscript database

Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh].

PubMed

Dutta, Sutapa; Kumawat, Giriraj; Singh, Bikram P; Gupta, Deepak K; Singh, Sangeeta; Dogra, Vivek; Gaikwad, Kishor; Sharma, Tilak R; Raje, Ranjeet S; Bandhopadhya, Tapas K; Datta, Subhojit; Singh, Mahendra N; Bashasab, Fakrudin; Kulwal, Pawan; Wanjari, K B; K Varshney, Rajeev; Cook, Douglas R; Singh, Nagendra K

2011-01-20

Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥ 18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. 
We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea.
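The polymorphism information content (PIC) values reported above are derived from allele frequencies at each marker. The sketch below uses the standard Botstein-style formula, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the abstract does not state which formula was applied, so treat this as an assumption, and the frequencies are illustrative only.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphism information content for one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings between identical heterozygotes, which are uninformative.
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical marker with four equally frequent alleles.
print(round(pic([0.25, 0.25, 0.25, 0.25]), 4))
```

A marker with more alleles at even frequencies scores higher, which is why PIC is often reported alongside the allele count when ranking markers.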
Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

PubMed Central

2011-01-01

Background: Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers developed using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results: In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified, for 2,877 of which PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4 to 10 and the polymorphism information content values ranged from 0.46 to 0.72. A neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population.
Conclusion: We developed 550 validated genic-SSR markers in pigeonpea using deep transcriptome sequencing. From these, 20 highly polymorphic markers were used to evaluate the genetic relationship among species of the genus Cajanus. A comprehensive set of genic-SSR markers was developed as an important genomic resource for diversity analysis and genetic mapping in pigeonpea. PMID:21251263
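As an illustration of the kind of screen described in this record (perfect di- to hexanucleotide repeats of at least 18 bp, with homopolymer runs excluded), a minimal Python sketch follows. It is not the authors' pipeline: the function names `is_primitive` and `find_ssrs` are hypothetical, only exact repeats are detected, and compound repeats are not handled.

```python
import re

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. AGAG -> AG)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))

def find_ssrs(seq, min_len=18, unit_sizes=(2, 3, 4, 5, 6)):
    """Return sorted (start, motif, copies) tuples for perfect SSRs of at least min_len bp."""
    seq = seq.upper()
    hits = []
    for k in unit_sizes:
        # A k-base unit followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # drops homopolymers and motifs already counted at a smaller k
            if len(m.group(0)) >= min_len:
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence (made up for illustration): (AG)12 and (GAA)6 embedded in flanking bases.
example = "TTT" + "AG" * 12 + "CCC" + "GAA" * 6 + "T"
print(find_ssrs(example))
```

The 18-bp threshold mirrors the cutoff quoted in the abstract; the motif sizes and the exact-match rule are assumptions made for the sketch.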
Contrasting Patterns of rDNA Homogenization within the Zygosaccharomyces rouxii Species Complex

PubMed Central

Chand Dakal, Tikam; Giudici, Paolo; Solieri, Lisa

2016-01-01

Arrays of repetitive ribosomal DNA (rDNA) sequences are generally expected to evolve as a coherent family, where repeats within such a family are more similar to each other than to orthologs in related species. The continuous homogenization of repeats within individual genomes is a recombination process termed concerted evolution. Here, we investigated the extent and the direction of concerted evolution in 43 yeast strains of the Zygosaccharomyces rouxii species complex (Z. rouxii, Z. sapae, Z. mellis), by analyzing two portions of the 35S rDNA cistron, namely the D1/D2 domains at the 5’ end of the 26S rRNA gene and the segment including the internal transcribed spacers (ITS) 1 and 2 (ITS regions). We demonstrate that intra-genomic rDNA sequence variation is unusually frequent in this clade and that rDNA arrays in single genomes consist of an intermixing of Z. rouxii, Z. sapae and Z. mellis-like sequences, putatively evolved by reticulate evolutionary events that involved repeated hybridization between lineages. The levels and distribution of sequence polymorphisms vary across rDNA repeats in different individuals, reflecting four patterns of rDNA evolution: I) rDNA repeats that are homogeneous within a genome but are chimeras derived from two parental lineages via recombination: Z. rouxii in the ITS region and Z. sapae in the D1/D2 region; II) intra-genomic rDNA repeats that retain polymorphisms only in ITS regions; III) rDNA repeats that vary only in their D1/D2 domains; IV) heterogeneous rDNA arrays that have both polymorphic ITS and D1/D2 regions. We argue that an ongoing process of homogenization following allodiplodization or incomplete lineage sorting gave rise to divergent evolutionary trajectories in different strains, depending upon temporal, structural and functional constraints. We discuss the consequences of these findings for Zygosaccharomyces species delineation and, more in general, for yeast barcoding. PMID:27501051
The organization and evolution of the Responder satellite in species of the Drosophila melanogaster group: dynamic evolution of a target of meiotic drive.

PubMed

Larracuente, Amanda M

2014-11-25

Satellite DNA can make up a substantial fraction of eukaryotic genomes and has roles in genome structure and chromosome segregation. The rapid evolution of satellite DNA can contribute to genomic instability and genetic incompatibilities between species. Despite its ubiquity and its contribution to genome evolution, we currently know little about the dynamics of satellite DNA evolution. The Responder (Rsp) satellite DNA family is found in the pericentric heterochromatin of chromosome 2 of Drosophila melanogaster. Rsp is well-known for being the target of Segregation Distorter (SD), an autosomal meiotic drive system in D. melanogaster. I present an evolutionary genetic analysis of the Rsp family of repeats in D. melanogaster and its closely-related species in the melanogaster group (D. simulans, D. sechellia, D. mauritiana, D. erecta, and D. yakuba) using a combination of available BAC sequences, whole genome shotgun Sanger reads, Illumina short read deep sequencing, and fluorescence in situ hybridization. I show that Rsp repeats have euchromatic locations throughout the D. melanogaster genome, that Rsp arrays show evidence for concerted evolution, and that Rsp repeats exist outside of D. melanogaster, in the melanogaster group. The repeats in these species are considerably diverged at the sequence level compared to D. melanogaster, and have a strikingly different genomic distribution, even between closely-related sister taxa. The genomic organization of the Rsp repeat in the D. melanogaster genome is complex: it consists of large blocks of tandem repeats in the heterochromatin and small blocks of tandem repeats in the euchromatin. My discovery of heterochromatic Rsp-like sequences outside of D. melanogaster suggests that SD evolved after its target satellite and that the evolution of the Rsp satellite family is highly dynamic over a short evolutionary time scale (<240,000 years).
Sequence of contactin, a 130-kD glycoprotein concentrated in areas of interneuronal contact, defines a new member of the immunoglobulin supergene family in the nervous system

PubMed Central

1988-01-01

The primary amino acid sequence of contactin, a neuronal cell surface glycoprotein of 130 kD that is isolated in association with components of the cytoskeleton (Ranscht, B., D. J. Moss, and C. Thomas. 1984. J. Cell Biol. 99:1803-1813), was deduced from the nucleotide sequence of cDNA clones and is reported here. The cDNA sequence contains an open reading frame for a 1,071-amino acid transmembrane protein with 962 extracellular and 89 cytoplasmic amino acids. In its extracellular portion, the polypeptide features six type 1 and two type 2 repeats. The six amino-terminal type 1 repeats (I-VI) each consist of 81-99 amino acids and contain two cysteine residues that are in the right context to form globular domains as described for molecules with immunoglobulin structure. Within the proposed globular region, contactin shares 31% identical amino acids with the neural cell adhesion molecule NCAM. The two type 2 repeats (I-II) are each composed of 100 amino acids and lack cysteine residues. They are 20-31% identical to fibronectin type III repeats. Both the structural similarity of contactin to molecules of the immunoglobulin supergene family, in particular the amino acid sequence resemblance to NCAM, and its relationship to fibronectin indicate that contactin could be involved in some aspect of cellular adhesion. This suggestion is further strengthened by its localization in neuropil containing axon fascicles and synapses. PMID:3049624
The Peculiar Landscape of Repetitive Sequences in the Olive (Olea europaea L.) Genome

PubMed Central

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-01-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome. PMID:24671744
The peculiar landscape of repetitive sequences in the olive (Olea europaea L.) genome.

PubMed

Barghini, Elena; Natali, Lucia; Cossu, Rosa Maria; Giordani, Tommaso; Pindo, Massimo; Cattonaro, Federica; Scalabrin, Simone; Velasco, Riccardo; Morgante, Michele; Cavallini, Andrea

2014-04-01

Analyzing genome structure in different species provides insight into the evolution of plant genome size. Olive (Olea europaea L.) has a medium-sized haploid genome of 1.4 Gb, whose structure is largely uncharacterized, despite the growing importance of this tree as an oil crop. Next-generation sequencing technologies and different computational procedures have been used to study the composition of the olive genome and its repetitive fraction. A total of 2.03 and 2.3 genome equivalents of Illumina and 454 reads from genomic DNA, respectively, were assembled following different procedures, which produced more than 200,000 differently redundant contigs, with mean length greater than 1,000 nt. Mapping Illumina reads onto the assembled sequences was used to estimate their redundancy. The genome data set was subdivided into highly and medium redundant and nonredundant contigs. By combining identification and mapping of repeated sequences, it was established that tandem repeats represent a very large portion of the olive genome (∼31% of the whole genome), consisting of six main families of different length, two of which were first discovered in these experiments. The other large redundant class in the olive genome is represented by transposable elements (especially long terminal repeat-retrotransposons). On the whole, the results of our analyses show the peculiar landscape of the olive genome, related to the massive amplification of tandem repeats, more than that reported for any other sequenced plant genome.
The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the Sino-Himalayan subregion.

PubMed

Ni, Lianghong; Zhao, Zhili; Xu, Hongxi; Chen, Shilin; Dorje, Gaawe

2016-02-15

Endemic to the Sino-Himalayan subregion, the medicinal alpine plant Gentiana straminea is a threatened species. Genetic and molecular data on it are scarce. Here we report the complete chloroplast (cp) genome sequence of G. straminea, the first sequenced member of the family Gentianaceae. The cp genome is 148,991 bp in length, including a large single copy (LSC) region of 81,240 bp, a small single copy (SSC) region of 17,085 bp and a pair of inverted repeats (IRs) of 25,333 bp. It contains 112 unique genes, including 78 protein-coding genes, 30 tRNAs and 4 rRNAs. The rps16 gene lacks exon2 between trnK-UUU and trnQ-UUG, which is the first rps16 pseudogene found in the nonparasitic plants of the Asterids clade. Sequence analysis revealed the presence of 13 forward repeats, 13 palindrome repeats and 39 simple sequence repeats (SSRs). An entire cp genome comparison study of G. straminea and four other species in Gentianales was carried out. Phylogenetic analyses using maximum likelihood (ML) and maximum parsimony (MP) were performed based on 69 protein-coding genes from 36 species of Asterids. The results strongly supported the position of Gentianaceae as one member of the order Gentianales. The complete chloroplast genome sequence will provide intragenic information for its conservation and contribute to research on the genetic and phylogenetic analyses of Gentianales and Asterids. Copyright © 2015 Elsevier B.V. All rights reserved.
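The region sizes quoted in this record can be checked against the stated total with simple arithmetic. The sketch below assumes the usual quadripartite plastome layout (one LSC, one SSC and two IR copies); the function name `quadripartite_length` is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total plastome length for one LSC, one SSC and two inverted-repeat copies."""
    return lsc + ssc + 2 * ir

total = quadripartite_length(lsc=81_240, ssc=17_085, ir=25_333)
print(total)  # 148991, which agrees with the reported genome length in bp
```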

Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jackson, P.J.; Walthers, E.A.; Richmond, K.L.

1997-04-01

PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
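The link between the 12-bp repeat and the QYQQ peptide can be seen by translating the unit in frame, and an allele can be assigned by counting back-to-back copies. The Python sketch below is illustrative only: the helper names and the flanking bases in the example are hypothetical, and only exact copies of the repeat are counted.

```python
REPEAT = "CAATATCAACAA"  # 12-bp vrrA repeat unit quoted in the abstract

# Only the two codons that occur in the repeat (standard genetic code).
CODON_TABLE = {"CAA": "Q", "TAT": "Y"}

def translate(dna):
    """Translate an in-frame DNA string made up of the codons listed above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def max_tandem_copies(seq, unit=REPEAT):
    """Length (in copies) of the longest run of exact, back-to-back copies of unit."""
    best = 0
    i = seq.find(unit)
    while i != -1:
        n = 0
        while seq.startswith(unit, i + n * len(unit)):
            n += 1
        best = max(best, n)
        i = seq.find(unit, i + 1)
    return best

print(translate(REPEAT))  # QYQQ
# Made-up amplicon: four copies of the repeat between arbitrary flanking bases.
print(max_tandem_copies("GG" + REPEAT * 4 + "TT"))  # 4
```

Because the unit is 12 bp (four codons), adding or removing copies changes the protein length by multiples of four residues without shifting the reading frame, as the abstract states.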
Modular probes for enriching and detecting complex nucleic acid sequences

NASA Astrophysics Data System (ADS)

Wang, Juexiao Sherry; Yan, Yan Helen; Zhang, David Yu

2017-12-01

Complex DNA sequences are difficult to detect and profile, but are important contributors to human health and disease. Existing hybridization probes lack the capability to selectively bind and enrich hypervariable, long or repetitive sequences. Here, we present a generalized strategy for constructing modular hybridization probes (M-Probes) that overcomes these challenges. We demonstrate that M-Probes can tolerate sequence variations of up to 7 nt at prescribed positions while maintaining single nucleotide sensitivity at other positions. M-Probes are also shown to be capable of sequence-selectively binding a continuous DNA sequence of more than 500 nt. Furthermore, we show that M-Probes can detect genes with triplet repeats exceeding a programmed threshold. As a demonstration of this technology, we have developed a hybrid capture method to determine the exact triplet repeat expansion number in the Huntington's gene of genomic DNA using quantitative PCR.
Rapid construction of insulated genetic circuits via synthetic sequence-guided isothermal assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Torella, JP; Boehm, CR; Lienert, F

2013-12-28

In vitro recombination methods have enabled one-step construction of large DNA sequences from multiple parts. Although synthetic biological circuits can in principle be assembled in the same fashion, they typically contain repeated sequence elements such as standard promoters and terminators that interfere with homologous recombination. Here we use a computational approach to design synthetic, biologically inactive unique nucleotide sequences (UNSes) that facilitate accurate ordered assembly. Importantly, our designed UNSes make it possible to assemble parts with repeated terminator and insulator sequences, and thereby create insulated functional genetic circuits in bacteria and mammalian cells. Using UNS-guided assembly to construct repeating promoter-gene-terminator parts, we systematically varied gene expression to optimize production of a deoxychromoviridans biosynthetic pathway in Escherichia coli. We then used this system to construct complex eukaryotic AND-logic gates for genomic integration into embryonic stem cells. Construction was performed by using a standardized series of UNS-bearing BioBrick-compatible vectors, which enable modular assembly and facilitate reuse of individual parts. UNS-guided isothermal assembly is broadly applicable to the construction and optimization of genetic circuits and particularly those requiring tight insulation, such as complex biosynthetic pathways, sensors, counters and logic gates.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence.

PubMed

Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J

2015-09-18

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The La-related protein 1-specific domain repurposes HEAT-like repeats to directly bind a 5'TOP sequence

DOE PAGES

Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...

2015-07-22

La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Iterative dictionary construction for compression of large DNA data sets.

PubMed

Kuruppu, Shanika; Beresford-Smith, Bryan; Conway, Thomas; Zobel, Justin

2012-01-01

Genomic repositories increasingly include individual as well as reference sequences, which tend to share long identical and near-identical strings of nucleotides. However, the sequential processing used by most compression algorithms, and the volumes of data involved, mean that these long-range repetitions are not detected. An order-insensitive, disk-based dictionary construction method can detect this repeated content and use it to compress collections of sequences. We explore a dictionary construction method that improves repeat identification in large DNA data sets. Our adaptation, COMRAD, of an existing disk-based method identifies exact repeated content in collections of sequences with similarities within and across the set of input sequences. COMRAD compresses the data over multiple passes, which is an expensive process, but allows COMRAD to compress large data sets within reasonable time and space. COMRAD allows for random access to individual sequences and subsequences without decompressing the whole data set. COMRAD has no competitor in terms of the size of data sets that it can compress (extending to many hundreds of gigabytes) and, even for smaller data sets, the results are competitive compared to alternatives; as an example, 39 S. cerevisiae genomes compressed to 0.25 bits per base.
The primitive code and repeats of base oligomers as the primordial protein-encoding sequence.

PubMed Central

Ohno, S; Epplen, J T

1983-01-01

Even if the prebiotic self-replication of nucleic acids and the subsequent emergence of primitive, enzyme-independent tRNAs are accepted as plausible, the origin of life by spontaneous generation still appears improbable. This is because the just-emerged primitive translational machinery had to cope with base sequences that were not preselected for their coding potentials. Particularly if the primitive mitochondria-like code with four chain-terminating base triplets preceded the universal code, the translation of long, randomly generated, base sequences at this critical stage would have merely resulted in the production of short oligopeptides instead of long polypeptide chains. We present the base sequence of a mouse transcript containing tetranucleotide repeats conserved during evolution. Even if translated in accordance with the primitive mitochondria-like code, this transcript in its three reading frames can yield 245-, 246-, and 251-residue-long tetrapeptidic periodical polypeptides that are already acquiring longer periodicities. We contend that the first set of base sequences translated at the beginning of life were such oligonucleotide repeats. By quickly acquiring longer periodicities, their products must have soon gained characteristic secondary structures--alpha-helical or beta-sheet or both. PMID:6574491
Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

USDA-ARS's Scientific Manuscript database

Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres comprise megabase-scale arrays of tandem repeats. The true prevalence of centromere tandem repeats, and whether they exhibit conserved seque...
Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

PubMed

Scalvenzi, Thibault; Pollet, Nicolas

2014-12-01

The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.
The CentO satellite confers translational and rotational phasing on cenH3 nucleosomes in rice centromeres.

PubMed

Zhang, Tao; Talbert, Paul B; Zhang, Wenli; Wu, Yufeng; Yang, Zujun; Henikoff, Jorja G; Henikoff, Steven; Jiang, Jiming

2013-12-10

Plant and animal centromeres comprise megabases of highly repeated satellite sequences, yet centromere function can be specified epigenetically on single-copy DNA by the presence of nucleosomes containing a centromere-specific variant of histone H3 (cenH3). We determined the positions of cenH3 nucleosomes in rice (Oryza sativa), which has centromeres composed of both the 155-bp CentO satellite repeat and single-copy non-CentO sequences. We find that cenH3 nucleosomes protect 90-100 bp of DNA from micrococcal nuclease digestion, sufficient for only a single wrap of DNA around the cenH3 nucleosome core. cenH3 nucleosomes are translationally phased with 155-bp periodicity on CentO repeats, but not on non-CentO sequences. CentO repeats have an ∼10-bp periodicity in WW dinucleotides and in micrococcal nuclease cleavage, providing evidence for rotational phasing of cenH3 nucleosomes on CentO and suggesting that satellites evolve for translational and rotational stabilization of centromeric nucleosomes.
Fingerprinting of Cyanobacteria Based on PCR with Primers Derived from Short and Long Tandemly Repeated Repetitive Sequences

PubMed Central

Rasmussen, Ulla; Svenning, Mette M.

1998-01-01

The presence of repeated DNA (short tandemly repeated repetitive [STRR] and long tandemly repeated repetitive [LTRR]) sequences in the genome of cyanobacteria was used to generate a fingerprint method for symbiotic and free-living isolates. Primers corresponding to the STRR and LTRR sequences were used in the PCR, resulting in a method that generates specific fingerprints for individual isolates. The method was useful both with purified DNA and with intact cyanobacterial filaments or cells as templates for the PCR. Twenty-three Nostoc isolates from a total of 35 were symbiotic isolates from the angiosperm Gunnera species, including isolates from the same Gunnera species as well as from different species. The results show a genetic similarity among isolates from different Gunnera species as well as a genetic heterogeneity among isolates from the same Gunnera species. Isolates which have been postulated to be closely related or identical revealed similar results by the PCR method, indicating that the technique is useful for clustering of even closely related strains. The method was applied to nonheterocystous cyanobacteria from which a fingerprint pattern was obtained. PMID:16349487
Similarities in the chromosomal distribution of AG and AC repeats within and between Drosophila, human and barley chromosomes.

PubMed

Cuadrado, A; Jouve, N

2007-01-01

Two simple sequence repeats (SSRs), AG and AC, were mapped directly in the metaphase chromosomes of man and barley (Hordeum vulgare L.), and in the metaphase and polytene chromosomes of Drosophila melanogaster. To this end, synthetic oligonucleotides corresponding to (AG)12 and (AC)8 were labelled by the random primer technique and used as probes in fluorescent in situ hybridisation (FISH) under high stringency and strict washing conditions. The distribution and intensity of the signals for the repeat sequences were found to be characteristic of the chromosomes and genomes of the three species analysed. The AC repeat sites were uniformly dispersed along the euchromatic segments of all three genomes; in fact, they were largely excluded from the heterochromatin. The Drosophila genome showed a high density of AC sequences on the X chromosome in both mitotic and polytene nuclei. In contrast, the AG repeats were associated with the euchromatic regions of the polytene chromosomes (and in high density on the X chromosome), but were only seen in specific heterochromatic regions in the mitotic chromosomes of all three species. In Drosophila, the AG repeats were exclusively distributed on the tips of the Y chromosome and near the centromere on both arms of chromosome 2. In barley and man, AG repeats were associated with the centromeres (of all chromosomes) and nucleolar organizer regions, respectively. The conserved chromosome distribution of AC within and between these three phylogenetically distant species, and the association of AG in specific chromosome regions with structural or functional properties, suggests that long clusters of these repeats may have some, as yet unknown, role. Copyright (c) 2007 S. Karger AG, Basel.
Short Tandem Repeat DNA Internet Database

National Institute of Standards and Technology Data Gateway

SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access) Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.
CRISPR recognition tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats.

PubMed

Bland, Charles; Ramsey, Teresa L; Sabree, Fareedah; Lowe, Micheal; Brown, Kyndall; Kyrpides, Nikos C; Hugenholtz, Philip

2007-06-18

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel type of direct repeat found in a wide range of bacteria and archaea. CRISPRs are beginning to attract attention because of their proposed mechanism; that is, defending their hosts against invading extrachromosomal elements such as viruses. Existing repeat detection tools do a poor job of identifying CRISPRs due to the presence of unique spacer sequences separating the repeats. In this study, a new tool, CRT, is introduced that rapidly and accurately identifies CRISPRs in large DNA strings, such as genomes and metagenomes. CRT was compared to CRISPR detection tools, Patscan and Pilercr. In terms of correctness, CRT was shown to be very reliable, demonstrating significant improvements over Patscan for measures precision, recall and quality. When compared to Pilercr, CRT showed improved performance for recall and quality. In terms of speed, CRT proved to be a huge improvement over Patscan. Both CRT and Pilercr were comparable in speed, however CRT was faster for genomes containing large numbers of repeats. In this paper a new tool was introduced for the automatic detection of CRISPR elements. This tool, CRT, showed some important improvements over current techniques for CRISPR identification. CRT's approach to detecting repetitive sequences is straightforward. It uses a simple sequential scan of a DNA sequence and detects repeats directly without any major conversion or preprocessing of the input. This leads to a program that is easy to describe and understand; yet it is very accurate, fast and memory efficient, being O(n) in space and O(nm/l) in time.
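The idea of a sequential scan for short repeats separated by unique spacers can be sketched in a few lines of Python. This is a simplified illustration, not CRT's published algorithm or code: the function name, the parameters (`k`, `min_spacer`, `max_spacer`, `min_copies`) and their defaults are all assumptions, and only exact k-mer matches are considered.

```python
def find_spaced_repeats(seq, k=8, min_spacer=5, max_spacer=20, min_copies=3):
    """Return (repeat, positions) for exact k-mers that recur with spacers in the given range."""
    n = len(seq)
    results = []
    i = 0
    while i + k <= n:
        unit = seq[i:i + k]
        positions = [i]
        pos = i
        while True:
            # The next copy must start min_spacer..max_spacer bases after the current copy ends.
            lo = pos + k + min_spacer
            hi = pos + k + max_spacer
            nxt = seq.find(unit, lo, hi + k)
            if nxt == -1:
                break
            positions.append(nxt)
            pos = nxt
        if len(positions) >= min_copies:
            results.append((unit, positions))
            i = positions[-1] + k  # continue scanning after the reported array
        else:
            i += 1
    return results

# Toy sequence (made up): one 8-bp repeat, three copies, spacers of 7 and 8 bp.
repeat = "GTTTCAGA"
toy = "AAAA" + repeat + "ACGTACG" + repeat + "TGCATGCA" + repeat + "CC"
print(find_spaced_repeats(toy))
```

A plain tandem array can also satisfy the spacing rule in this sketch, so a practical tool would need additional filters (for example on spacer similarity) before calling a CRISPR.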
CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bland, Charles; Ramsey, Teresa L.; Sabree, Fareedah

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel type of direct repeat found in a wide range of bacteria and archaea. CRISPRs are beginning to attract attention because of their proposed mechanism; that is, defending their hosts against invading extrachromosomal elements such as viruses. Existing repeat detection tools do a poor job of identifying CRISPRs due to the presence of unique spacer sequences separating the repeats. In this study, a new tool, CRT, is introduced that rapidly and accurately identifies CRISPRs in large DNA strings, such as genomes and metagenomes. CRT was compared to CRISPR detection tools, Patscan and Pilercr. In terms of correctness, CRT was shown to be very reliable, demonstrating significant improvements over Patscan for measures precision, recall and quality. When compared to Pilercr, CRT showed improved performance for recall and quality. In terms of speed, CRT also demonstrated superior performance, especially for genomes containing large numbers of repeats. In this paper a new tool was introduced for the automatic detection of CRISPR elements. This tool, CRT, was shown to be a significant improvement over the current techniques for CRISPR identification. CRT's approach to detecting repetitive sequences is straightforward. It uses a simple sequential scan of a DNA sequence and detects repeats directly without any major conversion or preprocessing of the input. This leads to a program that is easy to describe and understand; yet it is very accurate, fast and memory efficient, being O(n) in space and O(nm/l) in time.
A complete mitochondrial genome sequence of Asian black bear Sichuan subspecies (Ursus thibetanus mupinensis)

PubMed Central

Hou, Wan-ru; Chen, Yu; Wu, Xia; Hu, Jin-chu; Peng, Zheng-song; Yang, Jung; Tang, Zong-xiang; Zhou, Cai-Quan; Li, Yu-ming; Yang, Shi-kui; Du, Yu-jie; Kong, Ling-lu; Ren, Zheng-long; Zhang, Huai-yu; Shuai, Su-rong

2007-01-01

We obtained the complete mitochondrial genome of U. thibetanus mupinensis by DNA sequencing based on the PCR fragments of 18 primers we designed. The results indicate that the mtDNA is 16 868 bp in size, encodes 13 protein genes, 22 tRNA genes, and 2 rRNA genes, with an overall H-strand base composition of 31.2% A, 25.4% C, 15.5% G and 27.9% T. The control region (CR), located between tRNA-Pro and tRNA-Phe, is 1422 bp in size, constitutes 8.43% of the whole genome, has a GC content of 51.9%, and contains a 6 bp tandem repeat and two 10 bp tandem repeats identified using the Tandem Repeats Finder. The U. thibetanus mupinensis mitochondrial genome shares high similarity with those of three other Ursidae: U. americanus (91.46%), U. arctos (89.25%) and U. maritimus (87.66%). PMID:17205108
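The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures reported above; the variable names are chosen here for illustration.

```python
# Figures taken from the abstract above.
genome_bp = 16868          # complete mtDNA length
control_region_bp = 1422   # control region (CR) length

# Share of the genome occupied by the control region, in percent.
cr_share = 100 * control_region_bp / genome_bp
print(f"CR share of genome: {cr_share:.2f}%")

# Overall H-strand base composition, in percent.
composition = {"A": 31.2, "C": 25.4, "G": 15.5, "T": 27.9}
total = sum(composition.values())
whole_genome_gc = composition["G"] + composition["C"]
print(f"composition total: {total:.1f}%")
print(f"whole-genome GC: {whole_genome_gc:.1f}%")
```

The whole-genome GC value computed here is distinct from the 51.9% GC content that the abstract reports for the control region alone.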
The DL1 repeats in the genome of Diphyllobothrium latum.

PubMed

Usmanova, Nadezhda M; Kazakov, Vasiliy I

2010-07-01

Diphyllobothrium latum is a widespread intestinal parasite of great clinical relevance, but no sequences of its nuclear genome are available. In this paper, a repetitive element in the D. latum genome is described for the first time. The adult D. latum was obtained after expulsion from the intestine of a patient suffering from diphyllobothriasis. Genomic DNA was isolated from several proglottids of this individual. PstI restriction products of D. latum genomic DNA were sequenced. Polymerase chain reaction (PCR) amplification of these products using genomic DNA and selected primers was carried out. A cluster of a repetitive element, called DL1, was thereby discovered. To identify the beginning and end of the repeat precisely, a product of PCR amplification of D. latum genomic DNA with one specific primer was sequenced. In the discussion, several lines of evidence are presented that the DL1 repeat is a member of the SINE family of retroposons.
Complete Sequence and Comparative Analysis of the Chloroplast Genome of Coconut Palm (Cocos nucifera)

PubMed Central

Huang, Ya-Yi; Matzke, Antonius J. M.; Matzke, Marjori

2013-01-01

Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available. PMID:24023703
Complete sequence and comparative analysis of the chloroplast genome of coconut palm (Cocos nucifera).

PubMed

Huang, Ya-Yi; Matzke, Antonius J M; Matzke, Marjori

2013-01-01

Coconut, a member of the palm family (Arecaceae), is one of the most economically important trees used by mankind. Despite its diverse morphology, coconut is recognized taxonomically as only a single species (Cocos nucifera L.). There are two major coconut varieties, tall and dwarf, the latter of which displays traits resulting from selection by humans. We report here the complete chloroplast (cp) genome of a dwarf coconut plant, and describe the gene content and organization, inverted repeat fluctuations, repeated sequence structure, and occurrence of RNA editing. Phylogenetic relationships of monocots were inferred based on 47 chloroplast protein-coding genes. Potential nodes for events of gene duplication and pseudogenization related to inverted repeat fluctuation were mapped onto the tree using parsimony criteria. We compare our findings with those from other palm species for which complete cp genome sequences are available.
Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers

PubMed Central

Quiroz, Felipe García; Chilkoti, Ashutosh

2015-01-01

Proteins and synthetic polymers that undergo aqueous phase transitions mediate self-assembly in nature and in man-made material systems. Yet little is known about how the phase behaviour of a protein is encoded in its amino acid sequence. Here, by synthesizing intrinsically disordered, repeat proteins to test motifs that we hypothesized would encode phase behaviour, we show that the proteins can be designed to exhibit tunable lower or upper critical solution temperature (LCST and UCST, respectively) transitions in physiological solutions. We also show that mutation of key residues at the repeat level abolishes phase behaviour or encodes an orthogonal transition. Furthermore, we provide heuristics to identify, at the proteome level, proteins that might exhibit phase behaviour and to design novel protein polymers consisting of biologically active peptide repeats that exhibit LCST or UCST transitions. These findings set the foundation for the prediction and encoding of phase behaviour at the sequence level. PMID:26390327

A SSR-based genetic linkage map of cultivated peanut (Arachis hypogaea L.)

USDA-ARS's Scientific Manuscript database

The objective of this study was to construct a molecular linkage map of cultivated tetraploid peanut using simple sequence repeat (SSR) markers derived primarily from peanut genomic sequences, expressed sequence tags (ESTs), and by "data mining" sequences released in GenBank. Three recombinant inbre...
Long-read sequencing and de novo assembly of a Chinese genome

USDA-ARS's Scientific Manuscript database

Short-read sequencing has enabled the de novo assembly of several individual human genomes, but with inherent limitations in characterizing repeat elements. Here we sequence a Chinese individual HX1 by single-molecule real-time (SMRT) long-read sequencing, construct a physical map by NanoChannel arr...
Development and transferability of black and red raspberry microsatellite markers from short-read sequences

USDA-ARS's Scientific Manuscript database

The advent of next-generation sequencing technologies has been a boon to the cost-effective development of molecular markers, particularly in non-model species. Here, we demonstrate the efficiency of microsatellite or simple sequence repeat (SSR) marker development from short-read sequences using th...
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

USDA-ARS's Scientific Manuscript database

Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
Repetitive sequence analysis and karyotyping reveals centromere-associated DNA sequences in radish (Raphanus sativus L.).

PubMed

He, Qunyan; Cai, Zexi; Hu, Tianhua; Liu, Huijun; Bao, Chonglai; Mao, Weihai; Jin, Weiwei

2015-04-18

Radish (Raphanus sativus L., 2n = 2x = 18) is a major root vegetable crop especially in eastern Asia. Radish root contains various nutrients which play an important role in strengthening immunity. Repetitive elements are primary components of the genomic sequence and the most important factors in genome size variations in higher eukaryotes. To date, studies about repetitive elements of radish are still limited. To better understand genome structure of radish, we undertook a study to evaluate the proportion of repetitive elements and their distribution in radish. We conducted genome-wide characterization of repetitive elements in radish with low coverage genome sequencing followed by similarity-based cluster analysis. Results showed that about 31% of the genome was composed of repetitive sequences. Satellite repeats were the dominant elements of the genome. The distribution pattern of three satellite repeat sequences (CL1, CL25, and CL43) on radish chromosomes was characterized using fluorescence in situ hybridization (FISH). CL1 was predominantly located at the centromeric region of all chromosomes, CL25 was located at the subtelomeric region, and CL43 was a telomeric satellite. FISH signals of two satellite repeats, CL1 and CL25, together with 5S rDNA and 45S rDNA, provide useful cytogenetic markers to identify each individual somatic metaphase chromosome. The centromere-specific histone H3 (CENH3) has been used as a marker to identify centromere DNA sequences. One putative CENH3 (RsCENH3) was characterized and cloned from radish. Its deduced amino acid sequence shares high similarities to those of the CENH3s in Brassica species. An antibody against B. rapa CENH3 specifically stained radish centromeres. Immunostaining and chromatin immunoprecipitation (ChIP) tests with anti-BrCENH3 antibody demonstrated that both the centromere-specific retrotransposon (CR-Radish) and satellite repeat (CL1) are directly associated with RsCENH3 in radish.
Proportions of repetitive elements in radish were estimated and satellite repeats were the dominant elements. A fine karyotyping analysis was established which allows us to easily identify each individual somatic metaphase chromosome. Immunofluorescence- and ChIP-based assays demonstrated the functional significance of satellite and centromere-specific retrotransposon at centromeres. Our study provides a valuable basis for future genomic studies in radish.
R-loops: targets for nuclease cleavage and repeat instability.

PubMed

Freudenreich, Catherine H

2018-01-11

R-loops form when transcribed RNA remains bound to its DNA template to form a stable RNA:DNA hybrid. Stable R-loops form when the RNA is purine-rich, and are further stabilized by DNA secondary structures on the non-template strand. Interestingly, many expandable and disease-causing repeat sequences form stable R-loops, and R-loops can contribute to repeat instability. Repeat expansions are responsible for multiple neurodegenerative diseases, including Huntington's disease, myotonic dystrophy, and several types of ataxias. Recently, it was found that R-loops at an expanded CAG/CTG repeat tract cause DNA breaks as well as repeat instability (Su and Freudenreich, Proc Natl Acad Sci USA 114, E8392-E8401, 2017). Two factors were identified as causing R-loop-dependent breaks at CAG/CTG tracts: deamination of cytosines and the MutLγ (Mlh1-Mlh3) endonuclease, defining two new mechanisms for how R-loops can generate DNA breaks (Su and Freudenreich, Proc Natl Acad Sci USA 114, E8392-E8401, 2017). Following R-loop-dependent nicking, base excision repair resulted in repeat instability. These results have implications for human repeat expansion diseases and provide a paradigm for how RNA:DNA hybrids can cause genome instability at structure-forming DNA sequences. This perspective summarizes mechanisms of R-loop-induced fragility at G-rich repeats and new links between DNA breaks and repeat instability.
QueTAL: a suite of tools to classify and compare TAL effectors functionally and phylogenetically

PubMed Central

Pérez-Quintero, Alvaro L.; Lamy, Léo; Gordon, Jonathan L.; Escalon, Aline; Cunnac, Sébastien; Szurek, Boris; Gagnevin, Lionel

2015-01-01

Transcription Activator-Like (TAL) effectors from Xanthomonas plant pathogenic bacteria can bind to the promoter region of plant genes and induce their expression. DNA-binding specificity is governed by a central domain made of nearly identical repeats, each determining the recognition of one base pair via two amino acid residues (a.k.a. Repeat Variable Di-residue, or RVD). Knowing how TAL effectors differ from each other within and between strains would be useful to infer functional and evolutionary relationships, but their repetitive nature precludes reliable use of traditional alignment methods. The suite QueTAL was therefore developed to offer tailored tools for comparison of TAL effector genes. The program DisTAL considers each repeat as a unit, transforms a TAL effector sequence into a sequence of coded repeats and makes pair-wise alignments between these coded sequences to construct trees. The program FuncTAL is aimed at finding TAL effectors with similar DNA-binding capabilities. It calculates correlations between position weight matrices of potential target DNA sequence predicted from the RVD sequence, and builds trees based on these correlations. The programs accurately represented phylogenetic and functional relationships between TAL effectors using either simulated or literature-curated data. When using the programs on a large set of TAL effector sequences, the DisTAL tree largely reflected the expected species phylogeny. In contrast, FuncTAL showed that TAL effectors with similar binding capabilities can be found between phylogenetically distant taxa. This suite will help users to rapidly analyse any TAL effector genes of interest and compare them to other available TAL genes and should improve our understanding of TAL effectors evolution. It is available at http://bioinfo-web.mpl.ird.fr/cgi-bin2/quetal/quetal.cgi. PMID:26284082
In-silico mining, type and frequency analysis of genic microsatellites of finger millet (Eleusine coracana (L.) Gaertn.): a comparative genomic analysis of NBS-LRR regions of finger millet with rice.

PubMed

Kalyana Babu, B; Pandey, Dinesh; Agrawal, P K; Sood, Salej; Kumar, Anil

2014-05-01

In recent years, the increased availability of DNA sequences has made it possible to develop and explore SSR markers derived from expressed sequence tags (ESTs). In the present study, a total of 1956 ESTs of finger millet were used to determine microsatellite type, distribution and frequency, and a total of 545 primer pairs were developed from these ESTs. Thirty-two EST sequences had more than two microsatellites and 1357 sequences did not have any SSRs. The most frequent repeat type was the trimeric motif, followed by the dimeric motif and then tetra-, hexa- and penta-repeat motifs. The most common dimer repeat motif was GA and, in the case of trimeric SSRs, it was CGG. The EST sequences of the NBS-LRR region of finger millet and rice showed higher synteny and were found at nearly the same positions on the rice chromosome map. A total of eight out of 15 EST-based SSR primers were polymorphic among the selected resistant and susceptible finger millet genotypes. The primer FMBLEST5 was able to differentiate them into resistant and susceptible genotypes. The alleles specific to the resistant and susceptible genotypes were sequenced using the ABI 3130XL genetic analyzer; they showed similarity to NBS-LRR regions of rice and finger millet and contained the characteristic kinase-2 and kinase 3a motifs of plant R-genes belonging to the NBS-LRR region. The in-silico and comparative analysis showed that the genes responsible for blast resistance can be identified, mapped and further introgressed through molecular breeding approaches for enhancing blast resistance in finger millet.
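In-silico SSR mining of the kind described above is commonly done by pattern matching on the sequence. The following is a minimal sketch using regular expressions; it is not the pipeline used in the study, and the function name `find_ssrs` and the minimum repeat counts per motif length are assumptions chosen for illustration.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for perfect di- to hexanucleotide SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    found = []
    for unit_len, min_n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_n - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            # Skip motifs that are just a shorter motif repeated (e.g. GAGA, AAA).
            if any(unit == unit[:d] * (unit_len // d)
                   for d in range(1, unit_len) if unit_len % d == 0):
                continue
            found.append((m.start(), unit, len(m.group(0)) // unit_len))
    return found


# A GA dimer repeat and a CGG trimer repeat, the two most common motifs
# named in the abstract, embedded in arbitrary flanking sequence.
est = "TTT" + "GA" * 7 + "CCATT" + "CGG" * 5 + "AT"
print(find_ssrs(est))
```

Different thresholds change which repeats are counted, so motif frequencies from different studies are only comparable when the search criteria match.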
Repeated-Sprint Sequences During Female Soccer Matches Using Fixed and Individual Speed Thresholds.

PubMed

Nakamura, Fábio Y; Pereira, Lucas A; Loturco, Irineu; Rosseti, Marcelo; Moura, Felipe A; Bradley, Paul S

2017-07-01

Nakamura, FY, Pereira, LA, Loturco, I, Rosseti, M, Moura, FA, and Bradley, PS. Repeated-sprint sequences during female soccer matches using fixed and individual speed thresholds. J Strength Cond Res 31(7): 1802-1810, 2017. The main objective of this study was to characterize the occurrence of single sprint and repeated-sprint sequences (RSS) during elite female soccer matches, using fixed (20 km·h⁻¹) and individually based speed thresholds (>90% of the mean speed from a 20-m sprint test). Eleven elite female soccer players from the same team participated in the study. All players performed a 20-m linear sprint test, and were assessed in up to 10 official matches using Global Positioning System technology. Magnitude-based inferences were used to test for meaningful differences. Results revealed that irrespective of adopting fixed or individual speed thresholds, female players produced only a few RSS during matches (2.3 ± 2.4 sequences using the fixed threshold and 3.3 ± 3.0 sequences using the individually based threshold), with most sequences comprising just 2 sprints. Additionally, central defenders performed fewer sprints (10.2 ± 4.1) than other positions (fullbacks: 28.1 ± 5.5; midfielders: 21.9 ± 10.5; forwards: 31.9 ± 11.1; with the differences being likely to almost certainly associated with effect sizes ranging from 1.65 to 2.72), and sprinting ability declined in the second half. The data do not support the notion that RSS occurs frequently during soccer matches in female players, irrespective of using fixed or individual speed thresholds to define sprint occurrence. However, repeated-sprint ability development cannot be ruled out from soccer training programs because of its association with match-related performance.
Rapid evolutionary change of common bean (Phaseolus vulgaris L) plastome, and the genomic diversification of legume chloroplasts

PubMed Central

Guo, Xianwu; Castillo-Ramírez, Santiago; González, Víctor; Bustos, Patricia; Luís Fernández-Vázquez, José; Santamaría, Rosa Isela; Arellano, Jesús; Cevallos, Miguel A; Dávila, Guillermo

2007-01-01

Background: Fabaceae (legumes) is one of the largest families of flowering plants, and some members are important crops. In contrast to what we know about their great diversity or economic importance, our knowledge at the genomic level of chloroplast genomes (cpDNAs or plastomes) for these crops is limited. Results: We sequenced the complete genome of the common bean (Phaseolus vulgaris cv. Negro Jamapa) chloroplast. The plastome of P. vulgaris is a 150,285 bp circular molecule. It has gene content similar to that of other legume plastomes, but contains two pseudogenes, rpl33 and rps16. A distinct inversion occurred at the junction points of trnH-GUG/rpl14 and rps19/rps8, as in adzuki bean [1]. These two pseudogenes and the inversion were confirmed in 10 varieties representing the two domestication centers of the bean. Genomic comparative analysis indicated that inversions generally occur in legume plastomes and the magnitude and localization of insertions/deletions (indels) also vary. The analysis of repeat sequences demonstrated that patterns and sequences of tandem repeats had an important impact on sequence diversification between legume plastomes and tandem repeats did not belong to dispersed repeats. Interestingly, P. vulgaris plastome had higher evolutionary rates of change on both genomic and gene levels than G. max, which could be the consequence of pressure from both mutation and natural selection. Conclusion: Legume chloroplast genomes are widely diversified in gene content, gene order, indel structure, abundance and localization of repetitive sequences, intracellular sequence exchange and evolutionary rates. The P. vulgaris plastome is a rapidly evolving genome. PMID:17623083
[Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].

PubMed

Xia, Kai; Liang, Xin-le; Li, Yu-dong

2015-12-01

The clustered regularly interspaced short palindromic repeat (CRISPR) is a widespread adaptive immunity system that exists in most archaea and many bacteria against foreign DNA, such as phages, viruses and plasmids. In general, the CRISPR system consists of direct repeat, leader, spacer and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 species of the 48 strains studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPRs were highly conserved among species from different genera, and the leader sequences of some CRISPRs possessed a conserved motif, which was associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that these genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating that the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria.
Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

Treesearch

Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

2011-01-01

Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.

PubMed

Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao

2017-01-01

The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
Distribution and Characteristics of Repeating Earthquakes in Northern California

NASA Astrophysics Data System (ADS)

Waldhauser, F.; Schaff, D. P.; Zechar, J. D.; Shaw, B. E.

2012-12-01

Repeating earthquakes are playing an increasingly important role in the study of fault processes and behavior, and have the potential to improve hazard assessment, earthquake forecast, and seismic monitoring capabilities. These events rupture the same fault patch repeatedly, generating virtually identical seismograms. In California, repeating earthquakes have been found predominately along the creeping section of the central San Andreas Fault, where they are believed to represent failing asperities on an otherwise creeping fault. Here, we use the northern California double-difference catalog of 450,000 precisely located events (1984-2009) and associated database of 2 billion waveform cross-correlation measurements to systematically search for repeating earthquakes across various tectonic regions. An initial search for pairs of earthquakes with high-correlation coefficients and similar magnitudes resulted in 4,610 clusters including a total of over 26,000 earthquakes. A subsequent double-difference re-analysis of these clusters resulted in 1,879 sequences (8,640 events) where a common rupture area can be resolved to the precision of a few tens of meters or less. These repeating earthquake sequences (RES) include between 3 and 24 events with magnitudes up to ML=4. We compute precise relative magnitudes between events in each sequence from differential amplitude measurements. Differences between these and standard coda-duration magnitudes have a standard deviation of 0.09. The RES occur throughout northern California, but RES with 10 or more events (6%) only occur along the central San Andreas and Calaveras faults. We are establishing baseline characteristics for each sequence, such as recurrence intervals and their coefficient of variation (CV), in order to compare them across tectonic regions. CVs for these clusters range from 0.002 to 2.6, indicating a range of behavior between periodic occurrence (CV~0), random occurrence, and temporal clustering. 
10% of the RES show burst-like behavior with mean recurrence times smaller than one month. 5% of the RES have mean recurrence times greater than one year and include more than 10 earthquakes. Earthquakes in the 50 most periodic sequences (CV<0.2) do not appear to be predictable by either time- or slip-predictable models, consistent with previous findings. We demonstrate that changes in recurrence intervals of repeating earthquakes can be routinely monitored. This is especially important for sequences with CV~0, as they may indicate changes in the loading rate. We also present results from retrospective forecast experiments based on near-real time hazard functions.
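The coefficient of variation (CV) of recurrence intervals used above is the standard deviation of the intervals between successive events divided by their mean. The sketch below shows one way to compute it; the function name `recurrence_cv` is hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def recurrence_cv(event_times):
    """Coefficient of variation of the intervals between successive events.

    `event_times` are occurrence times in any consistent unit (e.g. years).
    Uses the sample standard deviation (an assumption of this sketch).
    """
    times = sorted(event_times)
    if len(times) < 3:
        raise ValueError("need at least 3 events (2 intervals)")
    intervals = [b - a for a, b in zip(times, times[1:])]
    return stdev(intervals) / mean(intervals)


# A perfectly periodic sequence gives CV = 0.
print(recurrence_cv([0, 2, 4, 6, 8]))
# Alternating short and long intervals give a larger CV.
print(recurrence_cv([0, 1, 4, 5, 8]))
```

For reference, a Poisson (random) process has a CV near 1, so values well below 1 indicate quasi-periodic recurrence and values above 1 indicate temporal clustering, matching the range of behavior described in the abstract.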
Pms2 Suppresses Large Expansions of the (GAA·TTC)n Sequence in Neuronal Tissues

PubMed Central

Bourn, Rebecka L.; De Biase, Irene; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Al-Mahdawi, Sahar; Pook, Mark A.; Bidichandani, Sanjay I.

2012-01-01

Expanded trinucleotide repeat sequences are the cause of several inherited neurodegenerative diseases. Disease pathogenesis is correlated with several features of somatic instability of these sequences, including further large expansions in postmitotic tissues. The presence of somatic expansions in postmitotic tissues is consistent with DNA repair being a major determinant of somatic instability. Indeed, proteins in the mismatch repair (MMR) pathway are required for instability of the expanded (CAG·CTG)n sequence, likely via recognition of intrastrand hairpins by MutSβ. It is not clear if or how MMR would affect instability of disease-causing expanded trinucleotide repeat sequences that adopt secondary structures other than hairpins, such as the triplex/R-loop forming (GAA·TTC)n sequence that causes Friedreich ataxia. We analyzed somatic instability in transgenic mice that carry an expanded (GAA·TTC)n sequence in the context of the human FXN locus and lack the individual MMR proteins Msh2, Msh6 or Pms2. The absence of Msh2 or Msh6 resulted in a dramatic reduction in somatic mutations, indicating that mammalian MMR promotes instability of the (GAA·TTC)n sequence via MutSα. The absence of Pms2 resulted in increased accumulation of large expansions in the nervous system (cerebellum, cerebrum, and dorsal root ganglia) but not in non-neuronal tissues (heart and kidney), without affecting the prevalence of contractions. Pms2 suppressed large expansions specifically in tissues showing MutSα-dependent somatic instability, suggesting that they may act on the same lesion or structure associated with the expanded (GAA·TTC)n sequence. We conclude that Pms2 specifically suppresses large expansions of a pathogenic trinucleotide repeat sequence in neuronal tissues, possibly acting independently of the canonical MMR pathway. PMID:23071719
Pms2 suppresses large expansions of the (GAA·TTC)n sequence in neuronal tissues.

PubMed

Bourn, Rebecka L; De Biase, Irene; Pinto, Ricardo Mouro; Sandi, Chiranjeevi; Al-Mahdawi, Sahar; Pook, Mark A; Bidichandani, Sanjay I

2012-01-01

Expanded trinucleotide repeat sequences are the cause of several inherited neurodegenerative diseases. Disease pathogenesis is correlated with several features of somatic instability of these sequences, including further large expansions in postmitotic tissues. The presence of somatic expansions in postmitotic tissues is consistent with DNA repair being a major determinant of somatic instability. Indeed, proteins in the mismatch repair (MMR) pathway are required for instability of the expanded (CAG·CTG)(n) sequence, likely via recognition of intrastrand hairpins by MutSβ. It is not clear if or how MMR would affect instability of disease-causing expanded trinucleotide repeat sequences that adopt secondary structures other than hairpins, such as the triplex/R-loop forming (GAA·TTC)(n) sequence that causes Friedreich ataxia. We analyzed somatic instability in transgenic mice that carry an expanded (GAA·TTC)(n) sequence in the context of the human FXN locus and lack the individual MMR proteins Msh2, Msh6 or Pms2. The absence of Msh2 or Msh6 resulted in a dramatic reduction in somatic mutations, indicating that mammalian MMR promotes instability of the (GAA·TTC)(n) sequence via MutSα. The absence of Pms2 resulted in increased accumulation of large expansions in the nervous system (cerebellum, cerebrum, and dorsal root ganglia) but not in non-neuronal tissues (heart and kidney), without affecting the prevalence of contractions. Pms2 suppressed large expansions specifically in tissues showing MutSα-dependent somatic instability, suggesting that they may act on the same lesion or structure associated with the expanded (GAA·TTC)(n) sequence. We conclude that Pms2 specifically suppresses large expansions of a pathogenic trinucleotide repeat sequence in neuronal tissues, possibly acting independently of the canonical MMR pathway.
Characterization of genic microsatellite markers derived from expressed sequence tags in Pacific abalone (Haliotis discus hannai)

NASA Astrophysics Data System (ADS)

Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong

2010-01-01

Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone (Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2 to 17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transportability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.
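Observed and expected heterozygosity at a single locus, as reported above, can be computed from diploid genotypes. The sketch below uses the simple uncorrected estimator, expected heterozygosity = 1 minus the sum of squared allele frequencies. Whether the study applied a sample-size correction is not stated, so this is an assumption, and the function name and allele sizes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    `genotypes` is a list of (allele_1, allele_2) pairs, one per individual.
    Expected heterozygosity is 1 - sum(p_i ** 2) over allele frequencies p_i
    (uncorrected estimator; an assumption of this sketch).
    """
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1 - sum((c / total_alleles) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical allele sizes (bp) at one EST-SSR locus in four individuals.
sample = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(heterozygosity(sample))
```

A marked deficit of observed relative to expected heterozygosity at a locus is one common indication of null alleles, which the abstract reports for three of the loci.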
Human Y chromosome copy number variation in the next generation sequencing era and beyond.

PubMed

Massaia, Andrea; Xue, Yali

2017-05-01

The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.
ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function

PubMed Central

Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira

2015-01-01

Repetitive element sequences are adjacent, repeating patterns, also called motifs, and can be of different lengths; repetitions can involve their exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not provide options for identifying repetitive elements in the genome or proteome, displaying a user-friendly web interface, and performing exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate, and above all user-friendly web tool that offers many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, from which the users can download the source code, and as a web tool. It was developed using the hash table approach to extract perfect and imperfect repetitive regions in a (multi)FASTA file, while allowing a linear time complexity. PMID:25811026
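To make the idea of a perfect tandem repeat concrete, the following Python sketch scans a sequence for back-to-back copies of a motif of fixed length. It is a naive scan written for illustration, not ProGeRF's hash-table algorithm, and it does not handle imperfect repeats; the function name, parameters, and example sequence are assumptions.

```python
def find_tandem_repeats(seq, motif_len, min_copies=3):
    """Find perfect tandem repeats with a fixed motif length.

    Returns a list of (start, motif, copies) tuples. Note that a
    homopolymer run such as 'AAAAAA' is also reported when motif_len > 1.
    """
    results = []
    i = 0
    n = len(seq)
    while i + motif_len <= n:
        motif = seq[i:i + motif_len]
        copies = 1
        j = i + motif_len
        # Extend the run while the next window equals the motif.
        while seq[j:j + motif_len] == motif:
            copies += 1
            j += motif_len
        if copies >= min_copies:
            results.append((i, motif, copies))
            i = j          # skip past the reported repeat
        else:
            i += 1
    return results

# Hypothetical sequence containing four copies of the dinucleotide AC.
hits = find_tandem_repeats("TTACACACACGG", motif_len=2)
```

The same function works on protein sequences, since it only compares substrings.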
Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

PubMed

Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

2016-09-01

Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strain FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion of ITR may contribute to the genome evolution of MDPV under immunization pressure.

PolyQ repeat expansions in ATXN2 associated with ALS are CAA interrupted repeats.

PubMed

Yu, Zhenming; Zhu, Yongqing; Chen-Plotkin, Alice S; Clay-Falcone, Dana; McCluskey, Leo; Elman, Lauren; Kalb, Robert G; Trojanowski, John Q; Lee, Virginia M-Y; Van Deerlin, Vivianna M; Gitler, Aaron D; Bonini, Nancy M

2011-03-29

Amyotrophic lateral sclerosis (ALS) is a devastating, rapidly progressive disease leading to paralysis and death. Recently, intermediate length polyglutamine (polyQ) repeats of 27-33 in ATAXIN-2 (ATXN2), encoding the ATXN2 protein, were found to increase risk for ALS. In ATXN2, polyQ expansions of ≥ 34, which are pure CAG repeat expansions, cause spinocerebellar ataxia type 2. However, similar length expansions that are interrupted with other codons, can present atypically with parkinsonism, suggesting that configuration of the repeat sequence plays an important role in disease manifestation in ATXN2 polyQ expansion diseases. Here we determined whether the expansions in ATXN2 associated with ALS were pure or interrupted CAG repeats, and defined single nucleotide polymorphisms (SNPs) rs695871 and rs695872 in exon 1 of the gene, to assess haplotype association. We found that the expanded repeat alleles of 40 ALS patients and 9 long-repeat length controls were all interrupted, bearing 1-3 CAA codons within the CAG repeat. 21/21 expanded ALS chromosomes with 3CAA interruptions arose from one haplotype (GT), while 18/19 expanded ALS chromosomes with <3CAA interruptions arose from a different haplotype (CC). Moreover, age of disease onset was significantly earlier in patients bearing 3 interruptions vs fewer, and was distinct between haplotypes. These results indicate that CAG repeat expansions in ATXN2 associated with ALS are uniformly interrupted repeats and that the nature of the repeat sequence and haplotype, as well as length of polyQ repeat, may play a role in the neurological effect conferred by expansions in ATXN2.
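The distinction between pure and CAA-interrupted CAG repeats can be shown with a short Python sketch. It assumes the repeat tract has already been extracted in frame from the read; the function name and the example allele are hypothetical and are not taken from the study. Both CAG and CAA encode glutamine, so an interrupted tract still yields an uninterrupted polyQ stretch of the same length.

```python
def count_caa_interruptions(repeat_tract):
    """Return (polyq_length, caa_count) for an in-frame CAG/CAA tract."""
    if len(repeat_tract) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    # Split the tract into codons.
    codons = [repeat_tract[i:i + 3] for i in range(0, len(repeat_tract), 3)]
    unexpected = [c for c in codons if c not in ("CAG", "CAA")]
    if unexpected:
        raise ValueError(f"non-glutamine codons in tract: {unexpected}")
    return len(codons), codons.count("CAA")

# Hypothetical allele: 22 glutamine codons, two of them CAA.
allele = "CAG" * 8 + "CAA" + "CAG" * 4 + "CAA" + "CAG" * 8
polyq_length, caa_count = count_caa_interruptions(allele)
is_pure = caa_count == 0
```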
Microsatellites for Lindera species

Treesearch

Craig S. Echt; D. Deemer; T.L. Kubisiak; C.D. Nelson

2006-01-01

Microsatellite markers were developed for conservation genetic studies of Lindera melissifolia (pondberry), a federally endangered shrub of southern bottomland ecosystems. Microsatellite sequences were obtained from DNA libraries that were enriched for the (AC)n simple sequence repeat motif. From 35 clone sequences, 20 primer...
A divergent Pumilio repeat protein family for pre-rRNA processing and mRNA localization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Qiu, Chen; McCann, Kathleen L.; Wine, Robert N.

Pumilio/feminization of XX and XO animals (fem)-3 mRNA-binding factor (PUF) proteins bind sequence specifically to mRNA targets using a single-stranded RNA-binding domain comprising eight Pumilio (PUM) repeats. PUM repeats have now been identified in proteins that function in pre-rRNA processing, including human Puf-A and yeast Puf6. This is a role not previously ascribed to PUF proteins. In this paper we present crystal structures of human Puf-A that reveal a class of nucleic acid-binding proteins with 11 PUM repeats arranged in an “L”-like shape. In contrast to classical PUF proteins, Puf-A forms sequence-independent interactions with DNA or RNA, mediated by conserved basic residues. We demonstrate that equivalent basic residues in yeast Puf6 are important for RNA binding, pre-rRNA processing, and mRNA localization. Finally, PUM repeats can be assembled into alternative folds that bind to structured nucleic acids in addition to forming canonical eight-repeat crescent-shaped RNA-binding domains found in classical PUF proteins.
A divergent Pumilio repeat protein family for pre-rRNA processing and mRNA localization

DOE PAGES

Qiu, Chen; McCann, Kathleen L.; Wine, Robert N.; ...

2014-12-15

Pumilio/feminization of XX and XO animals (fem)-3 mRNA-binding factor (PUF) proteins bind sequence specifically to mRNA targets using a single-stranded RNA-binding domain comprising eight Pumilio (PUM) repeats. PUM repeats have now been identified in proteins that function in pre-rRNA processing, including human Puf-A and yeast Puf6. This is a role not previously ascribed to PUF proteins. In this paper we present crystal structures of human Puf-A that reveal a class of nucleic acid-binding proteins with 11 PUM repeats arranged in an “L”-like shape. In contrast to classical PUF proteins, Puf-A forms sequence-independent interactions with DNA or RNA, mediated by conserved basic residues. We demonstrate that equivalent basic residues in yeast Puf6 are important for RNA binding, pre-rRNA processing, and mRNA localization. Finally, PUM repeats can be assembled into alternative folds that bind to structured nucleic acids in addition to forming canonical eight-repeat crescent-shaped RNA-binding domains found in classical PUF proteins.
Exploring the repeat protein universe through computational protein design

DOE PAGES

Brunette, TJ; Parmeggiani, Fabio; Huang, Po-Ssu; ...

2015-12-16

A central question in protein evolution is the extent to which naturally occurring proteins sample the space of folded structures accessible to the polypeptide chain. Repeat proteins composed of multiple tandem copies of a modular structure unit are widespread in nature and have critical roles in molecular recognition, signalling, and other essential biological processes. Naturally occurring repeat proteins have been re-engineered for molecular recognition and modular scaffolding applications. In this paper, we use computational protein design to investigate the space of folded structures that can be generated by tandem repeating a simple helix–loop–helix–loop structural motif. Eighty-three designs with sequences unrelated to known repeat proteins were experimentally characterized. Of these, 53 are monomeric and stable at 95 °C, and 43 have solution X-ray scattering spectra consistent with the design models. Crystal structures of 15 designs spanning a broad range of curvatures are in close agreement with the design models with root mean square deviations ranging from 0.7 to 2.5 Å. Finally, our results show that existing repeat proteins occupy only a small fraction of the possible repeat protein sequence and structure space and that it is possible to design novel repeat proteins with precisely specified geometries, opening up a wide array of new possibilities for biomolecular engineering.
Evolution of short inverted repeat in cupressophytes, transfer of accD to nucleus in Sciadopitys verticillata and phylogenetic position of Sciadopityaceae.

PubMed

Li, Jia; Gao, Lei; Chen, Shanshan; Tao, Ke; Su, Yingjuan; Wang, Ting

2016-02-11

Sciadopitys verticillata is an evergreen conifer and an economically valuable tree used in construction, and it is the only member of the family Sciadopityaceae. Acquisition of the S. verticillata chloroplast (cp) genome will be useful for understanding the evolutionary mechanisms of conifers and phylogenetic relationships among gymnosperms. In this study, we report for the first time the complete chloroplast genome of S. verticillata. The total genome is 138,284 bp in length, consisting of 118 unique genes. The S. verticillata cp genome has lost one copy of the canonical inverted repeats and shows a distinctive genomic structure compared with other cupressophytes. Fifty-three simple sequence repeat loci and 18 forward tandem repeats were identified in the S. verticillata cp genome. Based on the rearrangement of cupressophyte cp genomes, we propose one mechanism for the formation of inverted repeats: a tandem repeat occurred first, then rearrangement divided the tandem repeat into inverted repeats located at different regions. Phylogenetic estimates inferred from 59-gene sequences and cpDNA organizations have both shown that S. verticillata was sister to the clade consisting of Cupressaceae, Taxaceae, and Cephalotaxaceae. Moreover, the accD gene was found to be lost in the S. verticillata cp genome, and a nuclear copy was identified from two transcriptome datasets.
TALE-Like Effectors Are an Ancestral Feature of the Ralstonia solanacearum Species Complex and Converge in DNA Targeting Specificity.

PubMed

Schandry, Niklas; de Lange, Orlando; Prior, Philippe; Lahaye, Thomas

2016-01-01

Ralstonia solanacearum, a species complex of bacterial plant pathogens divided into four monophyletic phylotypes, causes plant diseases in tropical climates around the world. Some strains exhibit a broad host range on solanaceous hosts, while others are highly host-specific as for example some banana-pathogenic strains. Previous studies showed that transcription activator-like (TAL) effectors from Ralstonia, termed RipTALs, are capable of activating reporter genes in planta, if these are preceded by a matching effector binding element (EBE). RipTALs target DNA via their central repeat domain (CRD), where one repeat pairs with one DNA-base of the given EBE. The repeat variable diresidue dictates base repeat specificity in a predictable fashion, known as the TALE code. In this work, we analyze RipTALs across all phylotypes of the Ralstonia solanacearum species complex. We find that RipTALs are prevalent in phylotypes I and IV but absent from most phylotype III and II strains (10/12, 8/14, 1/24, and 1/5 strains contained a RipTAL, respectively). RipTALs originating from strains of the same phylotype show high levels of sequence similarity (>98%) in the N-terminal and C-terminal regions, while RipTALs isolated from different phylotypes show 47-91% sequence similarity in those regions, giving rise to four RipTAL classes. We show that, despite sequence divergence, the base preference for guanine, mediated by the N-terminal region, is conserved across RipTALs of all classes. Using the number and order of repeats found in the CRD, we functionally sub-classify RipTALs, introduce a new simple nomenclature, and predict matching EBEs for all seven distinct RipTALs identified. We experimentally study RipTAL EBEs and uncover that some RipTALs are able to target the EBEs of other RipTALs, referred to as cross-reactivity. In particular, RipTALs from strains with a broad host range on solanaceous hosts cross-react on each other's EBEs. 
Investigation of sequence divergence between RipTAL repeats allows for a reconstruction of repeat array biogenesis, for example through slipped strand mispairing or gene conversion. Using these studies we show how RipTALs of broad host range strains evolved convergently toward a shared target sequence. Finally, we discuss the differences between TALE-likes of plant pathogens in the context of disease ecology.
High-Frame-Rate Doppler Ultrasound Using a Repeated Transmit Sequence

PubMed Central

Podkowa, Anthony S.; Oelze, Michael L.; Ketterling, Jeffrey A.

2018-01-01

The maximum detectable velocity of high-frame-rate color flow Doppler ultrasound is limited by the imaging frame rate when using coherent compounding techniques. Traditionally, high quality ultrasonic images are produced at a high frame rate via coherent compounding of steered plane wave reconstructions. However, this compounding operation results in an effective downsampling of the slow-time signal, thereby artificially reducing the frame rate. To alleviate this effect, a new transmit sequence is introduced where each transmit angle is repeated in succession. This transmit sequence allows for direct comparison between low resolution, pre-compounded frames at a short time interval in ways that are resistant to sidelobe motion. Use of this transmit sequence increases the maximum detectable velocity by a scale factor of the transmit sequence length. The performance of this new transmit sequence was evaluated using a rotating cylindrical phantom and compared with traditional methods using a 15-MHz linear array transducer. Axial velocity estimates were recorded for a range of ±300 mm/s and compared to the known ground truth. Using these new techniques, the root mean square error was reduced from over 400 mm/s to below 50 mm/s in the high-velocity regime compared to traditional techniques. The standard deviation of the velocity estimate in the same velocity range was reduced from 250 mm/s to 30 mm/s. This result demonstrates the viability of the repeated transmit sequence methods in detecting and quantifying high-velocity flow. PMID:29910966
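The scaling described above can be sketched with the standard Doppler Nyquist limit, v_max = c * f_slow / (4 * f0), where f_slow is the slow-time sampling rate. The Python sketch below assumes that compounding N steered angles divides the slow-time rate by N and that the repeated sequence restores the full pulse repetition frequency. The 15 MHz centre frequency comes from the abstract; the sound speed, pulse repetition frequency, and number of angles are hypothetical, and "transmit sequence length" is read here as the number of steering angles.

```python
def nyquist_velocity(sound_speed, centre_freq, slow_time_rate):
    """Maximum unaliased axial velocity (m/s) for pulsed Doppler."""
    return sound_speed * slow_time_rate / (4.0 * centre_freq)

C = 1540.0      # m/s, typical soft-tissue sound speed (assumption)
F0 = 15e6       # Hz, transducer centre frequency (from the abstract)
PRF = 10e3      # Hz, pulse repetition frequency (hypothetical)
N_ANGLES = 5    # steering angles per compounded frame (hypothetical)

# Conventional compounding: one slow-time sample per N_ANGLES transmits.
v_compounded = nyquist_velocity(C, F0, PRF / N_ANGLES)
# Repeated transmit sequence: successive same-angle frames are compared.
v_repeated = nyquist_velocity(C, F0, PRF)
gain = v_repeated / v_compounded
```

Under these assumptions the gain equals the number of angles, which matches the scale factor stated in the abstract; the actual estimator in the paper may differ in detail.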
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae).

PubMed

Bonatelli, Isabel A S; Carstens, Bryan C; Moraes, Evandro M

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms.
Using Next Generation RAD Sequencing to Isolate Multispecies Microsatellites for Pilosocereus (Cactaceae)

PubMed Central

Bonatelli, Isabel A. S.; Carstens, Bryan C.; Moraes, Evandro M.

2015-01-01

Microsatellite markers (also known as SSRs, Simple Sequence Repeats) are widely used in plant science and are among the most informative molecular markers for population genetic investigations, but the development of such markers presents substantial challenges. In this report, we discuss how next generation sequencing can replace the cloning, Sanger sequencing, identification of polymorphic loci, and testing cross-amplification that were previously required to develop microsatellites. We report the development of a large set of microsatellite markers for five species of the Neotropical cactus genus Pilosocereus using a restriction-site-associated DNA sequencing (RAD-seq) on a Roche 454 platform. We identified an average of 165 microsatellites per individual, with the absolute numbers across individuals proportional to the sequence reads obtained per individual. Frequency distribution of the repeat units was similar in the five species, with shorter motifs such as di- and trinucleotide being the most abundant repeats. In addition, we provide 72 microsatellites that could be potentially amplified in the sampled species and 22 polymorphic microsatellites validated in two populations of the species Pilosocereus machrisii. Although low coverage sequencing among individuals was observed for most of the loci, which we suggest to be more related to the nature of the microsatellite markers and the possible bias inserted by the restriction enzymes than to the genome size, our work demonstrates that an NGS approach is an efficient method to isolate multispecies microsatellites even in non-model organisms. PMID:26561396
Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter.

PubMed

Amirhaeri, S; Wohlrab, F; Wells, R D

1995-02-17

The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Giardia telomeric sequence d(TAGGG)4 forms two intramolecular G-quadruplexes in K+ solution: effect of loop length and sequence on the folding topology.

PubMed

Hu, Lanying; Lim, Kah Wai; Bouaziz, Serge; Phan, Anh Tuân

2009-11-25

Recently, it has been shown that in K(+) solution the human telomeric sequence d[TAGGG(TTAGGG)(3)] forms a (3 + 1) intramolecular G-quadruplex, while the Bombyx mori telomeric sequence d[TAGG(TTAGG)(3)], which differs from the human counterpart only by one G deletion in each repeat, forms a chair-type intramolecular G-quadruplex, indicating an effect of G-tract length on the folding topology of G-quadruplexes. To explore the effect of loop length and sequence on the folding topology of G-quadruplexes, here we examine the structure of the four-repeat Giardia telomeric sequence d[TAGGG(TAGGG)(3)], which differs from the human counterpart only by one T deletion within the non-G linker in each repeat. We show by NMR that this sequence forms two different intramolecular G-quadruplexes in K(+) solution. The first one is a novel basket-type antiparallel-stranded G-quadruplex containing two G-tetrads, a G x (A-G) triad, and two A x T base pairs; the three loops are consecutively edgewise-diagonal-edgewise. The second one is a propeller-type parallel-stranded G-quadruplex involving three G-tetrads; the three loops are all double-chain-reversal. Recurrence of several structural elements in the observed structures suggests a "cut and paste" principle for the design and prediction of G-quadruplex topologies, for which different elements could be extracted from one G-quadruplex and inserted into another.
Stable CoT-1 repeat RNA is abundant and associated with euchromatic interphase chromosomes

PubMed Central

Hall, Lisa L.; Carone, Dawn M.; Gomez, Alvin; Kolpa, Heather J.; Byron, Meg; Mehta, Nitish; Fackelmayer, Frank O.; Lawrence, Jeanne B.

2014-01-01

SUMMARY Recent studies recognize a vast diversity of non-coding RNAs with largely unknown functions, but few have examined interspersed repeat sequences, which constitute almost half our genome. RNA hybridization in situ using CoT-1 (highly repeated) DNA probes detects surprisingly abundant euchromatin-associated RNA comprised predominantly of repeat sequences (“CoT-1 RNA”), including LINE-1. CoT-1-hybridizing RNA strictly localizes to the interphase chromosome territory in cis, and remains stably associated with the chromosome territory following prolonged transcriptional inhibition. The CoT-1 RNA territory resists mechanical disruption and fractionates with the non-chromatin scaffold, but can be experimentally released. Loss of repeat-rich, stable nuclear RNAs from euchromatin corresponds to aberrant chromatin distribution and condensation. CoT-1 RNA has several properties similar to XIST chromosomal RNA, but is excluded from chromatin condensed by XIST. These findings impact two “black boxes” of genome science: the poorly understood diversity of non-coding RNA and the unexplained abundance of repetitive elements. PMID:24581492
Neural Mechanisms Underlying Visual Short-Term Memory Gain for Temporally Distinct Objects.

PubMed

Ihssen, Niklas; Linden, David E J; Miller, Claire E; Shapiro, Kimron L

2015-08-01

Recent research has shown that visual short-term memory (VSTM) can substantially be improved when the to-be-remembered objects are split in 2 half-arrays (i.e., sequenced) or the entire array is shown twice (i.e., repeated), rather than presented simultaneously. Here we investigate the hypothesis that sequencing and repeating displays overcomes attentional "bottlenecks" during simultaneous encoding. Using functional magnetic resonance imaging, we show that sequencing and repeating displays increased brain activation in extrastriate and primary visual areas, relative to simultaneous displays (Study 1). Passively viewing identical stimuli did not increase visual activation (Study 2), ruling out a physical confound. Importantly, areas of the frontoparietal attention network showed increased activation in repetition but not in sequential trials. This dissociation suggests that repeating a display increases attentional control by allowing attention to be reallocated in a second encoding episode. In contrast, sequencing the array poses fewer demands on control, with competition from nonattended objects being reduced by the half-arrays. This idea was corroborated by a third study in which we found optimal VSTM for sequential displays minimizing attentional demands. Importantly these results provide support within the same experimental paradigm for the role of stimulus-driven and top-down attentional control aspects of biased competition theory in setting constraints on VSTM. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

PubMed

Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

2015-01-01

Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 %, and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic, with polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicating the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioica, Momordica cochinchinensis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
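Polymorphic information content (PIC) values such as those reported above are computed from allele frequencies at each locus. The Python sketch below uses the formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2. The abstract does not state which formula the authors applied (some studies use the simpler 1 - sum(p_i^2)), so this choice and the example frequencies are assumptions.

```python
from itertools import combinations

def polymorphic_information_content(freqs):
    """PIC for one locus from its allele frequencies (must sum to 1)."""
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term

# Hypothetical locus with two equally frequent alleles.
pic_two_alleles = polymorphic_information_content([0.5, 0.5])
```

For a biallelic locus the value peaks at 0.375 when both alleles are equally frequent, so PIC values above that, such as the upper bound of 0.77 reported here, require more than two alleles.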
Cas6 is an endoribonuclease that generates guide RNAs for invader defense in prokaryotes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carte, Jason; Wang, Ruiying; Li, Hong

An RNA-based gene silencing pathway that protects bacteria and archaea from viruses and other genome invaders is hypothesized to arise from guide RNAs encoded by CRISPR loci and proteins encoded by the cas genes. CRISPR loci contain multiple short invader-derived sequences separated by short repeats. The presence of virus-specific sequences within CRISPR loci of prokaryotic genomes confers resistance against corresponding viruses. The CRISPR loci are transcribed as long RNAs that must be processed to smaller guide RNAs. Here we identified Pyrococcus furiosus Cas6 as a novel endoribonuclease that cleaves CRISPR RNAs within the repeat sequences to release individual invader targeting RNAs. Cas6 interacts with a specific sequence motif in the 5' region of the CRISPR repeat element and cleaves at a defined site within the 3' region of the repeat. The 1.8 angstrom crystal structure of the enzyme reveals two ferredoxin-like folds that are also found in other RNA-binding proteins. The predicted active site of the enzyme is similar to that of tRNA splicing endonucleases, and concordantly, Cas6 activity is metal-independent. cas6 is one of the most widely distributed CRISPR-associated genes. Our findings indicate that Cas6 functions in the generation of CRISPR-derived guide RNAs in numerous bacteria and archaea.
The B chromosomes in Brachycome.

PubMed

Leach, C R; Houben, A; Timmis, J N

2004-01-01

This review presents a historical account of studies of B chromosomes in the genus Brachycome Cass. (synonym: Brachyscome) from the earliest cytological investigations carried out in the late 1960s through to the most recent molecular analyses. Molecular analyses provide insights into the origin and evolution of the B chromosomes (Bs) of Brachycome dichromosomatica, a species which has Bs of two different sizes. The larger Bs are somatically stable whereas the smaller, or micro, Bs are somatically unstable. Both B types contain clusters of ribosomal RNA genes that have been shown unequivocally to be inactive in the case of the larger Bs. The large Bs carry a family of tandem repeat sequences (Bd49) that are located mainly at the centromere. Multiple copies of sequences related to this repeat are present on the A chromosomes (As) of related species, whereas only a few copies exist in the A chromosomes of B. dichromosomatica. The micro Bs share DNA sequences with the As and the larger Bs, and they also have B-specific repeats (Bdm29 and Bdm54). In some cases repeat sequences on the micro Bs have been shown to occur as clusters on the A chromosomes in a proportion of individuals within a population. It is clear that none of these B types originated by simple excision of segments from the A chromosomes. Copyright 2004 S. Karger AG, Basel
All gene-sized DNA molecules in four species of hypotrichs have the same terminal sequence and an unusual 3' terminus.

PubMed Central

Klobutcher, L A; Swanton, M T; Donini, P; Prescott, D M

1981-01-01

In hypotrichous ciliates, all of the macronuclear DNA is in the form of low molecular weight molecules with an average size of approximately 2200 base pairs. Total macronuclear DNA from four hypotrichs has been shown to have inverted terminal repeats by direct sequence analysis. In Oxytricha nova, Oxytricha sp., and Stylonychia pustulata, this terminal sequence may be written as 5'-C4A4C4A4C4 ... / 3'-G4T4G4T4G4T4G4T4G4 ... In Euplotes aediculatus, the sequence is similar but differs in the lengths of the duplex region (28 base pairs) and of the putative 3' extension (14 base pairs). Also in Euplotes, a second common sequence of 5 base pairs (T-T-G-A-A paired with A-A-C-T-T) occurs internal to the terminal repeat and a 17-base-pair heterogeneous region: 5'-C4A4C4A4C4A4C4(X)17T-T-G-A-A ... / 3'-G2T4G4T4G4T4G4T4G4T4G4(X)17A-A-C-T-T ... The length of the terminal repeat sequence for O. nova was confirmed in cloned macronuclear DNA molecules. PMID:6265931
Interstitial telomeric repeats are not preferentially involved in radiation-induced chromosome aberrations in human cells.

PubMed

Desmaze, C; Pirzio, L M; Blaise, R; Mondello, C; Giulotto, E; Murnane, J P; Sabatier, L

2004-01-01

Telomeric repeat sequences, located at the end of eukaryotic chromosomes, have been detected at intrachromosomal locations in many species. Large blocks of telomeric sequences are located near the centromeres in hamster cells, and have been reported to break spontaneously or after exposure to ionizing radiation, leading to chromosome aberrations. In human cells, interstitial telomeric sequences (ITS) can be composed of short tracts of telomeric repeats (less than twenty), or of longer stretches of exact and degenerated hexanucleotides, mainly localized at subtelomeres. In this paper, we analyzed the radiation sensitivity of a naturally occurring short ITS localized in 2q31 and we found that this region is not a hot spot of radiation-induced chromosome breaks. We then selected a human cell line in which approximately 800 bp of telomeric DNA had been introduced by transfection into an internal euchromatic chromosomal region in chromosome 4q. In parallel, a cell line containing the plasmid without telomeric sequences was also analyzed. Both regions containing the transfected plasmids showed a higher frequency of radiation-induced breaks than expected, indicating that the instability of the regions containing the transfected sequences is not due to the presence of telomeric sequences. Taken together, our data show that ITS themselves do not enhance the formation of radiation-induced chromosome rearrangements in these human cell lines. Copyright 2003 S. Karger AG, Basel
Integrated and Independent Learning of Hand-Related Constituent Sequences

ERIC Educational Resources Information Center

Berner, Michael P.; Hoffmann, Joachim

2009-01-01

In almost all daily activities fingers of both hands are used in coordinated succession. The present experiments explored whether learning in such tasks pertains not only to the overall sequence spanning both hands but also to the constituent sequences of each hand. In a serial reaction time task, 2 repeating hand-related sequences were…

Detection of possible restriction sites for type II restriction enzymes in DNA sequences.

PubMed

Gagniuc, P; Cimponeriu, D; Ionescu-Tîrgovişte, C; Mihai, Andrada; Stavarachi, Monica; Mihai, T; Gavrilă, L

2011-01-01

In order to make a step forward in the knowledge of the mechanism operating in complex polygenic disorders such as diabetes and obesity, this paper proposes a new algorithm (PRSD, possible restriction site detection) and its implementation in Applied Genetics software. This software can be used for in silico detection of potential (hidden) recognition sites for endonucleases and for identification of nucleotide repeats. The recognition sites for endonucleases may result from hidden sequences through deletion or insertion of a specific number of nucleotides. Tests were conducted on DNA sequences downloaded from NCBI servers using specific recognition sites for common type II restriction enzymes introduced in the software database (n = 126). Each possible recognition site indicated by the PRSD algorithm implemented in Applied Genetics was checked and confirmed by NEBcutter V2.0 and Webcutter 2.0 software. In the sequence NG_008724.1 (which includes 63632 nucleotides) we found a high number of potential restriction sites for EcoRI that may be produced by deletion (n = 43 sites) or insertion (n = 591 sites) of one nucleotide. The second module of Applied Genetics has been designed to find simple repeat sizes with a real future in understanding the role of SNPs (Single Nucleotide Polymorphisms) in the pathogenesis of the complex metabolic disorders. We have tested the presence of simple repetitive sequences in five DNA sequences. The software indicated the exact position of each repeat detected in the tested sequences. Future development of Applied Genetics can provide an alternative for powerful tools used to search for restriction sites or repetitive sequences or to improve genotyping methods.
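The idea of a hidden recognition site, one that appears only after a single nucleotide is deleted or inserted, can be illustrated with a short Python sketch. This is not the PRSD algorithm or the Applied Genetics implementation, whose details the abstract does not give; the function name `hidden_sites` and the filtering rules are assumptions made for illustration, with GAATTC (the EcoRI recognition sequence) as the default site.

```python
def hidden_sites(seq, site="GAATTC"):
    """Find positions where a one-base deletion or insertion would create `site`.

    Returns (by_deletion, by_insertion): lists of 0-based window start positions.
    Windows that already overlap an intact site are skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    k = len(site)
    by_deletion, by_insertion = [], []

    # Deletion: a (k+1)-base window becomes `site` when one interior base is removed.
    for i in range(len(seq) - k):
        window = seq[i:i + k + 1]
        if site in window:
            continue  # an intact site is already present here
        if any(window[:j] + window[j + 1:] == site for j in range(1, k)):
            by_deletion.append(i)

    # Insertion: a (k-1)-base window becomes `site` when one base is added.
    one_short = {site[:j] + site[j + 1:] for j in range(k)}
    for i in range(len(seq) - k + 2):
        window = seq[i:i + k - 1]
        context = seq[max(0, i - 1):i + k]  # window plus one flanking base each side
        if window in one_short and site not in context:
            by_insertion.append(i)

    return by_deletion, by_insertion

print(hidden_sites("CCGAAATTCCC"))  # ([2], [4])
print(hidden_sites("TTGAATTCTT"))   # ([], [])  intact site, nothing hidden
```

In the first example, removing one A from GAAATTC (starting at position 2) yields GAATTC, and adding a G in front of AATTC (starting at position 4) does the same. Insertion candidates are more numerous than deletion candidates in general because any five-base match suffices, which is at least consistent with the larger insertion count reported in the abstract.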
Transcription factor IID in the Archaea: sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes

NASA Technical Reports Server (NTRS)

Marsh, T. L.; Reich, C. I.; Whitelock, R. B.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

1994-01-01

The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer) it appears that major features of the eukaryotic transcription apparatus were well-established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.
Centromere reference models for human chromosomes X and Y satellite arrays

PubMed Central

Miga, Karen H.; Newton, Yulia; Jain, Miten; Altemose, Nicolas; Willard, Huntington F.; Kent, W. James

2014-01-01

The human genome sequence remains incomplete, with multimegabase-sized gaps representing the endogenous centromeres and other heterochromatic regions. Available sequence-based studies within these sites in the genome have demonstrated a role in centromere function and chromosome pairing, necessary to ensure proper chromosome segregation during cell division. A common genomic feature of these regions is the enrichment of long arrays of near-identical tandem repeats, known as satellite DNAs, which offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies and, as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we utilize monomer sequence and ordering information obtained from whole-genome shotgun reads to model two haploid human satellite arrays on chromosomes X and Y, resulting in an initial characterization of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence model, we evaluate sites within the arrays for short-read mappability and chromosome specificity. Because satellite DNAs evolve in a concerted manner, we use these centromeric assemblies to assess the extent of sequence variation among 366 individuals from distinct human populations. We thus identify two satellite array variants in both X and Y centromeres, as determined by array length and sequence composition. This study provides an initial sequence characterization of a regional centromere and establishes a foundation to extend genomic characterization to these sites as well as to other repeat-rich regions within complex genomes. PMID:24501022
De novo transcriptome sequencing reveals a considerable bias in the incidence of simple sequence repeats towards the downstream of 'Pre-miRNAs' of black pepper.

PubMed

Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

2013-01-01

Next generation sequencing has an advantage in the transformational development of species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated, for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors, showed a significant bias in the position of SSRs towards the downstream of predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of 'tandem repeats' in miRNAs.
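In silico SSR detection of the kind mentioned here is often done by scanning for a short motif repeated in tandem a minimum number of times. The Python sketch below shows that general idea with a regular expression; it is not the tool or the settings used in the study, and the function name `find_ssrs` and the minimum repeat counts (10 for mono-, 6 for di-, 5 for trinucleotide motifs) are assumptions chosen for illustration.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for perfect tandem repeats of short motifs.

    `min_repeats` maps motif length to the minimum number of tandem copies;
    the defaults here are illustrative, not taken from any publication.
    """
    min_repeats = min_repeats or {1: 10, 2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if unit_len > 1 and len(set(motif)) == 1:
                continue  # e.g. 'AA' repeated is really a mononucleotide run
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)

example = "TTT" + "AG" * 8 + "CCC" + "GAT" * 5 + "T"
print(find_ssrs(example))  # [(3, 'AG', 8), (22, 'GAT', 5)]
```

Dividing the total sequence length by the number of hits gives an SSR density figure comparable in form to the "one SSR per ..." statistic quoted in the abstract.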
Analyses of Expressed Sequence Tags from Apple

PubMed Central

Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing

2006-01-01

The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485
Exploring the genome of the salt-marsh Spartina maritima (Poaceae, Chloridoideae) through BAC end sequence analysis.

PubMed

Ferreira de Carvalho, J; Chelaifa, H; Boutte, J; Poulain, J; Couloux, A; Wincker, P; Bellec, A; Fourment, J; Bergès, H; Salmon, A; Ainouche, M

2013-12-01

Spartina species play an important ecological role on salt marshes. Spartina maritima is an Old-World species distributed along the European and North-African Atlantic coasts. This hexaploid species (2n = 6x = 60, 2C = 3,700 Mb) hybridized with different Spartina species introduced from the American coasts, which resulted in the formation of new invasive hybrids and allopolyploids. Thus, S. maritima raises evolutionary and ecological interests. However, genomic information is dramatically lacking in this genus. In an effort to develop genomic resources, we analysed 40,641 high-quality bacterial artificial chromosome-end sequences (BESs), representing 26.7 Mb of the S. maritima genome. BESs were searched for sequence homology against known databases. A fraction of 16.91% of the BESs represents known repeats including a majority of long terminal repeat (LTR) retrotransposons (13.67%). Non-LTR retrotransposons represent 0.75%, DNA transposons 0.99%, whereas small RNA, simple repeats and low-complexity sequences account for 1.38% of the analysed BESs. In addition, 4,285 simple sequence repeats were detected. Using the coding sequence database of Sorghum bicolor, 6,809 BESs showed homology, accounting for 17.1% of all BESs. Comparative genomics with related genera reveals that the microsynteny is better conserved with S. bicolor compared to other sequenced Poaceae, where 37.6% of the paired matching BESs are correctly orientated on the chromosomes. We did not observe large macrosyntenic rearrangements using the mapping strategy employed. However, some regions appeared to have experienced rearrangements when comparing Spartina to Sorghum and to Oryza. This work represents the first overview of the S. maritima genome regarding the respective coding and repetitive components. The syntenic relationships with other grass genomes examined here help clarify evolution in Poaceae, S. maritima being a part of the poorly-known Chloridoideae sub-family.
Serological profiling of the EBV immune response in Chronic Fatigue Syndrome using a peptide microarray.

PubMed

Loebel, Madlen; Eckey, Maren; Sotzny, Franziska; Hahn, Elisabeth; Bauer, Sandra; Grabowski, Patricia; Zerweck, Johannes; Holenya, Pavlo; Hanitsch, Leif G; Wittke, Kirsten; Borchmann, Peter; Rüffer, Jens-Ulrich; Hiepe, Falk; Ruprecht, Klemens; Behrends, Uta; Meindl, Carola; Volk, Hans-Dieter; Reimer, Ulf; Scheibenbogen, Carmen

2017-01-01

Epstein-Barr-Virus (EBV) plays an important role as trigger or cofactor for various autoimmune diseases. In a subset of patients with Chronic Fatigue Syndrome (CFS), disease starts with infectious mononucleosis as late primary EBV infection, while altered levels of EBV-specific antibodies can be observed in another subset of patients. We performed a comprehensive mapping of the IgG response against EBV comparing 50 healthy controls with 92 CFS patients using a microarray platform. Patients with multiple sclerosis (MS), systemic lupus erythematosus (SLE) and cancer-related fatigue served as controls. 3054 overlapping peptides were synthesised as 15-mers from 14 different EBV proteins. Array data was validated by ELISA for selected peptides. Prevalence of EBV serotypes was determined by qPCR from throat washing samples. EBV type 1 infections were found in patients and controls. EBV seroarray profiles between healthy controls and CFS were less divergent than that observed for MS or SLE. We found significantly enhanced IgG responses to several EBNA-6 peptides containing a repeat sequence in CFS patients compared to controls. EBNA-6 peptide IgG responses correlated well with EBNA-6 protein responses. The EBNA-6 repeat region showed sequence homologies to various human proteins. Patients with CFS had an EBV IgG antibody response pattern quite similar to that of healthy controls. Enhanced IgG reactivity against an EBNA-6 repeat sequence and against EBNA-6 protein is found in CFS patients. Homologous sequences of various human proteins with this EBNA-6 repeat sequence might be potential targets for antigenic mimicry.
Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, C.A.

1996-06-01

The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.
Stabilization of perfect and imperfect tandem repeats by single-strand DNA exonucleases

PubMed Central

Feschenko, Vladimir V.; Rajman, Luis A.; Lovett, Susan T.

2003-01-01

Rearrangements between tandemly repeated DNA sequences are a common source of genetic instability. Such rearrangements underlie several human genetic diseases. In many organisms, the mismatch-repair (MMR) system functions to stabilize repeats when the repeat unit is short or when sequence imperfections are present between the repeats. We show here that the action of single-stranded DNA (ssDNA) exonucleases plays an additional, important role in stabilizing tandem repeats, independent of their role in MMR. For perfect repeats of ≈100 bp in Escherichia coli that are not susceptible to MMR, exonuclease (Exo)-I, ExoX, and RecJ exonuclease redundantly inhibit deletion. Our data suggest that >90% of potential deletion events are avoided by the combined action of these three exonucleases. Imperfect tandem repeats, less prone to rearrangements, are stabilized by both the MMR-pathway and ssDNA-specific exonucleases. For 100-bp repeats containing four mispairs, ExoI alone aborts most deletion events, even in the presence of a functional MMR system. By genetic analysis, we show that the inhibitory effect of ssDNA exonucleases on deletion formation is independent of the MutS and UvrD proteins. Exonuclease degradation of DNA displaced during the deletion process may abort slipped misalignment. Exonuclease action is therefore a significant force in genetic stabilization of many forms of repetitive DNA. PMID:12538867
Operating characteristics of the implicit learning system supporting serial interception sequence learning.

PubMed

Sanchez, Daniel J; Reber, Paul J

2012-04-01

The memory system that supports implicit perceptual-motor sequence learning relies on brain regions that operate separately from the explicit, medial temporal lobe memory system. The implicit learning system therefore likely has distinct operating characteristics and information processing constraints. To attempt to identify the limits of the implicit sequence learning mechanism, participants performed the serial interception sequence learning (SISL) task with covertly embedded repeating sequences that were much longer than most previous studies: ranging from 30 to 60 (Experiment 1) and 60 to 90 (Experiment 2) items in length. Robust sequence-specific learning was observed for sequences up to 80 items in length, extending the known capacity of implicit sequence learning. In Experiment 3, 12-item repeating sequences were embedded among increasing amounts of irrelevant nonrepeating sequences (from 20 to 80% of training trials). Despite high levels of irrelevant trials, learning occurred across conditions. A comparison of learning rates across all three experiments found a surprising degree of constancy in the rate of learning regardless of sequence length or embedded noise. Sequence learning appears to be constant with the logarithm of the number of sequence repetitions practiced during training. The consistency in learning rate across experiments and conditions implies that the mechanisms supporting implicit sequence learning are not capacity-constrained by very long sequences nor adversely affected by high rates of irrelevant sequences during training.
Implicit sequence-specific motor learning after sub-cortical stroke is associated with increased prefrontal brain activations: An fMRI study

PubMed Central

Meehan, Sean K.; Randhawa, Bubblepreet; Wessel, Brenda; Boyd, Lara A.

2010-01-01

Implicit motor learning is preserved after stroke, but how the brain compensates for damage to facilitate learning is unclear. We used a random effects analysis to determine how stroke alters patterns of brain activity during implicit sequence-specific motor learning as compared to general improvements in motor control. Nine healthy participants and 9 individuals with chronic, right focal sub-cortical stroke performed a continuous joystick-based tracking task during an initial fMRI session, over 5 days of practice, and a retention test during a separate fMRI session. Sequence-specific implicit motor learning was differentiated from general improvements in motor control by comparing tracking performance on a novel, repeated tracking sequence during early practice and again at the retention test. Both groups demonstrated implicit sequence-specific motor learning at the retention test, yet substantial differences were apparent. At retention, healthy control participants demonstrated increased BOLD response in left dorsal premotor cortex (BA 6) but decreased BOLD response in left dorsolateral prefrontal cortex (DLPFC; BA 9) during repeated sequence tracking. In contrast, at retention individuals with stroke did not show this reduction in DLPFC during repeated tracking. Instead, implicit sequence-specific motor learning and general improvements in motor control were associated with increased BOLD response in the left middle frontal gyrus (BA 8), regardless of sequence type after stroke. These data emphasize the potential importance of a prefrontal-based attentional network for implicit motor learning after stroke. The present study is the first to highlight the importance of the prefrontal cortex for implicit sequence-specific motor learning after stroke. PMID:20725908
The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza.

PubMed

Qian, Jun; Song, Jingyuan; Gao, Huanhuan; Zhu, Yingjie; Xu, Jiang; Pang, Xiaohui; Yao, Hui; Sun, Chao; Li, Xian'en; Li, Chuyuan; Liu, Juyan; Xu, Haibin; Chen, Shilin

2013-01-01

Salvia miltiorrhiza is an important medicinal plant with great economic and medicinal value. The complete chloroplast (cp) genome sequence of Salvia miltiorrhiza, the first sequenced member of the Lamiaceae family, is reported here. The genome is 151,328 bp in length and exhibits a typical quadripartite structure of the large (LSC, 82,695 bp) and small (SSC, 17,555 bp) single-copy regions, separated by a pair of inverted repeats (IRs, 25,539 bp). It contains 114 unique genes, including 80 protein-coding genes, 30 tRNAs and four rRNAs. The genome structure, gene order, GC content and codon usage are similar to the typical angiosperm cp genomes. Four forward, three inverted and seven tandem repeats were detected in the Salvia miltiorrhiza cp genome. Simple sequence repeat (SSR) analysis among the 30 asterid cp genomes revealed that most SSRs are AT-rich, which contribute to the overall AT richness of these cp genomes. Additionally, fewer SSRs are distributed in the protein-coding sequences compared to the non-coding regions, indicating an uneven distribution of SSRs within the cp genomes. Entire cp genome comparison of Salvia miltiorrhiza and three other Lamiales cp genomes showed a high degree of sequence similarity and a relatively high divergence of intergenic spacers. Sequence divergence analysis discovered the ten most divergent and ten most conserved genes as well as their length variation, which will be helpful for phylogenetic studies in asterids. Our analysis also supports that both regional and functional constraints affect gene sequence evolution. Further, phylogenetic analysis demonstrated a sister relationship between Salvia miltiorrhiza and Sesamum indicum. The complete cp genome sequence of Salvia miltiorrhiza reported in this paper will facilitate population, phylogenetic and cp genetic engineering studies of this medicinal plant.
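The region lengths reported for the quadripartite structure can be checked against the total genome length, since a circular chloroplast genome of this type consists of one LSC, one SSC and two copies of the IR. A minimal arithmetic check in Python, using only the figures from the abstract (the variable names are arbitrary):

```python
# Region lengths in bp, as reported in the abstract
lsc = 82_695   # large single-copy region
ssc = 17_555   # small single-copy region
ir = 25_539    # one inverted repeat; the genome carries two copies

total = lsc + ssc + 2 * ir
print(total)                                  # 151328, the reported genome length
print(round(100 * 2 * ir / total, 1))         # 33.8, percent of the genome in the two IRs
```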
Comparative Genomic and Transcriptomic Characterization of the Toxigenic Marine Dinoflagellate Alexandrium ostenfeldii

PubMed Central

Jaeckisch, Nina; Yang, Ines; Wohlrab, Sylke; Glöckner, Gernot; Kroymann, Juergen; Vogel, Heiko; Cembella, Allan; John, Uwe

2011-01-01

Many dinoflagellate species are notorious for the toxins they produce and ecological and human health consequences associated with harmful algal blooms (HABs). Dinoflagellates are particularly refractory to genomic analysis due to the enormous genome size, lack of knowledge about their DNA composition and structure, and peculiarities of gene regulation, such as spliced leader (SL) trans-splicing and mRNA transposition mechanisms. Alexandrium ostenfeldii is known to produce macrocyclic imine toxins, described as spirolides. We characterized the genome of A. ostenfeldii using a combination of transcriptomic data and random genomic clones for comparison with other dinoflagellates, particularly Alexandrium species. Examination of SL sequences revealed similar features as in other dinoflagellates, including Alexandrium species. SL sequences in decay indicate frequent retro-transposition of mRNA species. This probably contributes to overall genome complexity by generating additional gene copies. Sequencing of several thousand fosmid and bacterial artificial chromosome (BAC) ends yielded a wealth of simple repeats and tandemly repeated longer sequence stretches which we estimated to comprise more than half of the whole genome. Surprisingly, the repeats comprise a very limited set of 79–97 bp sequences; in part the genome is thus a relatively uniform sequence space interrupted by coding sequences. Our genomic sequence survey (GSS) represents the largest genomic data set of a dinoflagellate to date. Alexandrium ostenfeldii is a typical dinoflagellate with respect to its transcriptome and mRNA transposition but demonstrates Alexandrium-like stop codon usage. The large portion of repetitive sequences and the organization within the genome is in agreement with several other studies on dinoflagellates using different approaches. 
It remains to be determined whether this unusual composition is directly correlated to the exceptional genome organization of dinoflagellates with a low amount of histones and histone-like proteins. PMID:22164224
An Adapting Auditory-motor Feedback Loop Can Contribute to Generating Vocal Repetition

PubMed Central

Brainard, Michael S.; Jin, Dezhe Z.

2015-01-01

Consecutive repetition of actions is common in behavioral sequences. Although integration of sensory feedback with internal motor programs is important for sequence generation, if and how feedback contributes to repetitive actions is poorly understood. Here we study how auditory feedback contributes to generating repetitive syllable sequences in songbirds. We propose that auditory signals provide positive feedback to ongoing motor commands, but this influence decays as feedback weakens from response adaptation during syllable repetitions. Computational models show that this mechanism explains repeat distributions observed in Bengalese finch song. We experimentally confirmed two predictions of this mechanism in Bengalese finches: removal of auditory feedback by deafening reduces syllable repetitions; and neural responses to auditory playback of repeated syllable sequences gradually adapt in sensory-motor nucleus HVC. Together, our results implicate a positive auditory-feedback loop with adaptation in generating repetitive vocalizations, and suggest sensory adaptation is important for feedback control of motor sequences. PMID:26448054
Minimal and Contributing Sequence Determinants of the cis-Acting Locus of Transfer (clt) of Streptomycete Plasmid pIJ101 Occur within an Intrinsically Curved Plasmid Region

PubMed Central

Ducote, Matthew J.; Prakash, Shubha; Pettis, Gregg S.

2000-01-01

Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3′ end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer. PMID:11073933
Minimal and contributing sequence determinants of the cis-acting locus of transfer (clt) of streptomycete plasmid pIJ101 occur within an intrinsically curved plasmid region.

PubMed

Ducote, M J; Prakash, S; Pettis, G S

2000-12-01

Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3' end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer.
Comparative chloroplast genomics: Analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Raubeso, Linda A.; Peery, Rhiannon; Chumley, Timothy W.

2007-03-01

The number of completely sequenced plastid genomes available is growing rapidly. This new array of sequences presents new opportunities to perform comparative analyses. In comparative studies, it is most useful to compare across wide phylogenetic spans and, within angiosperms, to include representatives from basally diverging lineages such as the new genomes reported here: Nuphar advena (from a basal-most lineage) and Ranunculus macranthus (from the basal group of eudicots). We report these two new plastid genome sequences and make comparisons (within angiosperms, seed plants, or all photosynthetic lineages) to evaluate features such as the status of ycf15 and ycf68 as protein coding genes, the distribution of simple sequence repeats (SSRs) and longer dispersed repeats (SDR), and patterns of nucleotide composition.
Simultaneous Differentiation and Typing of Entamoeba histolytica and Entamoeba dispar

PubMed Central

Zaki, Mehreen; Meelu, Parool; Sun, Wei; Clark, C. Graham

2002-01-01

Sequences corresponding to some of the polymorphic loci previously reported from Entamoeba histolytica have been detected in Entamoeba dispar. Comparison of nucleotide sequences of two loci between E. dispar strain SAW760 and E. histolytica strain HM-1:IMSS revealed significant differences in both repeat and flanking regions. The tandem repeat units varied not only in sequence but also in number and arrangement between the two species at both loci. Using the sequences obtained, primer pairs aimed at amplifying species-specific products were designed and tested on a variety of E. histolytica and E. dispar samples. Amplification results were in complete agreement with the original species classification in all cases, and the PCR products displayed discernible size and pattern variations among the isolates. PMID:11923344
Novel variable number of tandem repeats of gibbon MAOA gene and its evolutionary significance.

PubMed

Choi, Yuri; Jung, Yi-Deun; Ayarpadikannan, Selvam; Koga, Akihiko; Imai, Hiroo; Hirai, Hirohisa; Roos, Christian; Kim, Heui-Soo

2014-08-01

Variable number of tandem repeats (VNTRs) are scattered throughout the primate genome, and genetic variation in these VNTRs has accumulated during primate radiation. Here, we analyzed VNTRs upstream of the monoamine oxidase A (MAOA) gene in 11 different gibbon species. An abundance of truncated VNTR sequences and copy number differences were observed compared to those of human VNTR sequences. To better understand the biological role of these VNTRs, a luciferase activity assay was conducted, and the results indicated that selected VNTR sequences of the MAOA gene from human and three different gibbon species (Hylobates klossii, Hylobates lar, and Nomascus concolor) showed silencing ability. Together, these data could be useful for understanding the evolutionary history and functional significance of MAOA VNTR sequences in gibbon species.
Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing

PubMed Central

Staton, Margaret; Best, Teodora; Khodwekar, Sudhir; Owusu, Sandra; Xu, Tao; Xu, Yi; Jennings, Tara; Cronn, Richard; Arumuganathan, A. Kathiravetpilla; Coggeshall, Mark; Gailing, Oliver; Liang, Haiying; Romero-Severson, Jeanne; Schlarbaum, Scott; Carlson, John E.

2015-01-01

Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence. PMID:26698853
