Sample records for background simple sequence

  1. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    USDA-ARS's Scientific Manuscript database

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  2. Genome Wide Characterization of Simple Sequence Repeats in Cucumber

    USDA-ARS's Scientific Manuscript database

    The whole genome sequence of the cucumber cultivar Gy14 was recently sequenced at 15× coverage with the Roche 454 Titanium technology. The microsatellite DNA sequences (simple sequence repeats, SSRs) in the assembled scaffolds were computationally explored and characterized. A total of 112,073 SSRs ...

  3. Identification of simple objects in image sequences

    NASA Astrophysics Data System (ADS)

    Geiselmann, Christoph; Hahn, Michael

    1994-08-01

    We present an investigation into the identification and location of simple objects in color image sequences. As an example, the identification of traffic signs is discussed. Three aspects are of special interest. First, regions that may contain the object have to be detected. The separation of those regions from the background can be based on color, motion, and contours. In the experiments all three possibilities are investigated. The second aspect focuses on the extraction of suitable features for the identification of the objects. For that purpose the border line of the region of interest is used. For planar objects a sufficient approximation of perspective projection is affine mapping. Consequently, it is natural to extract affine-invariant features from the border line. The investigation includes invariant features based on Fourier descriptors and moments. Finally, the object is identified by maximum likelihood classification. In the experiments all three basic object types are correctly identified. The probabilities for misclassification have been found to be below 1%.

  4. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    PubMed Central

    Lewers, Kim S; Saski, Chris A; Cuthbertson, Brandon J; Henry, David C; Staton, Meg E; Main, Dorrie S; Dhanaraj, Anik L; Rowland, Lisa J; Tomkins, Jeff P

    2008-01-01

    Background The recent development of novel repeat-fruiting types of blackberry (Rubus L.) cultivars, combined with a long history of morphological marker-assisted selection for thornlessness by blackberry breeders, has given rise to increased interest in using molecular markers to facilitate blackberry breeding. Yet no genetic maps, molecular markers, or even sequences exist specifically for cultivated blackberry. The purpose of this study is to begin development of these tools by generating and annotating the first blackberry expressed sequence tag (EST) library, designing primers from the ESTs to amplify regions containing simple sequence repeats (SSR), and testing the usefulness of a subset of the EST-SSRs with two blackberry cultivars. Results A cDNA library of 18,432 clones was generated from expanding leaf tissue of the cultivar Merton Thornless, a progenitor of many thornless commercial cultivars. Among the most abundantly expressed of the 3,000 genes annotated were those involved with energy, cell structure, and defense. From individual sequences containing SSRs, 673 primer pairs were designed. Of a randomly chosen set of 33 primer pairs tested with two blackberry cultivars, 10 detected an average of 1.9 polymorphic PCR products. Conclusion This rate predicts that this library may yield as many as 940 SSR primer pairs detecting 1,786 polymorphisms. This may be sufficient to generate a genetic map that can be used to associate molecular markers with phenotypic traits, making possible molecular marker-assisted breeding to complement existing morphological marker-assisted breeding in blackberry. PMID:18570660

  5. Mapping Simple Repeated DNA Sequences in Heterochromatin of Drosophila Melanogaster

    PubMed Central

    Lohe, A. R.; Hilliker, A. J.; Roberts, P. A.

    1993-01-01

    Heterochromatin in Drosophila has unusual genetic, cytological and molecular properties. Highly repeated DNA sequences (satellites) are the principal component of heterochromatin. Using probes from cloned satellites, we have constructed a chromosome map of 10 highly repeated, simple DNA sequences in heterochromatin of mitotic chromosomes of Drosophila melanogaster. Despite extensive sequence homology among some satellites, chromosomal locations could be distinguished by stringent in situ hybridizations for each satellite. Only two of the localizations previously determined using gradient-purified bulk satellite probes are correct. Eight new satellite localizations are presented, providing a megabase-level chromosome map of one-quarter of the genome. Five major satellites each exhibit a multichromosome distribution, and five minor satellites hybridize to single sites on the Y chromosome. Satellites closely related in sequence are often located near one another on the same chromosome. About 80% of Y chromosome DNA is composed of nine simple repeated sequences, in particular (AAGAC)(n) (8 Mb), (AAGAG)(n) (7 Mb) and (AATAT)(n) (6 Mb). Similarly, more than 70% of the DNA in chromosome 2 heterochromatin is composed of five simple repeated sequences. We have also generated a high resolution map of satellites in chromosome 2 heterochromatin, using a series of translocation chromosomes whose breakpoints in heterochromatin were ordered by N-banding. Finally, staining and banding patterns of heterochromatic regions are correlated with the locations of specific repeated DNA sequences. The basis for the cytochemical heterogeneity in banding appears to depend exclusively on the different satellite DNAs present in heterochromatin. PMID:8375654

  6. Optimization of sequence alignment for simple sequence repeat regions.

    PubMed

    Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

    2011-07-20

    Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and are used in a range of applications, including genetic diversity, genome mapping, and marker-assisted selection. They are also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units, which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.

  7. Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

    PubMed

    Gupta, P D

    2016-10-01

    In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time consuming and cumbersome, and hence has become more expensive. Lately, because of its versatility, DNA sequencing has become a household name, and therefore there is an urgent need for a simple, fast, inexpensive DNA sequencing technology. At the beginning of this century efforts were made, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the futuristic technology.

  8. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, Karin D.; Chu, Tun-Jen; Pitt, William G.

    1992-01-01

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through said amino groups contained on the surface thereof. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to said target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times.

  9. DNA sequencing using fluorescence background electroblotting membrane

    DOEpatents

    Caldwell, K.D.; Chu, T.J.; Pitt, W.G.

    1992-05-12

    A method for the multiplex sequencing of DNA is disclosed which comprises the electroblotting of specific base terminated DNA fragments, which have been resolved by gel electrophoresis, onto the surface of a neutral non-aromatic polymeric microporous membrane exhibiting low background fluorescence which has been surface modified to contain amino groups. Polypropylene membranes are preferred, and the introduction of amino groups is accomplished by subjecting the membrane to radio or microwave frequency plasma discharge in the presence of an aminating agent, preferably ammonia. The membrane, containing physically adsorbed DNA fragments on its surface after the electroblotting, is then treated with crosslinking means such as UV radiation or a glutaraldehyde spray to chemically bind the DNA fragments to the membrane through amino groups contained on the surface. The DNA fragments chemically bound to the membrane are subjected to hybridization probing with a tagged probe specific to the sequence of the DNA fragments. The tagging may be by either fluorophores or radioisotopes. The tagged probes hybridized to the target DNA fragments are detected and read by laser induced fluorescence detection or autoradiograms. The use of aminated low fluorescent background membranes allows the use of fluorescent detection and reading even when the available amount of DNA to be sequenced is small. The DNA bound to the membranes may be reprobed numerous times. No Drawings

  10. Always look on both sides: Phylogenetic information conveyed by simple sequence repeat allele sequences

    USDA-ARS's Scientific Manuscript database

    Simple sequence repeat (SSR) markers are widely used tools for inferences about genetic diversity, phylogeography and spatial genetic structure. Their applications assume that variation among alleles is essentially caused by an expansion or contraction of the number of repeats and that, accessorily,...

  11. Capturing chloroplast variation for molecular ecology studies: a simple next generation sequencing approach applied to a rainforest tree

    PubMed Central

    2013-01-01

    Background With high quantity and quality data production and low cost, next generation sequencing has the potential to provide new opportunities for plant phylogeographic studies on single and multiple species. Here we present an approach for in silico chloroplast DNA assembly and single nucleotide polymorphism detection from short-read shotgun sequencing. The approach is simple and effective and can be implemented using standard bioinformatic tools. Results The chloroplast genome of Toona ciliata (Meliaceae), 159,514 base pairs long, was assembled from shotgun sequencing on the Illumina platform using de novo assembly of contigs. To evaluate its practicality, value and quality, we compared the short read assembly with an assembly completed using 454 data obtained after chloroplast DNA isolation. Sanger sequence verifications indicated that the Illumina dataset outperformed the longer read 454 data. Pooling of several individuals during preparation of the shotgun library enabled detection of informative chloroplast SNP markers. Following validation, we used the identified SNPs for a preliminary phylogeographic study of T. ciliata in Australia and to confirm low diversity across the distribution. Conclusions Our approach provides a simple method for construction of whole chloroplast genomes from shotgun sequencing of whole genomic DNA using short-read data and no available closely related reference genome (e.g. from the same species or genus). The high coverage of Illumina sequence data also renders this method appropriate for multiplexing and SNP discovery and therefore a useful approach for landscape level studies of evolutionary ecology. PMID:23497206

  12. Comparison of simple sequence repeats in 19 Archaea.

    PubMed

    Trivedi, S

    2006-12-05

    All organisms that have been studied until now have been found to have differential distribution of simple sequence repeats (SSRs), with more SSRs in intergenic than in coding sequences. SSR distribution was investigated in Archaea genomes where complete chromosome sequences of 19 Archaea were analyzed with the program SPUTNIK to find di- to penta-nucleotide repeats. The number of repeats was determined for the complete chromosome sequences and for the coding and non-coding sequences. Different from what has been found for other groups of organisms, there is an abundance of SSRs in coding regions of the genome of some Archaea. Dinucleotide repeats were rare and CG repeats were found in only two Archaea. In general, trinucleotide repeats are the most abundant SSR motifs; however, pentanucleotide repeats are abundant in some Archaea. Some of the tetranucleotide and pentanucleotide repeat motifs are organism specific. In general, repeats are short and CG-rich repeats are present in Archaea having a CG-rich genome. Among the 19 Archaea, SSR density was not correlated with genome size or with optimum growth temperature. Pentanucleotide density had an inverse correlation with the CG content of the genome.
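
    The kind of scan described in this record (di- to penta-nucleotide tandem repeats) can be illustrated with a short regular-expression sketch. This is not SPUTNIK, which scores imperfect repeats; the function name `find_ssrs`, the restriction to perfect repeats, and the minimum of three repeat units are all assumptions made for illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=5, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper: unit sizes default to di- through penta-nucleotide,
    and min_repeats=3 is an arbitrary illustrative threshold.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by at least min_repeats-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), so each repeat is reported once.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)

print(find_ssrs("GGCACACACATT"))
```

    Counting such hits separately inside and outside annotated coding regions would give the coding versus non-coding comparison the abstract reports, but the annotation handling is left out of this sketch.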

  13. Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

    PubMed Central

    2010-01-01

    Background Expressed Sequence Tag (EST) has been a cost-effective tool in molecular biology and represents an abundant valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 expressed sequence tag (EST) sequences with high quality. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons and that have been deposited in NCBI (accession IDs: GW868575 - GW873047), among which 1,294 unique ESTs were with known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and low in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species. PMID:20626882

  14. Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

    USDA-ARS's Scientific Manuscript database

    Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...

  15. PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

    PubMed

    Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

    2011-01-01

    PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs of sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, there seem to be many other species of prokaryotes yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.

  16. Simple sequence repeat markers that identify Claviceps species and strains

    USDA-ARS's Scientific Manuscript database

    Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop Simple Sequence Repeat (SSRs) markers for rapid identification of C. purpurea. SSRs were desi...

  17. Simple sequence repeat marker loci discovery using SSR primer.

    PubMed

    Robinson, Andrew J; Love, Christopher G; Batley, Jacqueline; Barker, Gary; Edwards, David

    2004-06-12

    Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. With the increase in the availability of DNA sequence information, an automated process to identify and design PCR primers for amplification of SSR loci would be a useful tool in plant breeding programs. We report an application that integrates SPUTNIK, an SSR repeat finder, with Primer3, a PCR primer design program, into one pipeline tool, SSR Primer. On submission of multiple FASTA formatted sequences, the script screens each sequence for SSRs using SPUTNIK. The results are passed to Primer3 for locus-specific primer design. The script makes use of a Web-based interface, enabling remote use. This program has been written in PERL and is freely available for non-commercial users by request from the authors. The Web-based version may be accessed at http://hornbill.cspp.latrobe.edu.au/
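
    The hand-off between the repeat finder and the primer designer in a pipeline of this kind needs the sequence on either side of each repeat, since locus-specific primers sit in the flanks. The sketch below shows only that intermediate step. It is not the SSR Primer script (which is written in PERL) and it does not call SPUTNIK or Primer3; the function name `flanking_regions` and the 100-base default flank are assumptions for illustration.

```python
def flanking_regions(seq, ssr_start, ssr_end, flank=100):
    """Return (left_flank, repeat, right_flank) around seq[ssr_start:ssr_end].

    Hypothetical helper: coordinates are 0-based and end-exclusive, and flanks
    are simply truncated where the sequence ends.
    """
    if not 0 <= ssr_start < ssr_end <= len(seq):
        raise ValueError("SSR coordinates fall outside the sequence")
    left = seq[max(0, ssr_start - flank):ssr_start]
    right = seq[ssr_end:ssr_end + flank]
    return left, seq[ssr_start:ssr_end], right

# A (CA)4 repeat at positions 2-9 of a toy sequence.
print(flanking_regions("GGCACACACATT", 2, 10, flank=5))
```

    A real pipeline would reject loci whose flanks are too short to hold a primer; that check is omitted here.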

  18. Aircraft stress sequence development: A complex engineering process made simple

    NASA Technical Reports Server (NTRS)

    Schrader, K. H.; Butts, D. G.; Sparks, W. A.

    1994-01-01

    Development of stress sequences for critical aircraft structure requires flight measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. Resulting cycle-by-cycle stress sequences can be directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL) resulting in expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations in a simple, easy to use method. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.

  19. Simple Sequence Repeats Provide a Substrate for Phenotypic Variation in the Neurospora crassa Circadian Clock

    PubMed Central

    Michael, Todd P.; Park, Sohyun; Kim, Tae-Sung; Booth, Jim; Byer, Amanda; Sun, Qi; Chory, Joanne; Lee, Kwangwon

    2007-01-01

    Background WHITE COLLAR-1 (WC-1) mediates interactions between the circadian clock and the environment by acting as both a core clock component and as a blue light photoreceptor in Neurospora crassa. Loss of the amino-terminal polyglutamine (NpolyQ) domain in WC-1 results in an arrhythmic circadian clock; this data is consistent with this simple sequence repeat (SSR) being essential for clock function. Methodology/Principal Findings Since SSRs are often polymorphic in length across natural populations, we reasoned that investigating natural variation of the WC-1 NpolyQ may provide insight into its role in the circadian clock. We observed significant phenotypic variation in the period, phase and temperature compensation of circadian regulated asexual conidiation across 143 N. crassa accessions. In addition to the NpolyQ, we identified two other simple sequence repeats in WC-1. The sizes of all three WC-1 SSRs correlated with polymorphisms in other clock genes, latitude and circadian period length. Furthermore, in a cross between two N. crassa accessions, the WC-1 NpolyQ co-segregated with period length. Conclusions/Significance Natural variation of the WC-1 NpolyQ suggests a mechanism by which period length can be varied and selected for by the local environment that does not deleteriously affect WC-1 activity. Understanding natural variation in the N. crassa circadian clock will facilitate an understanding of how fungi exploit their environments. PMID:17726525

  20. Fine-tuning gene networks using simple sequence repeats

    PubMed Central

    Egbert, Robert G.; Klavins, Eric

    2012-01-01

    The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382

  1. Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

    USDA-ARS's Scientific Manuscript database

    Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...

  2. Simple chained guide trees give high-quality protein multiple sequence alignments

    PubMed Central

    Boyce, Kieran; Sievers, Fabian; Higgins, Desmond G.

    2014-01-01

    Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be benchmarked with known structures, simple chained guide trees give the most accurate alignments. These also happen to be the fastest and simplest guide trees to construct, computationally. Such guide trees have a striking effect on the accuracy of alignments produced by some of the most widely used alignment packages. There is a marked increase in accuracy and a marked decrease in computational time, once the number of sequences goes much above a few hundred. This is true, even if the order of sequences in the guide tree is random. PMID:25002495
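
    A chained guide tree can be written down directly, without any distance calculation, which is why it is cheap to construct. The sketch below assumes that "chained" means a fully imbalanced tree in which each sequence joins the growing alignment one at a time, and it emits that tree in Newick format. The function name `chained_guide_tree` is hypothetical and is not taken from any alignment package.

```python
def chained_guide_tree(names):
    """Build a fully imbalanced ("chained") guide tree as a Newick string.

    Assumption: sequences are joined one at a time in the order given,
    so the list order is the alignment order.
    """
    if len(names) < 2:
        raise ValueError("need at least two sequences")
    tree = "(%s,%s)" % (names[0], names[1])
    for name in names[2:]:
        # Each further sequence is attached as a sister to everything so far.
        tree = "(%s,%s)" % (tree, name)
    return tree + ";"

print(chained_guide_tree(["seqA", "seqB", "seqC", "seqD"]))
```

    Shuffling `names` before the call gives the random-order case mentioned in the abstract.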

  3. Effects of tonal language background on tests of temporal sequencing in children.

    PubMed

    Mukari, Siti Zamratol-Mai S; Yu, Xuan; Ishak, Wan Syafira; Mazlan, Rafidah

    2015-01-01

    The aims of the present study were to determine the effects of language background on the performance of the pitch pattern sequence test (PPST) and duration pattern sequence test (DPST). As temporal order sequencing may be affected by age and working memory, these factors were also studied. Performance of tonal and non-tonal language speakers on the PPST and DPST was compared. Twenty-eight native Mandarin (tonal language) speakers and twenty-nine native Malay (non-tonal language) speakers between seven and nine years old participated in this study. The results revealed that relative to native Malay speakers, native Mandarin speakers demonstrated better scores on the PPST in both humming and verbal labeling responses. However, a similar language effect was not apparent in the DPST. An age effect was only significant in the PPST (verbal labeling). Finally, no significant effect of working memory was found on the PPST and the DPST. These findings suggest that the PPST is affected by tonal language background, and highlight the importance of developing different normative values for tonal and non-tonal language speakers.

  4. Evolution Analysis of Simple Sequence Repeats in Plant Genome.

    PubMed

    Qin, Zhen; Wang, Yanping; Wang, Qingmei; Li, Aixian; Hou, Fuyun; Zhang, Liming

    2015-01-01

    Simple sequence repeats (SSRs) are widespread units on genome sequences, and play many important roles in plants. In order to reveal the evolution of plant genomes, we investigated the evolutionary regularities of SSRs during the evolution of plant species and the plant kingdom by analysis of twelve sequenced plant genome sequences. First, in the twelve studied plant genomes, the main SSRs were those which contain repeats of 1-3 nucleotide combinations. Second, in mononucleotide SSRs, the A/T percentage gradually increased along with the evolution of plants (except for P. patens). With the increase of SSR repeat number, the percentage of A/T in C. reinhardtii had no significant change, while the percentage of A/T in terrestrial plant species gradually declined. Third, in dinucleotide SSRs, the percentage of AT/TA increased along with the evolution of the plant kingdom and the repeat number increased in terrestrial plant species. This trend was more obvious in dicotyledons than in monocotyledons. The percentage of CG/GC showed the opposite pattern to the AT/TA. Fourth, in trinucleotide SSRs, the percentages of combinations including two or three A/T were in a rising trend along with the evolution of the plant kingdom; meanwhile, with the increase of SSR repeat number in plant species, different species chose different combinations as dominant SSRs. SSRs in C. reinhardtii, P. patens, Z. mays and A. thaliana showed their specific patterns related to evolutionary position or specific changes of genome sequences. The results showed that SSRs not only had the general pattern in the evolution of the plant kingdom, but also were associated with the evolution of the specific genome sequence. The study of the evolutionary regularities of SSRs provided new insights for the analysis of plant genome evolution.

  5. Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing.

    PubMed

    Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E

    2016-06-20

    Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
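
    The consensus step mentioned in this record, where reads sharing a molecular barcode are collapsed into one read per original DNA molecule, can be sketched as a per-position majority vote. This is a simplified illustration and not the authors' pipeline: the function name `consensus_by_barcode`, the minimum family size of three, and the assumption that reads in a family are already aligned and of equal length are all hypothetical.

```python
from collections import Counter, defaultdict

def consensus_by_barcode(reads, min_family_size=3):
    """Collapse (barcode, sequence) pairs into one consensus read per barcode.

    Assumptions: reads in a family have equal length and are pre-aligned;
    families smaller than min_family_size are dropped.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to out-vote a sequencing error
        # Majority vote at each position across the family.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus

reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACTT"),  # one read carries an error at position 2
    ("GGC", "ACGA"),  # singleton family, discarded
]
print(consensus_by_barcode(reads))
```

    An error present in a single read of a family is voted out, which is how barcoding lowers background noise; a true variant appears in every read of its family and survives.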

  6. Analysis of simple sequence repeat (SSR) structure and sequence within Epichloë endophyte genomes reveals impacts on gene structure and insights into ancestral hybridization events.

    PubMed

    Clayton, William; Eaton, Carla Jane; Dupont, Pierre-Yves; Gillanders, Tim; Cameron, Nick; Saikia, Sanjay; Scott, Barry

    2017-01-01

    Epichloë grass endophytes comprise a group of filamentous fungi of both sexual and asexual species. Known for the beneficial characteristics they endow upon their grass hosts, the identification of these endophyte species has been of great interest agronomically and scientifically. The use of simple sequence repeat loci and the variation in repeat elements has been used to rapidly identify endophyte species and strains, however, little is known of how the structure of repeat elements changes between species and strains, and where these repeat elements are located in the fungal genome. We report on an in-depth analysis of the structure and genomic location of the simple sequence repeat locus B10, commonly used for Epichloë endophyte species identification. The B10 repeat was found to be located within an exon of a putative bZIP transcription factor, suggesting possible impacts on polypeptide sequence and thus protein function. Analysis of this repeat in the asexual endophyte hybrid Epichloë uncinata revealed that the structure of B10 alleles reflects the ancestral species that hybridized to give rise to this species. Understanding the structure and sequence of these simple sequence repeats provides a useful set of tools for readily distinguishing strains and for gaining insights into the ancestral species that have undergone hybridization events.

  7. Background sequence characteristics influence the occurrence and severity of disease-causing mtDNA mutations

    PubMed Central

    Wei, Wei; Hudson, Gavin

    2017-01-01

    Inherited mitochondrial DNA (mtDNA) mutations have emerged as a common cause of human disease, with mutations occurring multiple times in the world population. The clinical presentation of three pathogenic mtDNA mutations is strongly associated with a background mtDNA haplogroup, but it is not clear whether this is limited to a handful of examples or is a more general phenomenon. To address this, we determined the characteristics of 30,506 mtDNA sequences sampled globally. After performing several quality control steps, we ascribed an established pathogenicity score to the major alleles for each sequence. The mean pathogenicity score for known disease-causing mutations was significantly different between mtDNA macro-haplogroups. Several mutations were observed across all haplogroup backgrounds, whereas others were only observed on specific clades. In some instances this reflected a founder effect, but in others, the mutation recurred but only within the same phylogenetic cluster. Sequence diversity estimates showed that disease-causing mutations were more frequent on young sequences, and genomes with two or more disease-causing mutations were more common than expected by chance. These findings implicate the mtDNA background more generally in recurrent mutation events that have been purified through natural selection in older populations. This provides an explanation for the low frequency of mtDNA disease reported in specific ethnic groups. PMID:29253894

  8. Development of Genomic Simple Sequence Repeats (SSR) by Enrichment Libraries in Date Palm.

    PubMed

    Al-Faifi, Sulieman A; Migdadi, Hussein M; Algamdi, Salem S; Khan, Mohammad Altaf; Al-Obeed, Rashid S; Ammar, Megahed H; Jakse, Jerenj

    2017-01-01

    Development of highly informative markers such as simple sequence repeats (SSR) for cultivar identification and germplasm characterization and management is essential for date palm genetic studies. The present study documents the development of SSR markers and assesses genetic relationships of commonly grown date palm (Phoenix dactylifera L.) cultivars in different geographical regions of Saudi Arabia. A total of 93 novel simple sequence repeat (SSR) markers were screened for their ability to detect polymorphism in date palm. Around 71% of genomic SSRs are dinucleotide, 25% trinucleotide, 3% tetranucleotide, and 1% pentanucleotide motifs and show 100% polymorphism. The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) cluster analysis illustrates that cultivars tend to group according to their class of maturity, region of cultivation, and fruit color. Analysis of molecular variations (AMOVA) reveals genetic variation among and within cultivars of 27% and 73%, respectively, according to the geographical distribution of the cultivars. Developed microsatellite markers are of additional value to date palm characterization, tools which can be used by researchers in population genetics, cultivar identification, as well as genetic resource exploration and management. The cultivars tested exhibited a significant amount of genetic diversity and could be suitable for successful breeding programs. Genomic sequences generated from this study are available at the National Center for Biotechnology Information (NCBI), Sequence Read Archive (Accession numbers. LIBGSS_039019).

  9. Identification of Simple Sequence Repeats in Chloroplast Genomes of Magnoliids Through Bioinformatics Approach.

    PubMed

    Srivastava, Deepika; Shanker, Asheesh

    2016-12-01

    Basal angiosperms or Magnoliids are an important clade of commercially important plants which mainly include spices and edible fruits. In this study, 17 chloroplast genome sequences belonging to clade Magnoliids were screened for the identification of chloroplast simple sequence repeats (cpSSRs). Simple sequence repeats or microsatellites are short stretches of DNA made up of repeat units 1-6 base pairs in length. These repeats are ubiquitous and play an important role in the development of molecular markers and in mapping traits of economic, medical or ecological interest. A total of 479 SSRs were detected, showing an average density of 1 SSR/6.91 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 24 bp for mono-, 12 to 18 bp for di-, 12 to 26 bp for tri-, 12 to 24 bp for tetra-, 15 bp for penta- and 18 bp for hexanucleotide repeats. Mononucleotide repeats were the most frequent (207, 43.21 %) followed by tetranucleotide repeats (130, 27.13 %). Penta- and hexanucleotide repeats were least frequent or absent in these chloroplast genomes.
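
    A computational SSR screen of this kind can be sketched with regular expressions. The abstract does not name the tool or its settings, so the minimum repeat counts below are an assumption, chosen only because they reproduce the minimum tract lengths reported (12 bp for mono- to tetranucleotide, 15 bp for penta- and 18 bp for hexanucleotide repeats). The function name `find_ssrs` and the test sequence are hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (motif length -> repeats).
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AA", "ATAT");
            # those tracts are reported once, under the shorter motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // unit_len))
    return sorted(hits)

# Toy sequence with one mono-, one di- and one trinucleotide tract.
seq = "GC" + "A" * 12 + "CGC" + "AT" * 6 + "CC" + "AAG" * 4 + "C"
hits = find_ssrs(seq)
```

    This sketch only finds perfect repeats and reports the leftmost reading frame of each tract; dedicated SSR finders also handle compound and imperfect repeats.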

  10. Development and characterization of simple sequence repeats for Bipolaris sorokiniana and cross transferability to related species

    USDA-ARS's Scientific Manuscript database

    Simple sequence repeats (SSR) markers were developed from a small insert genomic library for Bipolaris sorokiniana, a mitosporic fungal pathogen that causes spot blotch and root rot in switchgrass. About 59% of sequenced clones (n=384) harbored various SSR motifs. After eliminating the redundant seq...

  11. Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

    PubMed Central

    Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

    2011-01-01

    Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for generating functional variation in expressed proteins, which in turn may lead to differences in pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations: SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309

  12. Simple tools for assembling and searching high-density picolitre pyrophosphate sequence data.

    PubMed

    Parker, Nicolas J; Parker, Andrew G

    2008-04-18

    The advent of pyrophosphate sequencing makes large volumes of sequencing data available at a lower cost than previously possible. However, the short read lengths are difficult to assemble and the large dataset is difficult to handle. During the sequencing of a virus from the tsetse fly, Glossina pallidipes, we found the need for tools to search quickly a set of reads for near exact text matches. A set of tools is provided to search a large data set of pyrophosphate sequence reads under a "live" CD version of Linux on a standard PC that can be used by anyone without prior knowledge of Linux and without having to install a Linux setup on the computer. The tools permit short lengths of de novo assembly, checking of existing assembled sequences, selection and display of reads from the data set and gathering counts of sequences in the reads. Demonstrations are given of the use of the tools to help with checking an assembly against the fragment data set; investigating homopolymer lengths, repeat regions and polymorphisms; and resolving inserted bases caused by incomplete chain extension. The additional information contained in a pyrophosphate sequencing data set beyond a basic assembly is difficult to access due to a lack of tools. The set of simple tools presented here would allow anyone with basic computer skills and a standard PC to access this information.
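
    Two of the operations mentioned above, searching reads for near-exact matches and examining homopolymer lengths, can be sketched in a few lines. This is not the authors' tool set: the function names, the mismatch-only (Hamming distance) matching rule and the toy reads are assumptions made for illustration.

```python
from itertools import groupby

def find_near_matches(reads, query, max_mismatches=1):
    """Return (read_index, offset, mismatches) for windows within max_mismatches of query."""
    hits = []
    q = len(query)
    for idx, read in enumerate(reads):
        for off in range(len(read) - q + 1):
            window = read[off:off + q]
            mismatches = sum(a != b for a, b in zip(window, query))
            if mismatches <= max_mismatches:
                hits.append((idx, off, mismatches))
    return hits

def homopolymer_runs(read, min_len=3):
    """Return (base, run_length) for every run of identical bases of at least min_len."""
    runs = []
    for base, group in groupby(read):
        n = len(list(group))
        if n >= min_len:
            runs.append((base, n))
    return runs

# Toy reads; real pyrosequencing reads would be far longer and more numerous.
reads = ["TTACGGATCC", "ACGGTTCCAA", "GGGGGGGGGG"]
matches = find_near_matches(reads, "ACGGATCC", max_mismatches=1)
runs = homopolymer_runs("ACGGGGTTAAA")
```

    The brute-force scan is quadratic in read and query length, so it only suits small examples; searching a full read set quickly would need an index.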

  13. Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.

    PubMed

    Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao

    2017-01-01

    The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.

  14. A simple algorithm for quantifying DNA methylation levels on multiple independent CpG sites in bisulfite genomic sequencing electropherograms.

    PubMed

    Leakey, Tatiana I; Zielinski, Jerzy; Siegfried, Rachel N; Siegel, Eric R; Fan, Chun-Yang; Cooney, Craig A

    2008-06-01

    DNA methylation at cytosines is a widely studied epigenetic modification. Methylation is commonly detected using bisulfite modification of DNA followed by PCR and additional techniques such as restriction digestion or sequencing. These additional techniques are either laborious, require specialized equipment, or are not quantitative. Here we describe a simple algorithm that yields quantitative results from analysis of conventional four-dye-trace sequencing. We call this method Mquant and we compare it with the established laboratory method of combined bisulfite restriction assay (COBRA). This analysis of sequencing electropherograms provides a simple, easily applied method to quantify DNA methylation at specific CpG sites.
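
    The abstract does not give the Mquant formula. As a rough sketch of the general idea only: after bisulfite conversion, unmethylated cytosines read as T while methylated cytosines still read as C, so a methylation level at a CpG site can be estimated from the relative C and T peak heights at that position. The ratio C / (C + T), the function name and the peak values below are assumptions, not the published algorithm.

```python
def methylation_fraction(c_peak, t_peak):
    """Estimate the methylated fraction at one CpG from C and T peak heights.

    Assumed formula (C / (C + T)); the published Mquant calculation may differ.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return c_peak / total

# Hypothetical peak heights (arbitrary fluorescence units) at three CpG sites.
peaks = {"CpG_1": (300.0, 100.0), "CpG_2": (0.0, 250.0), "CpG_3": (120.0, 120.0)}
levels = {site: methylation_fraction(c, t) for site, (c, t) in peaks.items()}
```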

  15. GATA simple sequence repeats function as enhancer blocker boundaries.

    PubMed

    Kumar, Ram P; Krishnan, Jaya; Pratap Singh, Narendra; Singh, Lalji; Mishra, Rakesh K

    2013-01-01

    Simple sequence repeats (SSRs) account for ~3% of the human genome, but their functional significance still remains unclear. One of the prominent SSRs the GATA tetranucleotide repeat has preferentially accumulated in complex organisms. GATA repeats are particularly enriched on the human Y chromosome, and their non-random distribution and exclusive association with genes expressed during early development indicate their role in coordinated gene regulation. Here we show that GATA repeats have enhancer blocker activity in Drosophila and human cells. This enhancer blocker activity is seen in transgenic as well as native context of the enhancers at various developmental stages. These findings ascribe functional significance to SSRs and offer an explanation as to why SSRs, especially GATA, may have accumulated in complex organisms.

  16. Unexpected effects of different genetic backgrounds on identification of genomic rearrangements via whole-genome next generation sequencing.

    PubMed

    Chen, Zhangguo; Gowan, Katherine; Leach, Sonia M; Viboolsittiseri, Sawanee S; Mishra, Ameet K; Kadoishi, Tanya; Diener, Katrina; Gao, Bifeng; Jones, Kenneth; Wang, Jing H

    2016-10-21

    Whole genome next generation sequencing (NGS) is increasingly employed to detect genomic rearrangements in cancer genomes, especially in lymphoid malignancies. We recently established a unique mouse model by specifically deleting a key non-homologous end-joining DNA repair gene, Xrcc4, and a cell cycle checkpoint gene, Trp53, in germinal center B cells. This mouse model spontaneously develops mature B cell lymphomas (termed G1XP lymphomas). Here, we attempt to employ whole genome NGS to identify novel structural rearrangements, in particular inter-chromosomal translocations (CTXs), in these G1XP lymphomas. We sequenced six lymphoma samples, aligned our NGS data with mouse reference genome (in C57BL/6J (B6) background) and identified CTXs using CREST algorithm. Surprisingly, we detected widespread CTXs in both lymphomas and wildtype control samples, majority of which were false positive and attributable to different genetic backgrounds. In addition, we validated our NGS pipeline by sequencing multiple control samples from distinct tissues of different genetic backgrounds of mouse (B6 vs non-B6). Lastly, our studies showed that widespread false positive CTXs can be generated by simply aligning sequences from different genetic backgrounds of mouse. We conclude that mapping and alignment with reference genome might not be a preferred method for analyzing whole-genome NGS data obtained from a genetic background different from reference genome. Given the complex genetic background of different mouse strains or the heterogeneity of cancer genomes in human patients, in order to minimize such systematic artifacts and uncover novel CTXs, a preferred method might be de novo assembly of personalized normal control genome and cancer cell genome, instead of mapping and aligning NGS data to mouse or human reference genome. Thus, our studies have critical impact on the manner of data analysis for cancer genomics.

  17. Identification of apple cultivars on the basis of simple sequence repeat markers.

    PubMed

    Liu, G S; Zhang, Y G; Tao, R; Fang, J G; Dai, H Y

    2014-09-12

    DNA markers are useful tools that play an important role in plant cultivar identification. They are usually based on polymerase chain reaction (PCR) and include simple sequence repeats (SSRs), inter-simple sequence repeats, and random amplified polymorphic DNA. However, DNA markers were not used effectively in the complete identification of plant cultivars because of the lack of known DNA fingerprints. Recently, a novel approach called the cultivar identification diagram (CID) strategy was developed to facilitate the use of DNA markers to separate plant individuals. The CID was designed whereby a polymorphic marker was generated from each PCR that directly allowed for cultivar sample separation at each step. Therefore, it could be used to identify cultivars and varieties easily with fewer primers. In this study, 60 apple cultivars, including a few main cultivars in fields and varieties from descendants (Fuji x Telamon) were examined. Of the 20 pairs of SSR primers screened, 8 pairs gave reproducible, polymorphic DNA amplification patterns. The banding patterns obtained from these 8 primers were used to construct a CID map. Each cultivar or variety in this study was distinguished from the others completely, indicating that this method can be used for efficient cultivar identification. The result contributed to studies on germplasm resources and the seedling industry in fruit trees.

  18. Simple Sequence Repeats in Escherichia coli: Abundance, Distribution, Composition, and Polymorphism

    PubMed Central

    Gur-Arie, Riva; Cohen, Cyril J.; Eitan, Yuval; Shelef, Leora; Hallerman, Eric M.; Kashi, Yechezkel

    2000-01-01

    Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AF209020–209030 and AF209508–209518.] PMID:10645951

  19. Simple sequence repeats in Escherichia coli: abundance, distribution, composition, and polymorphism.

    PubMed

    Gur-Arie, R; Cohen, C J; Eitan, Y; Shelef, L; Hallerman, E M; Kashi, Y

    2000-01-01

    Computer-based genome-wide screening of the DNA sequence of Escherichia coli strain K12 revealed tens of thousands of tandem simple sequence repeat (SSR) tracts, with motifs ranging from 1 to 6 nucleotides. SSRs were well distributed throughout the genome. Mononucleotide SSRs were over-represented in noncoding regions and under-represented in open reading frames (ORFs). Nucleotide composition of mono- and dinucleotide SSRs, both in ORFs and in noncoding regions, differed from that of the genomic region in which they occurred, with 93% of all mononucleotide SSRs proving to be of A or T. Computer-based analysis of the fine position of every SSR locus in the noncoding portion of the genome relative to downstream ORFs showed SSRs located in areas that could affect gene regulation. DNA sequences at 14 arbitrarily chosen SSR tracts were compared among E. coli strains. Polymorphisms of SSR copy number were observed at four of seven mononucleotide SSR tracts screened, with all polymorphisms occurring in noncoding regions. SSR polymorphism could prove important as a genome-wide source of variation, both for practical applications (including rapid detection, strain identification, and detection of loci affecting key phenotypes) and for evolutionary adaptation of microbes.

  20. Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

    PubMed

    Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

    2012-08-01

    Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.
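
    The counts quoted above can be cross-checked with a few lines of arithmetic. All numbers are taken from the abstract; only the variable names and the helper `percent` are introduced here for illustration.

```python
# Counts reported in the abstract.
bes_total, est_total = 1164, 342   # putative SSRs from BAC-end sequences and ESTs
bes_poly, est_poly = 673, 145      # polymorphic primer pairs in each class
previous_total = 508               # SSRs reported in all earlier studies combined

def percent(part, whole):
    """Percentage rounded to the nearest whole number, as quoted in the abstract."""
    return round(100 * part / whole)

total_ssrs = bes_total + est_total   # 1,506 putative SSRs assessed
new_poly = bes_poly + est_poly       # 818 novel polymorphic primer pairs
bes_pct = percent(bes_poly, bes_total)
est_pct = percent(est_poly, est_total)
more_than_doubled = previous_total + new_poly > 2 * previous_total
```

    The results agree with the abstract: 58 % and 42 % polymorphic primer pairs for BESs and ESTs respectively, 818 new polymorphic pairs in total, and a combined repertoire more than twice the previous 508.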

  1. Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter.

    PubMed

    Amirhaeri, S; Wohlrab, F; Wells, R D

    1995-02-17

    The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.

  2. Typing of artiodactyl MHC-DRB genes with the help of intronic simple repeated DNA sequences.

    PubMed

    Schwaiger, F W; Buitkamp, J; Weyers, E; Epplen, J T

    1993-02-01

    An efficient oligonucleotide typing method for the highly polymorphic MHC-DRB genes is described for artiodactyls like cattle, sheep and goat. By means of the polymerase chain reaction, the second exon of MHC-DRB is amplified as well as part of the adjacent intron containing a mixed simple repeat sequence. Using this primer combination we were able to amplify the MHC-DRB exons 2 and adjacent introns from all of the investigated 10 species of the family of Bovidae and giraffes. Therefore, the DRB genes of novel artiodactyl species can also be readily studied. Oligonucleotide probes specific for the polymorphisms of ungulate DRB genes are used with which sequences differing in at least one single base can be distinguished. Exonic polymorphism was found to be correlated with the allele lengths and the patterns of the repeat structures. Hence oligonucleotide probes specific for different simple repeats and polymorphic positions serve also for typing across species barriers. The strict correlation of sequence length and exonic polymorphism permits a preselection of specific oligonucleotides for hybridization. Thus more than 20 alleles can already be differentiated from each of the three species.

  3. MSDB: A Comprehensive Database of Simple Sequence Repeats

    PubMed Central

    Avvaru, Akshay Kumar; Saxena, Saketh; Mishra, Rakesh Kumar

    2017-01-01

    Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1–6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. PMID:28854643

  4. Analysis of sequence diversity through internal transcribed spacers and simple sequence repeats to identify Dendrobium species.

    PubMed

    Liu, Y T; Chen, R K; Lin, S J; Chen, Y C; Chin, S W; Chen, F C; Lee, C Y

    2014-04-08

    The Orchidaceae is one of the largest and most diverse families of flowering plants. The Dendrobium genus has high economic potential as ornamental plants and for medicinal purposes. In addition, the species of this genus are able to produce large crops. However, many Dendrobium varieties are very similar in outward appearance, making it difficult to distinguish one species from another. This study demonstrated that the 12 Dendrobium species used in this study may be divided into 2 groups by internal transcribed spacer (ITS) sequence analysis. Red and yellow flowers may also be used to separate these species into 2 main groups. In particular, the deciduous characteristic is associated with the ITS genetic diversity of the A group. Of 53 designed simple sequence repeat (SSR) primer pairs, 7 pairs were polymorphic for polymerase chain reaction products that were amplified from a specific band. The results of this study demonstrate that these 7 SSR primer pairs may potentially be used to identify Dendrobium species and their progeny in future studies.

  5. SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop.

    PubMed

    Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo

    2014-01-01

    Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig's scalability over many computing nodes and illustrate its use with example scripts. Available under the open source MIT license at http://sourceforge.net/projects/seqpig/

  6. MSDB: A Comprehensive Database of Simple Sequence Repeats.

    PubMed

    Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar

    2017-06-01

    Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  8. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    PubMed Central

    Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, a lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551

  9. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.).

    PubMed

    Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A

    2015-10-26

    Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs.  Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.

  10. SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

    PubMed

    Ramu, Chenna

    2003-07-01

    SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.

  11. SSRscanner: a program for reporting distribution and exact location of simple sequence repeats.

    PubMed

    Anwar, Tamanna; Khan, Asad U

    2006-02-20

    Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com.
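    The kind of output described above (motif, repeat count and exact position of each SSR) can be illustrated with a short Python sketch. This is not SSRscanner itself, which is written in PERL and whose algorithm is not given in the abstract; the regular-expression approach, the function name `find_ssrs` and the default thresholds (`max_unit=6`, `min_repeats=5`) are assumptions chosen for illustration, and only perfect tandem repeats are reported.

```python
import re

def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return (start, end, motif, repeat_count) for each perfect tandem repeat.

    Positions are 1-based and inclusive. Thresholds are illustrative
    assumptions, not the settings of any published tool.
    """
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT"),
            # since those tracts are already reported under the shorter motif.
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            hits.append((m.start() + 1, m.end(), motif, len(m.group(0)) // unit))
    return sorted(hits)
```

    Calling `find_ssrs` on a sequence returns one tuple per repeat tract, sorted by start position, so frequency per motif can be obtained by counting tuples with the same motif.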

  12. SSRscanner: a program for reporting distribution and exact location of simple sequence repeats

    PubMed Central

    Anwar, Tamanna; Khan, Asad U

    2006-01-01

    Simple sequence repeats (SSRs) have become important molecular markers for a broad range of applications, such as genome mapping and characterization, phenotype mapping, marker assisted selection of crop plants and a range of molecular ecology and diversity studies. These repeated DNA sequences are found in both prokaryotes and eukaryotes. They are distributed almost at random throughout the genome, ranging from mononucleotide to trinucleotide repeats. They are also found at longer lengths (> 6 repeating units) of tracts. Most of the computer programs that find SSRs do not report their exact positions. A computer program, SSRscanner, was written to report the distribution, frequency and exact location of each SSR in the genome. SSRscanner is user friendly. It can search repeats of any length and produce outputs with their exact position on the chromosome and their frequency of occurrence in the sequence. Availability: This program has been written in PERL and is freely available for non-commercial users by request from the authors. Please contact the authors by E-mail: huzzi99@hotmail.com. PMID:17597863

  13. Simple Elimination of Background Fluorescence in Formalin-Fixed Human Brain Tissue for Immunofluorescence Microscopy.

    PubMed

    Sun, Yulong; Ip, Philbert; Chakrabartty, Avijit

    2017-09-03

    Immunofluorescence is a common method used to visualize subcellular compartments and to determine the localization of specific proteins within a tissue sample. A great hindrance to the acquisition of high quality immunofluorescence images is endogenous autofluorescence of the tissue caused by aging pigments such as lipofuscin or by common sample preparation processes such as aldehyde fixation. This protocol describes how background fluorescence can be greatly reduced through photobleaching using white phosphor light emitting diode (LED) arrays prior to treatment with fluorescent probes. The broad-spectrum emission of white phosphor LEDs allows for bleaching of fluorophores across a range of emission peaks. The photobleaching apparatus can be constructed from off-the-shelf components at very low cost and offers an accessible alternative to commercially available chemical quenchers. A photobleaching pre-treatment of the tissue followed by conventional immunofluorescence staining generates images free of background autofluorescence. Compared to established chemical quenchers which reduced probe as well as background signals, photobleaching treatment had no effect on probe fluorescence intensity while it effectively reduced background and lipofuscin fluorescence. Although photobleaching requires more time for pre-treatment, higher intensity LED arrays may be used to reduce photobleaching time. This simple method can potentially be applied to a variety of tissues, particularly postmitotic tissues that accumulate lipofuscin such as the brain and cardiac or skeletal muscles.

  14. Simple sequence repeat markers that identify Claviceps species and strains.

    PubMed

    Gilmore, Barbara S; Alderman, Stephen C; Knaus, Brian J; Bassil, Nahla V; Martin, Ruth C; Dombrowski, James E; Dung, Jeremiah K S

    2016-01-01

    Claviceps purpurea is a pathogen that infects most members of Pooideae, a subfamily of Poaceae, and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. When the ergot body is accidentally consumed by either man or animal in high enough quantities, there is extreme pain, limb loss and sometimes death. This study was initiated to develop simple sequence repeat (SSR) markers for rapid identification of C. purpurea. SSRs were designed from sequence data stored at the National Center for Biotechnology Information database. The study consisted of 74 ergot isolates from four different host species, Lolium perenne, Poa pratensis, Bromus inermis, and Secale cereale, plus three additional Claviceps species, C. pusilla, C. paspali and C. fusiformis. Samples were collected from six different counties in Oregon and Washington over a 5-year period. Thirty-four SSR markers were selected, which enabled the differentiation of each isolate from one another based solely on their molecular fingerprints. Discriminant analysis of principal components was used to identify four isolate groups, CA Group 1, 2, 3, and 4, for subsequent cluster and molecular variance analyses. CA Group 1, consisting of eight isolates from the host species P. pratensis, was separated on the cluster analysis plot from the remaining three groups and this group was later identified as C. humidiphila. The other three groups were distinct from one another, but closely related. These three groups contained samples from all four of the host species. These SSRs are simple to use, reliable and allowed clear differentiation of C. humidiphila from C. purpurea. Isolates from the three separate species, C. pusilla, C. paspali and C. fusiformis, also amplified with these markers. The SSR markers developed in this study will be helpful in defining the population structure and genetics of Claviceps strains. They will also provide valuable tools for plant breeders needing to identify

  15. Genotyping and Molecular Identification of Date Palm Cultivars Using Inter-Simple Sequence Repeat (ISSR) Markers.

    PubMed

    Ayesh, Basim M

    2017-01-01

    Molecular markers are credible for the discrimination of genotypes and estimation of the extent of genetic diversity and relatedness in a set of genotypes. Inter-simple sequence repeat (ISSR) markers rapidly reveal high polymorphic fingerprints and have been used frequently to determine the genetic diversity among date palm cultivars. This chapter describes the application of ISSR markers for genotyping of date palm cultivars. The application involves extraction of genomic DNA from the target cultivars with reliable quality and quantity. Subsequently the extracted DNA serves as a template for amplification of genomic regions flanked by inverted simple sequence repeats using a single primer. The similarity of each pair of samples is measured by calculating the number of mono- and polymorphic bands revealed by gel electrophoresis. Matrices constructed for similarity and genetic distance are used to build a phylogenetic tree and cluster analysis, to determine the molecular relatedness of cultivars. In the protocol described, 3 out of 9 tested primers consistently amplified 31 loci in 6 date palm cultivars, of which 28 loci were polymorphic.
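    The pairwise similarity step described above can be sketched in Python from band presence/absence scores. The abstract does not say which coefficient the protocol uses, so the Jaccard coefficient here is an assumption, as are the function names and the cultivar labels in the usage note; band profiles are taken to be equal-length lists of 1 (band present) and 0 (band absent).

```python
def jaccard_similarity(a, b):
    """Shared bands divided by bands present in at least one of the two samples."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two profiles with no bands at all are treated as identical.
    return shared / either if either else 1.0

def similarity_matrix(profiles):
    """Map every ordered pair of sample names to its Jaccard similarity."""
    names = sorted(profiles)
    return {(p, q): jaccard_similarity(profiles[p], profiles[q])
            for p in names for q in names}

def distance_matrix(profiles):
    """Genetic distance taken as 1 - similarity (one common, simple choice)."""
    return {pair: 1.0 - s for pair, s in similarity_matrix(profiles).items()}
```

    For example, with hypothetical profiles `{"cv1": [1, 1, 0, 1], "cv2": [1, 0, 0, 1]}` the two cultivars share 2 of the 3 bands present in either, giving a similarity of 2/3. The resulting distance matrix is the usual input to a clustering method such as UPGMA.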

  16. SeqPig: simple and scalable scripting for large sequencing data sets in Hadoop

    PubMed Central

    Schumacher, André; Pireddu, Luca; Niemenmaa, Matti; Kallio, Aleksi; Korpelainen, Eija; Zanetti, Gianluigi; Heljanko, Keijo

    2014-01-01

    Summary: Hadoop MapReduce-based approaches have become increasingly popular due to their scalability in processing large sequencing datasets. However, as these methods typically require in-depth expertise in Hadoop and Java, they are still out of reach of many bioinformaticians. To solve this problem, we have created SeqPig, a library and a collection of tools to manipulate, analyze and query sequencing datasets in a scalable and simple manner. SeqPig scripts use the Hadoop-based distributed scripting engine Apache Pig, which automatically parallelizes and distributes data processing tasks. We demonstrate SeqPig’s scalability over many computing nodes and illustrate its use with example scripts. Availability and Implementation: Available under the open source MIT license at http://sourceforge.net/projects/seqpig/ Contact: andre.schumacher@yahoo.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24149054

  17. Universal sequence map (USM) of arbitrary discrete sequences

    PubMed Central

    2002-01-01

    Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules. PMID:11895567
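    The Chaos Game Representation on which USM builds can be sketched in a few lines of Python. This shows only the classic four-corner CGR iteration for DNA, not the USM generalization to arbitrary alphabets described in the abstract; the corner assignment in `CORNERS`, the starting point and the function name are conventions assumed here for illustration.

```python
# Each nucleotide is assigned one corner of the unit square (assumed convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_coordinates(seq, start=(0.5, 0.5)):
    """Return the CGR point reached after each symbol of a DNA sequence."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the corner of this base.
        x = (x + cx) / 2.0
        y = (y + cy) / 2.0
        points.append((x, y))
    return points
```

    Because each step halves the distance to a corner, two sequences that end in the same k symbols land within 2^-k of each other on each axis regardless of what came before, which is the sense in which distance on the map reflects shared sequence context.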

  18. Development of simple sequence repeat (SSR) markers from a genome survey of Chinese bayberry (Myrica rubra)

    PubMed Central

    2012-01-01

    Background: Chinese bayberry (Myrica rubra Sieb. and Zucc.) is a subtropical evergreen tree originating in China. It has been cultivated in southern China for several thousand years, and annual production has reached 1.1 million tons. The taste and high level of health promoting characters identified in the fruit in recent years has stimulated its extension in China and introduction to Australia. A limited number of co-dominant markers have been developed and applied in genetic diversity and identity studies. Here we report, for the first time, a survey of whole genome shotgun data to develop a large number of simple sequence repeat (SSR) markers to analyse the genetic diversity of the common cultivated Chinese bayberry and the relationship with three other Myrica species. Results: The whole genome shotgun survey of Chinese bayberry produced 9.01 Gb of sequence data, about 26x coverage of the estimated genome size of 323 Mb. The genome sequences were highly heterozygous, but with little duplication. From the initial assembled scaffold covering 255 Mb sequence data, 28,602 SSRs (≥5 repeats) were identified. Dinucleotide was the most common repeat motif with a frequency of 84.73%, followed by 13.78% trinucleotide, 1.34% tetranucleotide, 0.12% pentanucleotide and 0.04% hexanucleotide. From 600 primer pairs, 186 polymorphic SSRs were developed. Of these, 158 were used to screen 29 Chinese bayberry accessions and three other Myrica species: 91.14%, 89.87% and 46.84% SSRs could be used in Myrica adenophora, Myrica nana and Myrica cerifera, respectively. The UPGMA dendrogram tree showed that cultivated Myrica rubra is closely related to Myrica adenophora and Myrica nana, originating in southwest China, and very distantly related to Myrica cerifera, originating in America. These markers can be used in the construction of a linkage map and for genetic diversity studies in Myrica species. Conclusion: Myrica rubra has a small genome of about 323 Mb with a high level of

  19. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis

    PubMed Central

    Zheng, Jin-shuang; Sun, Cheng-zhen; Zhang, Shu-ning; Hou, Xi-lin; Bonnema, Guusje

    2016-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis. PMID:27507974

  20. Cytogenetic Diversity of Simple Sequences Repeats in Morphotypes of Brassica rapa ssp. chinensis.

    PubMed

    Zheng, Jin-Shuang; Sun, Cheng-Zhen; Zhang, Shu-Ning; Hou, Xi-Lin; Bonnema, Guusje

    2016-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is comprised of simple sequence repeats (SSRs). Although these sequences are widely used for studying genetic variation, linkage mapping and evolution, little attention has been paid to the chromosomal distribution and cytogenetic diversity of these sequences. In this paper, we report the distribution characterization of mono-, di-, and tri-nucleotide SSRs in Brassica rapa ssp. chinensis. Fluorescence in situ hybridization was used to characterize the cytogenetic diversity of SSRs among morphotypes of B. rapa ssp. chinensis. The proportion of different SSR motifs varied among morphotypes of B. rapa ssp. chinensis, with tri-nucleotide SSRs being more prevalent in the genome of B. rapa ssp. chinensis. We determined the chromosomal locations of mono-, di-, and tri-nucleotide repeat loci. The results showed that the chromosomal distribution of SSRs in the different morphotypes is non-random and motif-dependent, and allowed us to characterize the relative variability in terms of SSR numbers and similar chromosomal distributions in centromeric/peri-centromeric heterochromatin. The differences between SSR repeats with respect to abundance and distribution indicate that SSRs are a driving force in the genomic evolution of B. rapa species. Our results provide a comprehensive view of the SSR sequence distribution and evolution for comparison among morphotypes of B. rapa ssp. chinensis.

  1. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

    PubMed

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

    2016-05-23

    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.

  2. Kangaroo – A pattern-matching program for biological sequences

    PubMed Central

    2002-01-01

    Background: Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results: Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion: A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718

  3. Molecular Identification of Sex in Phoenix dactylifera Using Inter Simple Sequence Repeat Markers.

    PubMed

    Al-Ameri, Abdulhafed A; Al-Qurainy, Fahad; Gaafar, Abdel-Rhman Z; Khan, Salim; Nadeem, M

    2016-01-01

    Early sex identification of Date Palm (Phoenix dactylifera L.) at seedling stage is an economically desirable objective, which will significantly increase the profits of seed based cultivation. The utilization of molecular markers at this stage for early and rapid identification of sex is important due to the lack of morphological markers. In this study, a total of two hundred Inter Simple Sequence Repeat (ISSR) primers were screened among male and female Date palm plants to identify putative sex-specific markers, out of which only two primers (IS_A02 and IS_A71) were found to be associated with sex. The primer IS_A02 produced a unique band of size 390 bp that was clearly present in all female plants and absent in all male plants. In contrast, the primer IS_A71 produced a unique band of size 380 bp that was clearly present in all male plants and absent in all female plants. Subsequently, these specific fragments were excised, purified, and sequenced so that sequence-specific markers can be developed in future for sex determination in dioecious Date Palm. These markers are efficient, highly reliable, and reproducible for sex identification at the early stage of seedling.

  4. Determining Phylogenetic Relationships Among Date Palm Cultivars Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeat (ISSR) Markers.

    PubMed

    Haider, Nadia

    2017-01-01

    Investigation of genetic variation and phylogenetic relationships among date palm (Phoenix dactylifera L.) cultivars is useful for their conservation and genetic improvement. Various molecular markers such as restriction fragment length polymorphisms (RFLPs), simple sequence repeat (SSR), representational difference analysis (RDA), and amplified fragment length polymorphism (AFLP) have been developed to molecularly characterize date palm cultivars. PCR-based markers random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) are powerful tools to determine the relatedness of date palm cultivars that are difficult to distinguish morphologically. In this chapter, the principles, materials, and methods of RAPD and ISSR techniques are presented. Analysis of data generated from these two techniques and the use of these data to reveal phylogenetic relationships among date palm cultivars are also discussed.

  5. FOUNTAIN: A JAVA open-source package to assist large sequencing projects

    PubMed Central

    Buerstedde, Jean-Marie; Prill, Florian

    2001-01-01

    Background: Better automation, lower cost per reaction and a heightened interest in comparative genomics has led to a dramatic increase in DNA sequencing activities. Although the large sequencing projects of specialized centers are supported by in-house bioinformatics groups, many smaller laboratories face difficulties managing the appropriate processing and storage of their sequencing output. The challenges include documentation of clones, templates and sequencing reactions, and the storage, annotation and analysis of the large number of generated sequences. Results: We describe here a new program, named FOUNTAIN, for the management of large sequencing projects. FOUNTAIN uses the JAVA computer language and data storage in a relational database. Starting with a collection of sequencing objects (clones), the program generates and stores information related to the different stages of the sequencing project using a web browser interface for user input. The generated sequences are subsequently imported and annotated based on BLAST searches against the public databases. In addition, simple algorithms to cluster sequences and determine putative polymorphic positions are implemented. Conclusions: A simple, but flexible and scalable software package is presented to facilitate data generation and storage for large sequencing projects. Open source and largely platform and database independent, we wish FOUNTAIN to be improved and extended in a community effort. PMID:11591214

  6. The Flushtration Count Illusion: Attribute substitution tricks our interpretation of a simple visual event sequence.

    PubMed

    Thomas, Cyril; Didierjean, André; Kuhn, Gustav

    2018-04-17

    When faced with a difficult question, people sometimes work out an answer to a related, easier question without realizing that a substitution has taken place (e.g., Kahneman, 2011, Thinking, fast and slow. New York, Farrar, Straus, Giroux). In two experiments, we investigated whether this attribute substitution effect can also affect the interpretation of a simple visual event sequence. We used a magic trick called the 'Flushtration Count Illusion', which involves a technique used by magicians to give the illusion of having seen multiple cards with identical backs, when in fact only the back of one card (the bottom card) is repeatedly shown. In Experiment 1, we demonstrated that most participants are susceptible to the illusion, even if they have the visual and analytical reasoning capacity to correctly process the sequence. In Experiment 2, we demonstrated that participants construct a biased and simplified representation of the Flushtration Count by substituting some attributes of the event sequence. We discuss the psychological processes underlying this attribute substitution effect. © 2018 The British Psychological Society.

  7. Simple sequence repeat markers useful for sorghum downy mildew (Peronosclerospora sorghi) and related species

    PubMed Central

    Perumal, Ramasamy; Nimmakayala, Padmavathi; Erattaimuthu, Saradha R; No, Eun-Gyu; Reddy, Umesh K; Prom, Louis K; Odvody, Gary N; Luster, Douglas G; Magill, Clint W

    2008-01-01

    Background: A recent outbreak of sorghum downy mildew in Texas has led to the discovery of both metalaxyl resistance and a new pathotype in the causal organism, Peronosclerospora sorghi. These observations and the difficulty in resolving among phylogenetically related downy mildew pathogens dramatically point out the need for simply scored markers in order to differentiate among isolates and species, and to study the population structure within these obligate oomycetes. Here we present the initial results from the use of a biotin capture method to discover, clone and develop PCR primers that permit the use of simple sequence repeats (microsatellites) to detect differences at the DNA level. Results: Among the 55 primer pairs designed from clones from pathotype 3 of P. sorghi, 36 flanked microsatellite loci containing simple repeats, including 28 (55%) with dinucleotide repeats and 6 (11%) with trinucleotide repeats. A total of 22 microsatellites with CA/AC or GT/TG repeats were the most abundant (40%) and GA/AG or CT/TC types contribute 15% in our collection. When used to amplify DNA from 19 isolates from P. sorghi, as well as from 5 related species that cause downy mildew on other hosts, the number of different bands detected for each SSR primer pair using a LI-COR DNA Analyzer ranged from two to eight. Successful cross-amplification for 12 primer pairs studied in detail using DNA from downy mildews that attack maize (P. maydis & P. philippinensis), sugar cane (P. sacchari), pearl millet (Sclerospora graminicola) and rose (Peronospora sparsa) indicates that the flanking regions are conserved in all these species. A total of 15 SSR amplicons unique to P. philippinensis (one of the potential threats to US maize production) were detected, and these have potential for development of diagnostic tests. A total of 260 alleles were obtained using 54 microsatellite primer combinations, with an average of 4.8 polymorphic markers per SSR across 34 Peronosclerospora

  8. GASP: Gapped Ancestral Sequence Prediction for proteins

    PubMed Central

    Edwards, Richard J; Shields, Denis C

    2004-01-01

    Background The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments. Results Here we present a new algorithm, GASP (Gapped Ancestral Sequence Prediction), for predicting ancestral sequences from phylogenetic trees and the corresponding multiple sequence alignments. Alignments may be of any size and contain gaps. GASP first assigns the positions of gaps in the phylogeny before using a likelihood-based approach centred on amino acid substitution matrices to assign ancestral amino acids. Important outgroup information is used by first working down from the tips of the tree to the root, using descendant data only to assign probabilities, and then working back up from the root to the tips using descendant and outgroup data to make predictions. GASP was tested on a number of simulated datasets based on real phylogenies. Prediction accuracy for ungapped data was similar to three alternative algorithms tested, with GASP performing better in some cases and worse in others. Adding simple insertions and deletions to the simulated data did not have a detrimental effect on GASP accuracy. Conclusions GASP (Gapped Ancestral Sequence Prediction) will predict ancestral sequences from multiple protein alignments of any size. Although not as accurate in all cases as some of the more sophisticated maximum likelihood approaches, it can process a wide range of input phylogenies and will predict ancestral sequences for gapped and ungapped residues alike. PMID:15350199

  9. M13-Tailed Simple Sequence Repeat (SSR) Markers in Studies of Genetic Diversity and Population Structure of Common Oat Germplasm.

    PubMed

    Onyśk, Agnieszka; Boczkowska, Maja

    2017-01-01

    Simple Sequence Repeat (SSR) markers are one of the most frequently used molecular markers in studies of crop diversity and population structure. This is due to their uniform distribution in the genome, the high polymorphism, reproducibility, and codominant character. Additional advantages are the possibility of automatic analysis and simple interpretation of the results. The M13 tagged PCR reaction significantly reduces the costs of analysis by the automatic genetic analyzers. Here, we also disclose a short protocol of SSR data analysis.

  10. Not all transmembrane helices are born equal: Towards the extension of the sequence homology concept to membrane proteins

    PubMed Central

    2011-01-01

    Background Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as a membrane-anchoring device. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
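
    The abstract above does not state the actual criterion, so the following is only a rough sketch of the two quantities it names (hydrophobicity and sequence complexity). It assumes the Kyte-Doolittle hydropathy scale and Shannon entropy as stand-ins; the function names and the two example segments are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (an assumed scale; the paper may use another).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(segment):
    """Average hydropathy over all residues of a TM segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def shannon_entropy(segment):
    """Residue-composition entropy in bits; low values mean low complexity."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical segments: a poly-leucine anchor and a compositionally mixed helix.
simple_like = "LLLLLLLLLLLLLLLLLLLL"
complex_like = "LGSTAWFYNVPCHIMQGTSA"

for name, seg in (("simple-like", simple_like), ("complex-like", complex_like)):
    print(name, round(mean_hydrophobicity(seg), 2), round(shannon_entropy(seg), 2))
```

    Under these assumptions a "simple" TM shows up as high mean hydropathy combined with low entropy; any cutoff separating the two classes would have to come from the paper itself.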

  11. Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL): adapting the Partial Phylogenetic Profiling algorithm to scan sequences for signatures that predict protein function

    PubMed Central

    2010-01-01

    Background Comparative genomics methods such as phylogenetic profiling can mine powerful inferences from inherently noisy biological data sets. We introduce Sites Inferred by Metabolic Background Assertion Labeling (SIMBAL), a method that applies the Partial Phylogenetic Profiling (PPP) approach locally within a protein sequence to discover short sequence signatures associated with functional sites. The approach is based on the basic scoring mechanism employed by PPP, namely the use of binomial distribution statistics to optimize sequence similarity cutoffs during searches of partitioned training sets. Results Here we illustrate and validate the ability of the SIMBAL method to find functionally relevant short sequence signatures by application to two well-characterized protein families. In the first example, we partitioned a family of ABC permeases using a metabolic background property (urea utilization). Thus, the TRUE set for this family comprised members whose genome of origin encoded a urea utilization system. By moving a sliding window across the sequence of a permease, and searching each subsequence in turn against the full set of partitioned proteins, the method found which local sequence signatures best correlated with the urea utilization trait. Mapping of SIMBAL "hot spots" onto crystal structures of homologous permeases reveals that the significant sites are gating determinants on the cytosolic face rather than, say, docking sites for the substrate-binding protein on the extracellular face. In the second example, we partitioned a protein methyltransferase family using gene proximity as a criterion. In this case, the TRUE set comprised those methyltransferases encoded near the gene for the substrate RF-1. SIMBAL identifies sequence regions that map onto the substrate-binding interface while ignoring regions involved in the methyltransferase reaction mechanism in general. Neither method for training set construction requires any prior experimental

  12. Development of simple sequence repeat markers and diversity analysis in alfalfa (Medicago sativa L.).

    PubMed

    Wang, Zan; Yan, Hongwei; Fu, Xinnian; Li, Xuehui; Gao, Hongwen

    2013-04-01

    Efficient and robust molecular markers are essential for molecular breeding in plants. Compared to dominant and bi-allelic markers, multiple alleles of simple sequence repeat (SSR) markers are particularly informative and superior in genetic linkage map and QTL mapping in autotetraploid species like alfalfa. The objective of this study was to enrich SSR markers directly from alfalfa expressed sequence tags (ESTs). A total of 12,371 alfalfa ESTs were retrieved from the National Center for Biotechnology Information. A total of 774 SSRs were identified from 716 ESTs. On average, one SSR was found per 7.7 kb of EST sequences. Tri-nucleotide repeats (48.8 %) were the most abundant motif type, followed by di- (26.1 %), tetra- (11.5 %), penta- (9.7 %), and hexanucleotide (3.9 %) repeats. One hundred EST-SSR primer pairs were successfully designed and 29 exhibited polymorphism among 28 alfalfa accessions. The allele number per marker ranged from two to 21 with an average of 6.8. The PIC values ranged from 0.195 to 0.896 with an average of 0.608, indicating a high level of polymorphism of the EST-SSR markers. Based on the 29 EST-SSR markers, an assessment of genetic diversity was conducted and found that Medicago sativa ssp. sativa was clearly different from the other subspecies. High transferability of those EST-SSR markers to related species was also found.
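
    The abstract reports PIC values without giving the formula. A minimal sketch, assuming the commonly used definition of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i at a locus. The function name and the example frequencies are illustrative only; estimating allele frequencies in an autotetraploid is a separate problem not addressed here.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus (Botstein-style formula)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings that are uninformative despite heterozygosity.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical loci: two equally frequent alleles, then four.
print(pic([0.5, 0.5]))
print(pic([0.25, 0.25, 0.25, 0.25]))
```

    With this definition PIC rises with the number of alleles and with the evenness of their frequencies, which is consistent with multi-allelic SSRs scoring higher than bi-allelic markers.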

  13. Plant genotyping using fluorescently tagged inter-simple sequence repeats (ISSRs): basic principles and methodology.

    PubMed

    Prince, Linda M

    2015-01-01

    Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.

  14. A resource of large-scale molecular markers for monitoring Agropyron cristatum chromatin introgression in wheat background based on transcriptome sequences.

    PubMed

    Zhang, Jinpeng; Liu, Weihua; Lu, Yuqing; Liu, Qunxing; Yang, Xinming; Li, Xiuquan; Li, Lihui

    2017-09-20

    Agropyron cristatum is a wild grass of the tribe Triticeae and serves as a gene donor for wheat improvement. However, very few markers can be used to monitor A. cristatum chromatin introgressions in wheat. Here, we report a resource of large-scale molecular markers for tracking alien introgressions in wheat based on transcriptome sequences. By aligning A. cristatum unigenes with the Chinese Spring reference genome sequences, we designed 9602 A. cristatum expressed sequence tag-sequence-tagged site (EST-STS) markers for PCR amplification and experimental screening. As a result, 6063 polymorphic EST-STS markers were specific for the A. cristatum P genome in the single-recipient wheat background. A total of 4956 randomly selected polymorphic EST-STS markers were further tested in eight wheat variety backgrounds, and 3070 markers displaying stable and polymorphic amplification were validated. These markers covered more than 98% of the A. cristatum genome, and the marker distribution density was approximately 1.28 cM. An application case of all EST-STS markers was validated on the A. cristatum 6P chromosome. These markers were successfully applied in the tracking of alien A. cristatum chromatin. Altogether, this study provided a universal method of large-scale molecular marker development to monitor wild relative chromatin in wheat.

  15. Cultivar identification, pedigree verification, and diversity analysis among Peach (Prunus persica L. Batsch) Cultivars based on Simple Sequence Repeat markers

    USDA-ARS?s Scientific Manuscript database

    The genetic relationships and pedigree inferences among peach (Prunus persica (L.) Batsch) accessions and breeding lines used in genetic improvement were evaluated using 15 simple sequence repeat (SSR) markers. A total of 80 alleles were detected among the 37 peach accessions with an average of 5.53...

  16. Robust Data Detection for the Photon-Counting Free-Space Optical System With Implicit CSI Acquisition and Background Radiation Compensation

    NASA Astrophysics Data System (ADS)

    Song, Tianyu; Kam, Pooi-Yuen

    2016-02-01

    Since atmospheric turbulence and pointing errors cause signal intensity fluctuations and the background radiation surrounding the free-space optical (FSO) receiver contributes an undesired noisy component, the receiver requires accurate channel state information (CSI) and background information to adjust the detection threshold. In most previous studies, pilot symbols were employed for CSI acquisition, which reduces spectral and energy efficiency, and the impractical assumption was made that the background radiation component is perfectly known. In this paper, we develop an efficient and robust sequence receiver, which acquires the CSI and the background information implicitly and requires no knowledge about the channel model information. It is robust since it can automatically estimate the CSI and background component and detect the data sequence accordingly. Its decision metric has a simple form and involves no integrals, and thus can be easily evaluated. A Viterbi-type trellis-search algorithm is adopted to improve the search efficiency, and a selective-store strategy is adopted to overcome a potential error floor problem as well as to increase the memory efficiency. To further simplify the receiver, a decision-feedback symbol-by-symbol receiver is proposed as an approximation of the sequence receiver. By simulations and theoretical analysis, we show that the performance of both the sequence receiver and the symbol-by-symbol receiver approaches that of detection with perfect knowledge of the CSI and background radiation as the length of the window for forming the decision metric increases.

  17. EASY: a simple tool for simultaneously removing background, deadtime and acoustic ringing in quantitative NMR spectroscopy--part I: basic principle and applications.

    PubMed

    Jaeger, Christian; Hemmann, Felix

    2014-01-01

    Elimination of Artifacts in NMR SpectroscopY (EASY) is a simple but very effective tool to remove simultaneously any real NMR probe background signal, any spectral distortions due to deadtime ringdown effects and, specifically, severe acoustic ringing artifacts in NMR spectra of low-gamma nuclei. EASY enables and maintains quantitative NMR (qNMR) as only a single pulse (preferably 90°) is used for data acquisition. After the acquisition of the first scan (it contains the wanted NMR signal and the background/deadtime/ringing artifacts) the same experiment is repeated immediately afterwards before the T1 waiting delay. This second scan contains only the background/deadtime/ringing parts. Hence, the simple difference of both yields clean NMR line shapes free of artifacts. In this Part I various examples for complete ¹H, ¹¹B, ¹³C, ¹⁹F probe background removal due to construction parts of the NMR probes are presented. Furthermore, ²⁵Mg EASY of Mg(OH)₂ is presented and this example shows how extremely strong acoustic ringing can be suppressed (more than a factor of 200) such that phase and baseline correction for spectra acquired with a single pulse is no longer a problem. EASY is also a step towards deadtime-free data acquisition as these effects are also canceled completely. EASY can be combined with any other NMR experiment, including 2D NMR, if baseline distortions are a big problem. © 2013 Published by Elsevier Inc.

  18. Characterization of expressed sequence tag-derived simple sequence repeat markers for Aspergillus flavus: emphasis on variability of isolates from the southern United States.

    PubMed

    Wang, Xinwang; Wadl, Phillip A; Wood-Jones, Alicia; Windham, Gary; Trigiano, Robert N; Scruggs, Mary; Pilgrim, Candace; Baird, Richard

    2012-12-01

    Simple sequence repeat (SSR) markers were developed from an Aspergillus flavus expressed sequence tag (EST) database to conduct an analysis of genetic relationships of Aspergillus isolates from numerous host species and geographical regions, but primarily from the United States. Twenty-nine primers were designed from 362 tri-nucleotide EST-SSR sequences. Eighteen polymorphic loci were used to genotype 96 Aspergillus species isolates. The number of alleles detected per locus ranged from 2 to 24 with a mean of 8.2 alleles. Haploid diversity ranged from 0.28 to 0.91. A genetic distance matrix was used to perform principal coordinates analysis (PCA) and to generate dendrograms using the unweighted pair group method with arithmetic mean (UPGMA). Two principal coordinates explained more than 75 % of the total variation among the isolates. One clade was identified for A. flavus isolates (n = 87) with the other Aspergillus species (n = 7) using PCA, but five distinct clusters were present when the other taxa were excluded from the analysis. Six groups were noted when the EST-SSR data were compared using UPGMA. However, the latter PCA or UPGMA comparison resulted in no direct associations with host species, geographical region or aflatoxin production. Furthermore, there was no direct correlation to visible morphological features such as sclerotial types. The isolates from the Mississippi Delta region, which contained the largest percentage of isolates, did not show any unusual clustering except for isolates K32, K55, and 199. Further studies of these three isolates are warranted to evaluate their pathogenicity, aflatoxin production potential, additional gene sequences (e.g., RPB2), and morphological comparisons.

  19. Scheduling observational and physical practice: influence on the coding of simple motor sequences.

    PubMed

    Ellenbuerger, Thomas; Boutin, Arnaud; Blandin, Yannick; Shea, Charles H; Panzer, Stefan

    2012-01-01

    The main purpose of the present experiment was to determine the coordinate system used in the development of movement codes when observational and physical practice are scheduled across practice sessions. The task was to reproduce a 1,300-ms spatial-temporal pattern of elbow flexions and extensions. An intermanual transfer paradigm with a retention test and two effector (contralateral limb) transfer tests was used. The mirror effector transfer test required the same pattern of homologous muscle activation and sequence of limb joint angles as that performed or observed during practice, and the non-mirror effector transfer test required the same spatial pattern movements as that performed or observed. The test results following the first acquisition session replicated the findings of Gruetzmacher, Panzer, Blandin, and Shea (2011). The results following the second acquisition session indicated a strong advantage for participants who received physical practice in both practice sessions or received observational practice followed by physical practice. This advantage was found on both the retention and the mirror transfer tests compared to the non-mirror transfer test. These results demonstrate that codes based in motor coordinates can be developed relatively quickly and effectively for a simple spatial-temporal movement sequence when participants are provided with physical practice or observation followed by physical practice, but physical practice followed by observational practice or observational practice alone limits the development of codes based in motor coordinates.

  20. Physical organisation of simple sequence repeats (SSRs) in Triticeae: structural, functional and evolutionary implications.

    PubMed

    Cuadrado, A; Cardoso, M; Jouve, N

    2008-01-01

    A significant fraction of the nuclear DNA of all eukaryotes is occupied by simple sequence repeats (SSRs) or microsatellites. This type of sequence has sparked great interest as a means of studying genetic variation, linkage mapping, gene tagging and evolution. Although SSRs at different positions in a gene help determine the regulation of expression and the function of the protein produced, little attention has been paid to the chromosomal organisation and distribution of these sequences, even in model species. This review discusses the main achievements in the characterisation of long-range SSR organisation in the chromosomes of Triticum aestivum L., Secale cereale L., and Hordeum vulgare L. (all members of Triticeae). We have detected SSRs using an improved FISH technique based on the random primer labelling of synthetic oligonucleotides (15-24 bases) in multi-colour experiments. Detailed information on the presence and distribution of AC, AG and all the possible classes of trinucleotide repeats has been acquired. These data have revealed the motif-dependent and non-random chromosome distributions of SSRs in the different genomes, and allowed the correlation of particular SSRs with chromosome areas characterised by specific features (e.g., heterochromatin, euchromatin and centromeres) in all three species. The present review provides a detailed comparative study of the distribution of these SSRs in each of the seven chromosomes of the genomes A, B and D of wheat, H of barley and R of rye. The importance of SSRs in plant breeding and their possible role in chromosome structure, function and evolution is discussed. 2008 S. Karger AG, Basel

  1. Simple-MSSM: a simple and efficient method for simultaneous multi-site saturation mutagenesis.

    PubMed

    Cheng, Feng; Xu, Jian-Miao; Xiang, Chao; Liu, Zhi-Qiang; Zhao, Li-Qing; Zheng, Yu-Guo

    2017-04-01

    To develop a practically simple and robust multi-site saturation mutagenesis (MSSM) method that enables simultaneous recombination of amino acid positions for focused mutant library generation. A general restriction enzyme-free and ligase-free MSSM method (Simple-MSSM) based on prolonged overlap extension PCR (POE-PCR) and Simple Cloning techniques. As a proof of principle of Simple-MSSM, the gene of eGFP (enhanced green fluorescent protein) was used as a template gene for simultaneous mutagenesis of five codons. Forty-eight randomly selected clones were sequenced. Sequencing revealed that all the 48 clones showed at least one mutant codon (mutation efficiency = 100%), and 46 out of the 48 clones had mutations at all the five codons. The obtained diversities at these five codons are 27, 24, 26, 26 and 22, respectively, which correspond to 84, 75, 81, 81 and 69% of the theoretical diversity offered by NNK-degeneration (32 codons; NNK, K = T or G). The enzyme-free Simple-MSSM method can simultaneously and efficiently saturate five codons within one day, and therefore avoid missing interactions between residues in interacting amino acid networks.
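
    The percentages quoted above follow directly from the size of the NNK codon set. The short sketch below enumerates that set (N = A, C, G or T; K = G or T) and recomputes the coverage from the observed codon counts given in the abstract; the variable names are illustrative.

```python
from itertools import product

# NNK degeneracy: any base at positions 1 and 2, G or T at position 3.
NNK_CODONS = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

# Distinct codons observed at each of the five sites (from the abstract).
observed = [27, 24, 26, 26, 22]

# Coverage of the theoretical diversity, rounded to whole percent.
coverage = [round(100 * n / len(NNK_CODONS)) for n in observed]

print(len(NNK_CODONS))
print(coverage)
```

    The enumeration gives 4 x 4 x 2 = 32 codons, and the rounded coverages reproduce the reported 84, 75, 81, 81 and 69%.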

  2. Simple automatic strategy for background drift correction in chromatographic data analysis.

    PubMed

    Fu, Hai-Yan; Li, He-Dong; Yu, Yong-Jie; Wang, Bing; Lu, Peng; Cui, Hua-Peng; Liu, Ping-Ping; She, Yuan-Bin

    2016-06-03

    Chromatographic background drift correction, which influences peak detection and time shift alignment results, is a critical stage in chromatographic data analysis. In this study, an automatic background drift correction methodology was developed. Local minimum values in a chromatogram were initially detected and organized as a new baseline vector. Iterative optimization was then employed to recognize outliers, which belong to the chromatographic peaks, in this vector, and update the outliers in the baseline until convergence. The optimized baseline vector was finally expanded into the original chromatogram, and linear interpolation was employed to estimate background drift in the chromatogram. The principle underlying the proposed method was confirmed using a complex gas chromatographic dataset. Finally, the proposed approach was applied to eliminate background drift in liquid chromatography quadrupole time-of-flight samples used in the metabolic study of Escherichia coli samples. The proposed method was comparable with three classical techniques: morphological weighted penalized least squares, moving window minimum value strategy and background drift correction by orthogonal subspace projection. The proposed method allows almost automatic implementation of background drift correction, which is convenient for practical use. Copyright © 2016 Elsevier B.V. All rights reserved.
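
    The three stages described above (local minima as baseline knots, iterative replacement of knots that belong to peaks, linear interpolation back onto the full chromatogram) can be sketched as follows. This is not the authors' implementation: the abstract does not specify the outlier rule, so a fixed tolerance `tol` on the deviation from the line through the neighbouring knots is assumed here, and all function names are hypothetical. The input needs at least two points.

```python
def local_minima(y):
    """Indices of local minima, always including both endpoints."""
    idx = [0]
    for i in range(1, len(y) - 1):
        if y[i] <= y[i - 1] and y[i] <= y[i + 1]:
            idx.append(i)
    idx.append(len(y) - 1)
    return idx


def interpolate(xs, ys, n):
    """Piecewise-linear interpolation of knots (xs, ys) onto points 0..n-1."""
    out, k = [], 0
    for i in range(n):
        while k < len(xs) - 2 and i > xs[k + 1]:
            k += 1
        t = (i - xs[k]) / (xs[k + 1] - xs[k])
        out.append(ys[k] + t * (ys[k + 1] - ys[k]))
    return out


def estimate_baseline(y, tol=1.0, max_iter=100):
    """Estimate background drift from local minima (assumed outlier rule)."""
    xs = local_minima(y)
    ys = [y[i] for i in xs]
    for _ in range(max_iter):
        changed = False
        for k in range(1, len(xs) - 1):
            t = (xs[k] - xs[k - 1]) / (xs[k + 1] - xs[k - 1])
            expected = ys[k - 1] + t * (ys[k + 1] - ys[k - 1])
            # A knot far above the line through its neighbours is treated as
            # part of a peak (e.g. a valley between overlapping peaks).
            if ys[k] - expected > tol:
                ys[k] = expected
                changed = True
        if not changed:  # converged: no knot was updated in this pass
            break
    return interpolate(xs, ys, len(y))


# Hypothetical signal: linear drift of 0.1 per point plus one peak near index 10.
signal = [0.1 * i for i in range(21)]
signal[9] += 2.0
signal[10] += 5.0
signal[11] += 2.0
baseline = estimate_baseline(signal)
corrected = [s - b for s, b in zip(signal, baseline)]
```

    In this toy case the recovered baseline is the drift line itself, so `corrected` is close to zero everywhere except at the peak. Real chromatograms are noisy, and a data-driven threshold would likely be needed in place of the fixed `tol`.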

  3. An integrated genetic linkage map of watermelon and genetic diversity based on single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers

    USDA-ARS?s Scientific Manuscript database

    Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...

  4. Conflict Background Triggered Congruency Sequence Effects in Graphic Judgment Task

    PubMed Central

    Zhao, Liang; Wang, Yonghui

    2013-01-01

    Congruency sequence effects refer to the reduction of congruency effects following an incongruent trial compared with following a congruent trial. The conflict monitoring account, one of the most influential explanations of this effect, assumes that the sequential modulations are evoked by response conflict. The present study aimed at exploring the congruency sequence effects in the absence of response conflict. We found that congruency sequence effects occurred in a graphic judgment task in which the conflict stimuli acted as irrelevant information. The findings reveal that processing task-irrelevant conflict stimulus features could also induce sequential modulations of interference. The results do not support the conflict monitoring interpretation and favor a feature integration account in which the congruency sequence effects are attributed to the repetitions of stimulus and response features. PMID:23372766

  5. Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations.

    PubMed

    Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri

    2013-02-19

    Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput, and due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is only beginning to be used in clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, mutations in regions of a gene typically not tested, such as deep intronic and promoter regions, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting using standard PCR based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data, identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.

  6. BioWord: A sequence manipulation suite for Microsoft Word

    PubMed Central

    2012-01-01

    Background The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. Results BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. Conclusions BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms. PMID:22676326

  7. Genetic diversity studies in pea (Pisum sativum L.) using simple sequence repeat markers.

    PubMed

    Kumari, P; Basal, N; Singh, A K; Rai, V P; Srivastava, C P; Singh, P K

    2013-03-13

    The genetic diversity among 28 pea (Pisum sativum L.) genotypes was analyzed using 32 simple sequence repeat markers. A total of 44 polymorphic bands, with an average of 2.1 bands per primer, were obtained. The polymorphism information content ranged from 0.309 to 0.657 with an average of 0.493. The variation in genetic diversity among these cultivars ranged from 0.11 to 0.73. Cluster analysis based on Jaccard's similarity coefficient using the unweighted pair-group method with arithmetic mean (UPGMA) revealed 2 distinct clusters, I and II, comprising 6 and 22 genotypes, respectively. Cluster II was further differentiated into 2 subclusters, IIA and IIB, with 12 and 10 genotypes, respectively. Principal component (PC) analysis revealed results similar to those of UPGMA. The first, second, and third PCs contributed 21.6, 16.1, and 14.0% of the variation, respectively; cumulative variation of the first 3 PCs was 51.7%.
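
    Jaccard's similarity coefficient, on which the cluster analysis above is based, can be illustrated with band presence/absence profiles. The sketch below assumes each genotype is scored as a list of 0/1 values, one per band; the function name and the example profiles are hypothetical, and the UPGMA clustering step itself is not shown.

```python
def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of bands")
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        raise ValueError("similarity is undefined when neither profile has a band")
    return shared / either


# Hypothetical band profiles for two genotypes scored at four band positions.
genotype_1 = [1, 1, 0, 0]
genotype_2 = [1, 0, 1, 0]
print(jaccard(genotype_1, genotype_2))
```

    Bands absent from both genotypes are ignored, which is the usual reason for preferring Jaccard over simple matching when shared absences carry little information. A distance for clustering can then be taken as 1 minus the similarity.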

  8. De novo transcriptome sequencing reveals a considerable bias in the incidence of simple sequence repeats towards the downstream of 'Pre-miRNAs' of black pepper.

    PubMed

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next generation sequencing offers an advantage for species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 223,386 contigs and 128,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of '43 pre-miRNA candidates bearing different types of SSR motifs'. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted 'pre-miRNA candidates bearing SSRs'. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted 'pre-miRNA candidates'. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of 'tandem repeats' in miRNAs.

  9. Genome Survey Sequencing for the Characterization of the Genetic Background of Rosa roxburghii Tratt and Leaf Ascorbate Metabolism Genes.

    PubMed

    Lu, Min; An, Huaming; Li, Liangliang

    2016-01-01

    Rosa roxburghii Tratt is an important commercial horticultural crop in China that is recognized for its nutritional and medicinal values. In spite of its economic significance, genomic information on this rose species is currently unavailable. In the present research, a genome survey of R. roxburghii was carried out using next-generation sequencing (NGS) technologies. A total of 30.29 Gb of sequence data was obtained by HiSeq 2500 sequencing, and the estimated genome size of R. roxburghii was 480.97 Mb, with a guanine plus cytosine (GC) content of 38.63%. All of these reads were assembled, yielding a total of 627,554 contigs with an N50 length of 1.484 kb and, furthermore, 335,902 scaffolds with a total length of 409.36 Mb. Transposable element (TE) sequences totalling 90.84 Mb, comprising 29.20% of the genome, and 167,859 simple sequence repeats (SSRs) were identified from the scaffolds. Among these, the mono- (66.30%), di- (25.67%), and tri- (6.64%) nucleotide repeats contributed nearly 99% of the SSRs, and the sequence motifs AG/CT (28.81%) and GAA/TTC (14.76%) were the most abundant among the dinucleotide and trinucleotide repeat motifs, respectively. Genome analysis predicted a total of 22,721 genes with an average length of 2311.52 bp, an average exon length of 228.15 bp, and an average intron length of 401.18 bp. Eleven genes putatively involved in ascorbate metabolism were identified, and their expression in R. roxburghii leaves was validated by quantitative real-time PCR (qRT-PCR). This is the first report of genome-wide characterization of this rose species.
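
The N50 figure quoted in the abstract above is a standard assembly statistic: the contig length at which contigs of that length or longer cover at least half of the total assembled bases. A minimal sketch of the calculation follows; the function name and the example lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (not from the R. roxburghii assembly).
example_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_contigs))
```

Sorting once and accumulating from the longest contig down keeps the calculation linear after the sort, which is adequate even for several hundred thousand contigs.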

  10. Changes in spinal reflex excitability associated with motor sequence learning.

    PubMed

    Lungu, Ovidiu; Frigon, Alain; Piché, Mathieu; Rainville, Pierre; Rossignol, Serge; Doyon, Julien

    2010-05-01

    There is ample evidence that motor sequence learning is mediated by changes in brain activity. Yet the question of whether this form of learning elicits changes detectable at the spinal cord level has not been addressed. To date, studies in humans have revealed that spinal reflex activity may be altered during the acquisition of various motor skills, but a link between motor sequence learning and changes in spinal excitability has not been demonstrated. To address this issue, we studied the modulation of H-reflex amplitude evoked in the flexor carpi radialis muscle of 14 healthy individuals between blocks of movements that involved the implicit acquisition of a sequence versus other movements that did not require learning. Each participant performed the task in three conditions: "sequence" (externally triggered, repeating and sequential movements), "random" (similar movements, but performed in an arbitrary order), and "simple" (alternating movements in a left-right or up-down direction only). When controlling for background muscular activity, H-reflex amplitude was significantly more reduced in the sequence (43.8 ± 1.47%, mean ± SE) compared with the random (38.2 ± 1.60%) and simple (31.5 ± 1.82%) conditions, while the M-response was not different across conditions. Furthermore, H-reflex changes were observed from the beginning of the learning process up to when subjects reached asymptotic performance on the motor task. Changes also persisted for >60 s after motor activity ceased. Such findings suggest that the excitability in some spinal reflex circuits is altered during the implicit learning process of a new motor sequence.

  11. A Simple Sequence Repeat- and Single-Nucleotide Polymorphism-Based Genetic Linkage Map of the Brown Planthopper, Nilaparvata lugens

    PubMed Central

    Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi

    2013-01-01

    In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257
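
The summary statistics in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that the reported averages were taken over the 472 unique positions; the abstract does not state the denominators, so that is an inference that happens to reproduce the reported values.

```python
# Marker counts reported in the abstract.
ssr_markers, snp_markers, sts_markers = 474, 43, 1
total_markers = ssr_markers + snp_markers + sts_markers

unique_positions = 472
linkage_groups = 17
map_length_cm = 1093.9

# Assumption: both averages use the number of unique positions.
loci_per_group = unique_positions / linkage_groups
mean_spacing_cm = map_length_cm / unique_positions

print(total_markers)
print(round(loci_per_group, 1))
print(round(mean_spacing_cm, 1))
```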

  12. Pigeons Exhibit Contextual Cueing to Both Simple and Complex Backgrounds

    PubMed Central

    Wasserman, Edward A.; Teng, Yuejia; Castro, Leyre

    2014-01-01

    Repeated pairings of a particular visual context with a specific location of a target stimulus facilitate target search in humans. We explored an animal model of this contextual cueing effect using a novel Cueing-Miscueing design. Pigeons had to peck a target which could appear in one of four possible locations on four possible color backgrounds or four possible color photographs of real-world scenes. On 80% of the trials, each of the contexts was uniquely paired with one of the target locations; on the other 20% of the trials, each of the contexts was randomly paired with the remaining target locations. Pigeons came to exhibit robust contextual cueing when the context preceded the target by 2 s, with reaction times to the target being shorter on correctly-cued trials than on incorrectly-cued trials. Contextual cueing proved to be more robust with photographic backgrounds than with uniformly colored backgrounds. In addition, during the context-target delay, pigeons predominately pecked toward the location of the upcoming target, suggesting that attentional guidance contributes to contextual cueing. These findings confirm the effectiveness of animal models of contextual cueing and underscore the important part played by associative learning in producing the effect. PMID:24491468

  13. The First Molecular Identification of an Olive Collection Applying Standard Simple Sequence Repeats and Novel Expressed Sequence Tag Markers.

    PubMed

    Mousavi, Soraya; Mariotti, Roberto; Regni, Luca; Nasini, Luigi; Bufacchi, Marina; Pandolfi, Saverio; Baldoni, Luciana; Proietti, Primo

    2017-01-01

    Germplasm collections of tree crop species represent fundamental tools for the conservation of diversity and key steps for its characterization and evaluation. For the olive tree, several collections have been created all over the world, but only a few of them have been fully characterized and molecularly identified. The olive collection of Perugia University (UNIPG), established in the 1960s, represents one of the first attempts to gather and safeguard olive diversity, keeping together cultivars from different countries. In the present study, a set of 370 previously uncharacterized olive trees was screened with 10 standard simple sequence repeats (SSRs) and nine new EST-SSR markers, to correctly and thoroughly identify all genotypes, verify their representativeness of the entire cultivated olive variation, and validate the effectiveness of new markers in comparison to standard genotyping tools. The SSR analysis revealed the presence of 59 genotypes, corresponding to 72 well-known cultivars, 13 of which are present exclusively in this collection. The new EST-SSRs showed diversity parameter values quite similar to those of the best standard SSRs. When compared to hundreds of Mediterranean cultivars, the UNIPG olive accessions were split into the three main populations (East, Center and West Mediterranean), confirming that the collection is well representative of the entire olive variability. Furthermore, Bayesian analysis, performed on the 59 genotypes of the collection using both sets of markers, demonstrated their splitting into four clusters, with a well-balanced membership obtained by EST with respect to standard SSRs. The new OLEST (Olea expressed sequence tags) SSR markers proved as effective as the best standard markers. The information obtained from this study represents a highly valuable tool for ex situ conservation and management of olive genetic resources, useful for building a common database from worldwide olive cultivar collections.

  14. De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

    PubMed Central

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next-generation sequencing offers an advantage for the transformational development of species with limited available sequence data, as it helps to decode the genome and transcriptome. We carried out de novo sequencing using the Illumina HiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to the lack of sufficient sequence information and the cytological complexity of its genome. The 55 million raw reads obtained, when assembled using the Trinity program, generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of the transcriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of the reliable existence of SSRs in the miRNA precursors, opens future opportunities for understanding the genetic mechanism of black pepper and the likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176

  15. Real-time detection of small and dim moving objects in IR video sequences using a robust background estimator and a noise-adaptive double thresholding

    NASA Astrophysics Data System (ADS)

    Zingoni, Andrea; Diani, Marco; Corsini, Giovanni

    2016-10-01

    We developed an algorithm for automatically detecting small and poorly contrasted (dim) moving objects in real-time, within video sequences acquired through a steady infrared camera. The algorithm is suitable for different situations since it is independent of the background characteristics and of changes in illumination. Unlike other solutions, small objects of any size (up to single-pixel), either hotter or colder than the background, can be successfully detected. The algorithm is based on accurately estimating the background at the pixel level and then rejecting it. A novel approach permits background estimation to be robust to changes in the scene illumination and to noise, and not to be biased by the transit of moving objects. Care was taken in avoiding computationally costly procedures, in order to ensure the real-time performance even using low-cost hardware. The algorithm was tested on a dataset of 12 video sequences acquired in different conditions, providing promising results in terms of detection rate and false alarm rate, independently of background and objects characteristics. In addition, the detection map was produced frame by frame in real-time, using cheap commercial hardware. The algorithm is particularly suitable for applications in the fields of video-surveillance and computer vision. Its reliability and speed permit it to be used also in critical situations, like in search and rescue, defence and disaster monitoring.

  16. Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing.

    PubMed

    Pope, Bernard J; Hammet, Fleur; Nguyen-Dumont, Tu; Park, Daniel J

    2018-01-01

    Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.

  17. Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

    PubMed

    Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

    2013-07-01

    Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.

  18. THE USE OF INTER SIMPLE SEQUENCE REPEATS (ISSR) IN DISTINGUISHING NEIGHBORING DOUGLAS-FIR TREES AS A MEANS TO IDENTIFYING TREE ROOTS WITH ABOVE-GROUND BIOMASS

    EPA Science Inventory

    We are attempting to identify specific root fragments from soil cores with individual trees. We successfully used Inter Simple Sequence Repeats (ISSR) to distinguish neighboring old-growth Douglas-fir trees from one another, while maintaining identity among each tree's parts. W...

  19. Pigeons exhibit contextual cueing to both simple and complex backgrounds.

    PubMed

    Wasserman, Edward A; Teng, Yuejia; Castro, Leyre

    2014-05-01

    Repeated pairings of a particular visual context with a specific location of a target stimulus facilitate target search in humans. We explored an animal model of this contextual cueing effect using a novel Cueing-Miscueing design. Pigeons had to peck a target which could appear in one of four possible locations on four possible color backgrounds or four possible color photographs of real-world scenes. On 80% of the trials, each of the contexts was uniquely paired with one of the target locations; on the other 20% of the trials, each of the contexts was randomly paired with the remaining target locations. Pigeons came to exhibit robust contextual cueing when the context preceded the target by 2s, with reaction times to the target being shorter on correctly-cued trials than on incorrectly-cued trials. Contextual cueing proved to be more robust with photographic backgrounds than with uniformly colored backgrounds. In addition, during the context-target delay, pigeons predominately pecked toward the location of the upcoming target, suggesting that attentional guidance contributes to contextual cueing. These findings confirm the effectiveness of animal models of contextual cueing and underscore the important part played by associative learning in producing the effect. This article is part of a Special Issue entitled: SQAB 2013: Contextual Con. Copyright © 2014 Elsevier B.V. All rights reserved.

  20. Genetic variation of Sargassum horneri populations detected by inter-simple sequence repeats.

    PubMed

    Ren, J R; Yang, R; He, Y Y; Sun, Q H

    2015-01-30

    The seaweed Sargassum horneri is an important brown alga in the marine environment and an important raw material in the alginate industry. Unfortunately, the fixed resource that was originally reported has now declined or disappeared, and increased floating populations have been reported in recent years. We sampled a floating population and 4 fixed cultivated populations of S. horneri along the coast of Zhejiang, China. Inter-simple sequence repeat (ISSR) markers were applied in this research to analyze the genetic variation between floating populations and fixed cultivated populations of S. horneri. In total, 220 loci were amplified with 23 ISSR primers. The percentage of polymorphic loci within each population ranged from 53.64 to 95.45%. The highest diversity was observed in population 3, which was the local species that was suspension cultured in the lab and then fixed cultivated in the Nanji Islands before sampling. The lowest diversity was obtained in the floating population 4. The genetic distances among the 5 S. horneri populations ranged from 0.0819 to 0.2889, and the distance tendency confirmed the genetic diversity. The results suggest that the floating population had the lowest genetic diversity and could not be joined into the cluster branch of the fixed cultivated populations.

  1. Genetic diversity of Pinus nigra Arn. populations in Southern Spain and Northern Morocco revealed by inter-simple sequence repeat profiles.

    PubMed

    Rubio-Moraga, Angela; Candel-Perez, David; Lucas-Borja, Manuel E; Tiscar, Pedro A; Viñegla, Benjamin; Linares, Juan C; Gómez-Gómez, Lourdes; Ahrazem, Oussama

    2012-01-01

    Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei's genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei's genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups-Group 1 contained all populations from Cuenca while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco-while Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. Data indicates that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra.
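
The genetic differentiation coefficient quoted in the abstract above follows Nei's definition, Gst = (Ht - Hs) / Ht, where Hs is the mean within-population gene diversity and Ht is the diversity of the pooled populations. A minimal single-locus sketch is given below; the function name and the allele frequencies are hypothetical, and the study itself worked with dominant ISSR band data, which requires an additional frequency-estimation step not shown here.

```python
def nei_gst(pop_allele_freqs):
    """Nei's Gst for one locus.

    pop_allele_freqs: one list of allele frequencies per population,
    all lists in the same allele order and each summing to 1.
    Populations are weighted equally (an assumption of this sketch).
    """
    n_pops = len(pop_allele_freqs)
    n_alleles = len(pop_allele_freqs[0])

    # Hs: average expected heterozygosity within populations.
    hs = sum(1 - sum(p * p for p in freqs) for freqs in pop_allele_freqs) / n_pops

    # Ht: expected heterozygosity computed from the mean allele frequencies.
    mean_freqs = [
        sum(freqs[a] for freqs in pop_allele_freqs) / n_pops
        for a in range(n_alleles)
    ]
    ht = 1 - sum(p * p for p in mean_freqs)

    if ht == 0:
        return 0.0  # no diversity at all, so nothing to partition
    return (ht - hs) / ht


# Hypothetical frequencies for two populations at a biallelic locus.
print(nei_gst([[0.7, 0.3], [0.4, 0.6]]))
```

A value of 0.233, as reported, would mean that roughly 23% of the total diversity lies among populations and the remainder within them, which is consistent with the AMOVA result described in the abstract.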

  2. Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L.

    PubMed Central

    Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire

    2012-01-01

    Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr. PMID:22210604

  3. ViBe: a universal background subtraction algorithm for video sequences.

    PubMed

    Barnich, Olivier; Van Droogenbroeck, Marc

    2011-06-01

    This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and adapts the model by choosing randomly which values to substitute from the background model. This approach differs from those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method in full detail (including pseudo-code and the parameter values used) and compare it to other background subtraction techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. We also analyze the performance of a version of our algorithm downscaled to the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version of our algorithm performs better than mainstream techniques.
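
The three mechanisms described in the abstract above (a per-pixel sample set, random replacement, and propagation to a neighbor) can be sketched for small grayscale frames as follows. This is an illustrative reading of the abstract, not the authors' implementation: the class name, the parameter defaults (20 samples, matching radius 20, 2 required matches, update chance 1 in 16) and the plain-list image format are assumptions made for the sketch.

```python
import random


class ViBeSketch:
    """Sample-based background model for 2D grayscale frames (lists of lists)."""

    def __init__(self, first_frame, n_samples=20, radius=20,
                 min_matches=2, subsample=16, seed=0):
        self.rng = random.Random(seed)
        self.h, self.w = len(first_frame), len(first_frame[0])
        self.n_samples = n_samples
        self.radius = radius
        self.min_matches = min_matches
        self.subsample = subsample
        # Fill each pixel's sample set with values drawn from the pixel
        # itself or its neighbors in the first frame.
        self.samples = []
        for y in range(self.h):
            row = []
            for x in range(self.w):
                values = []
                for _ in range(n_samples):
                    ny, nx = self._random_neighbor(y, x)
                    values.append(first_frame[ny][nx])
                row.append(values)
            self.samples.append(row)

    def _random_neighbor(self, y, x):
        # Pick a position in the 3x3 neighborhood, clamped to the image.
        ny = min(max(y + self.rng.choice((-1, 0, 1)), 0), self.h - 1)
        nx = min(max(x + self.rng.choice((-1, 0, 1)), 0), self.w - 1)
        return ny, nx

    def apply(self, frame):
        """Return a foreground mask and update the model in place."""
        mask = [[False] * self.w for _ in range(self.h)]
        for y in range(self.h):
            for x in range(self.w):
                value = frame[y][x]
                matches = sum(
                    1 for s in self.samples[y][x] if abs(s - value) < self.radius
                )
                if matches < self.min_matches:
                    mask[y][x] = True  # too few close samples: foreground
                    continue
                # Background: occasionally overwrite a randomly chosen
                # sample (not the oldest one) with the current value.
                if self.rng.randrange(self.subsample) == 0:
                    slot = self.rng.randrange(self.n_samples)
                    self.samples[y][x][slot] = value
                # Occasionally propagate the value into a neighbor's model.
                if self.rng.randrange(self.subsample) == 0:
                    ny, nx = self._random_neighbor(y, x)
                    slot = self.rng.randrange(self.n_samples)
                    self.samples[ny][nx][slot] = value
        return mask


# Usage on a hypothetical 4x4 scene with one bright intruding pixel.
background = [[100] * 4 for _ in range(4)]
model = ViBeSketch(background)
frame = [row[:] for row in background]
frame[1][2] = 200
print(model.apply(frame)[1][2])
```

Replacing a random sample rather than the oldest one gives every stored value the same chance of surviving an update, which is the departure from first-in-first-out updating that the abstract highlights.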

  4. Simultaneous phylogeny reconstruction and multiple sequence alignment

    PubMed Central

    Yue, Feng; Shi, Jian; Tang, Jijun

    2009-01-01

    Background: A phylogeny is the evolutionary history of a group of organisms. To date, sequence data is still the most used data type for phylogenetic reconstruction. Before any sequences can be used for phylogeny reconstruction, they must be aligned, and the quality of the multiple sequence alignment has been shown to affect the quality of the inferred phylogeny. At the same time, all the current multiple sequence alignment programs use a guide tree to produce the alignment, and experiments have shown that good guide trees can significantly improve the multiple alignment quality. Results: We devise a new algorithm to simultaneously align multiple sequences and search for the phylogenetic tree that leads to the best alignment. We also implemented the algorithm as a C program package, which can handle both DNA and protein data and can take a simple cost model as well as complex substitution matrices, such as PAM250 or BLOSUM62. The performance of the new method is compared with that of other popular multiple sequence alignment tools, including widely used programs such as ClustalW and T-Coffee. Experimental results suggest that this method has good performance in terms of both phylogeny accuracy and alignment quality. Conclusion: We present an algorithm to align multiple sequences and reconstruct the phylogenies that minimize the alignment score, which is based on an efficient algorithm to solve the median problems for three sequences. Our extensive experiments suggest that this method is very promising and can produce high quality phylogenies and alignments. PMID:19208110

  5. Development of chloroplast simple sequence repeats (cpSSRs) for the intraspecific study of Gracilaria tenuistipitata (Gracilariales, Rhodophyta) from different populations

    PubMed Central

    2014-01-01

    Background: Gracilaria tenuistipitata is an agarophyte with substantial economic potential because of its high growth rate and tolerance to a wide range of environmental factors. This red seaweed is intensively cultured in China for the production of agar and fodder for abalone. Microsatellite markers were developed from the chloroplast genome of G. tenuistipitata var. liui to differentiate G. tenuistipitata obtained from six different localities: four from Peninsular Malaysia, one from Thailand and one from Vietnam. Eighty G. tenuistipitata specimens were analyzed using eight simple sequence repeat (SSR) primer-pairs that we developed for polymerase chain reaction (PCR) amplification. Findings: Five mononucleotide primer-pairs and one trinucleotide primer-pair exhibited monomorphic alleles, whereas the other two primer-pairs separated the G. tenuistipitata specimens into two main clades. G. tenuistipitata from Thailand and Vietnam were grouped into one clade, and the populations from Batu Laut, Middle Banks and Kuah (Malaysia) were grouped into another clade. The combined dataset of these two primer-pairs separated G. tenuistipitata obtained from Kelantan, Malaysia from that obtained from other localities. Conclusions: Based on the variations in repeated nucleotides of microsatellite markers, our results suggested that the populations of G. tenuistipitata were distributed into two main geographical regions: (i) populations in the west coast of Peninsular Malaysia and (ii) populations facing the South China Sea. The correct identification of G. tenuistipitata strains with traits of high economic potential will be advantageous for the mass cultivation of seaweeds. PMID:24490797

  6. Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains.

    PubMed

    Zhao, Xiaoyan; Sze, Sing-Hoi

    2011-05-01

    One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms perform very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.

  7. Research on the algorithm of infrared target detection based on the frame difference and background subtraction method

    NASA Astrophysics Data System (ADS)

    Liu, Yun; Zhao, Yuejin; Liu, Ming; Dong, Liquan; Hui, Mei; Liu, Xiaohua; Wu, Yijian

    2015-09-01

    As an important branch of infrared imaging technology, infrared target tracking and detection has significant scientific value and a wide range of applications in both military and civilian areas. For infrared images, which are characterized by low SNR and serious disturbance from background noise, an innovative and effective target detection algorithm based on OpenCV is proposed in this paper, exploiting the frame-to-frame correlation of moving targets and the lack of correlation of noise across sequential images. Firstly, since temporal differencing and background subtraction are highly complementary, we use a combined detection method of frame difference and background subtraction which is based on adaptive background updating. Results indicate that it is simple and can stably extract the foreground moving target from the video sequence. Because the background updating mechanism continuously updates each pixel, we can detect the infrared moving target more accurately. This paves the way for eventually realizing real-time infrared target detection and tracking when porting the algorithms from OpenCV to the DSP platform. Afterwards, we use an optimal thresholding algorithm to segment the image. It transforms the gray images into black-and-white images in order to provide better conditions for detection in the image sequences. Finally, using the correlation of moving objects between different frames together with mathematical morphology processing, we can eliminate noise, decrease the area, and smooth region boundaries. Experimental results show that our algorithm achieves rapid and precise detection of small infrared targets.
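
The combination of frame differencing with adaptively updated background subtraction described in the abstract above might be sketched as follows. The abstract does not say how the two cues are combined or how the background is updated, so the logical AND, the running-average update, the thresholds and the function name are all assumptions; the thresholding and morphology steps of the paper are omitted.

```python
def detect_moving(prev_frame, frame, background,
                  diff_thresh=15, bg_thresh=15, alpha=0.05):
    """Return (mask, updated_background) for 2D grayscale frames.

    A pixel is flagged only if it changed since the previous frame AND
    differs from the background model (assumed combination rule).
    """
    h, w = len(frame), len(frame[0])
    mask = [[False] * w for _ in range(h)]
    new_bg = [row[:] for row in background]
    for y in range(h):
        for x in range(w):
            moved = abs(frame[y][x] - prev_frame[y][x]) > diff_thresh
            differs = abs(frame[y][x] - background[y][x]) > bg_thresh
            mask[y][x] = moved and differs
            if not mask[y][x]:
                # Blend the current value into the background only where
                # no target was detected (assumed running-average update).
                new_bg[y][x] = (1 - alpha) * background[y][x] + alpha * frame[y][x]
    return mask, new_bg


# Hypothetical 2x2 frames: one pixel brightens between frames.
prev = [[10, 10], [10, 10]]
curr = [[10, 90], [10, 10]]
bg = [[10.0, 10.0], [10.0, 10.0]]
mask, bg = detect_moving(prev, curr, bg)
print(mask)
```

Requiring both cues suppresses single-frame noise (which fails the background test over time) and stale background errors (which fail the frame-difference test), at the cost of missing targets that stop moving.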

  8. Genetic Diversity of Ascaris in China Assessed Using Simple Sequence Repeat Markers.

    PubMed

    Zhou, Chunhua; Jian, Shaoqing; Peng, Weidong; Li, Min

    2018-04-01

    The giant roundworm Ascaris infects pigs and people worldwide and causes serious diseases. The taxonomic relationship between Ascaris suum and Ascaris lumbricoides is still unclear. The purpose of the present study was to investigate the genetic diversity and population genetic structure of 258 Ascaris specimens from humans and pigs from 6 sympatric regions in Ascaris-endemic regions of China using existing simple sequence repeat data. The microsatellite markers showed a high level of allelic richness and genetic diversity in the samples. Each of the populations demonstrated excess homozygosity (Ho0). According to a genetic differentiation index (Fst=0.0593), there was a high level of gene flow in the Ascaris populations. A hierarchical analysis of molecular variance revealed remarkably high levels of variation within the populations. Moreover, a population structure analysis indicated that Ascaris populations fell into 3 main genetic clusters, interpreted as A. suum, A. lumbricoides, and a hybrid of the species. We speculated that humans can be infected with A. lumbricoides, A. suum, and the hybrid, but pigs were mainly infected with A. suum. This study provided new information on the genetic diversity and population structure of Ascaris from humans and pigs in China, which can be used for designing Ascaris control strategies. It can also be beneficial for understanding the introgression of host affiliation.

  9. Cytogenetic and molecular markers for detecting Aegilops uniaristata chromosomes in a wheat background.

    PubMed

    Gong, Wenping; Li, Guangrong; Zhou, Jianping; Li, Genying; Liu, Cheng; Huang, Chengyan; Zhao, Zhendong; Yang, Zujun

    2014-09-01

    Aegilops uniaristata has many agronomically useful traits that can be used for wheat breeding. So far, a Triticum turgidum - Ae. uniaristata amphiploid and one set of Chinese Spring (CS) - Ae. uniaristata addition lines have been produced. To guide Ae. uniaristata chromatin transformation from these lines into cultivated wheat through chromosome engineering, reliable cytogenetic and molecular markers specific for Ae. uniaristata chromosomes need to be developed. Standard C-banding shows that C-bands mainly exist in the centromeric regions of Ae. uniaristata but rarely at the distal ends. Fluorescence in situ hybridization (FISH) using (GAA)8 as a probe showed that the hybridization signals of chromosomes 1N-7N are different; thus (GAA)8 can be used to identify all Ae. uniaristata chromosomes in a wheat background simultaneously. Moreover, a total of 42 molecular markers specific for Ae. uniaristata chromosomes were developed by screening expressed sequence tag - sequence tagged site (EST-STS), expressed sequence tag - simple sequence repeat (EST-SSR), and PCR-based landmark unique gene (PLUG) primers. The markers were subsequently localized using the CS - Ae. uniaristata addition lines and different wheat cultivars as controls. The cytogenetic and molecular markers developed herein will be helpful for screening and identifying wheat - Ae. uniaristata progeny.

  10. ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants

    PubMed Central

    Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

    2014-01-01

    Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1–6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ PMID:25380781
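
    The definition of a perfect SSR given in this record (a motif of 1 to 6 nucleotides repeated in tandem) can be illustrated with a short regular-expression scan. This is a sketch only, not the mining pipeline behind ChloroSSRdb: the function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices, and imperfect and compound SSRs are not handled.

```python
import re

# Assumed minimum number of tandem copies per motif length (hypothetical thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        n = min_repeats[k]
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", so each tract is reported once.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)
```

    For example, a sequence containing seven copies of AT and a run of twelve A would yield one dinucleotide hit and one mononucleotide hit under these thresholds.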

  11. ChloroSSRdb: a repository of perfect and imperfect chloroplastic simple sequence repeats (cpSSRs) of green plants.

    PubMed

    Kapil, Aditi; Rai, Piyush Kant; Shanker, Asheesh

    2014-01-01

    Simple sequence repeats (SSRs) are regions in DNA sequence that contain repeating motifs of length 1-6 nucleotides. These repeats are ubiquitously present and are found in both coding and non-coding regions of the genome. A total of 534 complete chloroplast genome sequences (as of 18 September 2014) of Viridiplantae are available at the NCBI organelle genome resource. This provides an opportunity to mine these genomes for SSRs and store them in the form of a database. In an attempt to properly manage and retrieve chloroplastic SSRs, we designed ChloroSSRdb, which is a relational database developed using SQL Server 2008 and accessed through ASP.NET. It provides information on all three types (perfect, imperfect and compound) of SSRs. At present, ChloroSSRdb contains 124 430 mined SSRs, with the majority lying in non-coding regions. Out of these, PCR primers were designed for 118 249 SSRs. Tetranucleotide repeats (47 079) were found to be the most frequent repeat type, whereas hexanucleotide repeats (6414) were the least abundant. Additionally, in each species statistical analyses were performed to calculate relative frequency, correlation coefficient and chi-square statistics of perfect and imperfect SSRs. In accordance with the growing interest in SSR studies, ChloroSSRdb will prove to be a useful resource in developing genetic markers, phylogenetic analysis, genetic mapping, etc. Moreover, it will serve as a ready reference for mined SSRs in available chloroplast genomes of green plants. Database URL: www.compubio.in/chlorossrdb/ © The Author(s) 2014. Published by Oxford University Press.

  12. Genome-Wide Characterization and Linkage Mapping of Simple Sequence Repeats in Mei (Prunus mume Sieb. et Zucc.)

    PubMed Central

    Sun, Lidan; Yang, Weiru; Zhang, Qixiang; Cheng, Tangren; Pan, Huitang; Xu, Zongda; Zhang, Jie; Chen, Chuguang

    2013-01-01

    Because of its popularity as an ornamental plant in East Asia, mei (Prunus mume Sieb. et Zucc.) has received increasing attention in genetic and genomic research with the recent shotgun sequencing of its genome. Here, we performed the genome-wide characterization of simple sequence repeats (SSRs) in the mei genome and detected a total of 188,149 SSRs occurring at a frequency of 794 SSR/Mb. Mononucleotide repeats were the most common type of SSR in genomic regions, followed by di- and tetranucleotide repeats. Most of the SSRs in coding sequences (CDS) were composed of tri- or hexanucleotide repeat motifs, but mononucleotide repeats were always the most common in intergenic regions. Genome-wide comparison of SSR patterns among the mei, strawberry (Fragaria vesca), and apple (Malus×domestica) genomes showed mei to have the highest density of SSRs, slightly higher than that of strawberry (608 SSR/Mb) and almost twice as high as that of apple (398 SSR/Mb). Mononucleotide repeats were the dominant SSR motifs in the three Rosaceae species. Using 144 SSR markers, we constructed a 670 cM-long linkage map of mei delimited into eight linkage groups (LGs), with an average marker distance of 5 cM. Seventy one scaffolds covering about 27.9% of the assembled mei genome were anchored to the genetic map, depending on which the macro-colinearity between the mei genome and Prunus T×E reference map was identified. The framework map of mei constructed provides a first step into subsequent high-resolution genetic mapping and marker-assisted selection for this ornamental species. PMID:23555708
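
    The density and map figures in this record can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract; the helper names are hypothetical, and the "implied" sequence length is a back-calculation made here, not a value the abstract reports.

```python
def implied_length_mb(ssr_count, ssr_per_mb):
    """Sequence length (Mb) implied by a total SSR count and a density."""
    return ssr_count / ssr_per_mb


def mean_marker_spacing(map_length_cm, n_markers, n_groups=0):
    """Average spacing in cM.

    With n_groups=0 this is map length per marker; with n_groups set,
    it is map length per interval (markers minus linkage groups).
    """
    return map_length_cm / (n_markers - n_groups)


# 188,149 SSRs at 794 SSR/Mb implies roughly 237 Mb of analysed sequence.
length_mb = implied_length_mb(188_149, 794)

# 670 cM over 144 markers: about 4.65 cM per marker, or about 4.93 cM per
# interval across 8 linkage groups; both round to the reported 5 cM.
per_marker = mean_marker_spacing(670, 144)
per_interval = mean_marker_spacing(670, 144, n_groups=8)

# Density ratios behind "slightly higher" and "almost twice as high".
mei_vs_strawberry = 794 / 608
mei_vs_apple = 794 / 398
```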

  13. Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns

    PubMed Central

    2011-01-01

    Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
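
    The idea of counting motifs "defined by their length and similarity" that are shared by a pair of promoters can be sketched as follows. This is not the authors' SSM algorithm, whose details are not given in the abstract. It is a simple stand-in that counts the distinct k-mers of one sequence having a near match (within a Hamming distance) in the other; the function names and the defaults `k=6` and `max_mismatch=1` are assumptions.

```python
def kmers(seq, k):
    """Set of all distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def shared_motif_count(promoter_a, promoter_b, k=6, max_mismatch=1):
    """Count distinct k-mers of promoter_a with a near match in promoter_b."""
    candidates_b = kmers(promoter_b, k)
    return sum(
        1
        for motif in kmers(promoter_a, k)
        if any(hamming(motif, other) <= max_mismatch for other in candidates_b)
    )
```

    A pair of genes would then be flagged when its count is unusually high relative to a background distribution, which corresponds to the "exceptional number of SSM" criterion mentioned above; how that background is built is left open here.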

  14. A simple, rapid, high-fidelity and cost-effective PCR-based two-step DNA synthesis method for long gene sequences.

    PubMed

    Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi

    2004-07-07

    Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
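
    The first PTDS step relies on 60mer oligonucleotides that overlap by 20 bp, so each additional oligo extends the fragment by 40 bp. A minimal sketch of that tiling arithmetic is given below. The function names are hypothetical, and the sketch ignores strand: a real design would alternate sense and antisense oligos, which is an assumption not spelled out in the abstract.

```python
def fragment_length(n_oligos, oligo_len=60, overlap=20):
    """Length of the fragment assembled from n overlapping oligos."""
    return oligo_len + (n_oligos - 1) * (oligo_len - overlap)


def tile_oligos(seq, oligo_len=60, overlap=20):
    """Split seq into oligos of oligo_len that overlap by `overlap` bases."""
    step = oligo_len - overlap
    return [
        seq[start:start + oligo_len]
        for start in range(0, max(len(seq) - overlap, 1), step)
    ]


# Twelve 60mers with 20 bp overlaps span 60 + 11 * 40 = 500 bp,
# matching the "approximately 500 bp" fragments described above.
target = "ACGT" * 125                      # placeholder 500 bp sequence
oligos = tile_oligos(target)
```

    Ten oligos give 420 bp by the same formula, so the "ten to twelve" range in the abstract corresponds to fragments of roughly 420 to 500 bp.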

  15. Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labbe, Jessy L; Murat, Claude; Morin, Emmanuelle

    It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.

  16. Genetic Diversity of Pinus nigra Arn. Populations in Southern Spain and Northern Morocco Revealed By Inter-Simple Sequence Repeat Profiles †

    PubMed Central

    Rubio-Moraga, Angela; Candel-Perez, David; Lucas-Borja, Manuel E.; Tiscar, Pedro A.; Viñegla, Benjamin; Linares, Juan C.; Gómez-Gómez, Lourdes; Ahrazem, Oussama

    2012-01-01

    Eight Pinus nigra Arn. populations from Southern Spain and Northern Morocco were examined using inter-simple sequence repeat markers to characterize the genetic variability amongst populations. Pair-wise population genetic distance ranged from 0.031 to 0.283, with a mean of 0.150 between populations. The highest inter-population average distance was between PaCU from Cuenca and YeCA from Cazorla, while the lowest distance was between TaMO from Morocco and MA Sierra Mágina populations. Analysis of molecular variance (AMOVA) and Nei’s genetic diversity analyses revealed higher genetic variation within the same population than among different populations. Genetic differentiation (Gst) was 0.233. Cuenca showed the highest Nei’s genetic diversity followed by the Moroccan region, Sierra Mágina, and Cazorla region. However, clustering of populations was not in accordance with their geographical locations. Principal component analysis showed the presence of two major groups—Group 1 contained all populations from Cuenca while Group 2 contained populations from Cazorla, Sierra Mágina and Morocco—while Bayesian analysis revealed the presence of three clusters. The low genetic diversity observed in PaCU and YeCA is probably a consequence of inappropriate management since no estimation of genetic variability was performed before the silvicultural treatments. Data indicates that the inter-simple sequence repeat (ISSR) method is sufficiently informative and powerful to assess genetic variability among populations of P. nigra. PMID:22754321

  17. Development of Simple Sequence Repeats (SSR) markers in Setaria italica (Poaceae) and cross-amplification in related species.

    PubMed

    Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

    2011-01-01

    Foxtail millet is one of the world's oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeat (SSR) markers of Setaria italica were developed. These polymorphic markers were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (N(a)) and the average observed (H(o)) and expected (H(e)) heterozygosities are 3.73, 0.714 and 0.587, respectively. In addition, 24 SSR markers showed transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae.
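
    Observed and expected heterozygosity, as reported in this record, are standard per-locus summaries. The sketch below shows how they are commonly computed for one diploid locus: Ho is the fraction of heterozygous individuals and He is 1 minus the sum of squared allele frequencies. The function name and the toy genotypes are hypothetical, and the small-sample correction some programs apply to He is omitted.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He) for one locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared frequencies.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    he = 1 - sum((c / total) ** 2 for c in counts.values())
    return ho, he


# Toy locus with two alleles at equal frequency (made-up data).
ho, he = heterozygosity([(1, 1), (1, 2), (2, 2), (1, 2)])
```

    A locus with Ho above He, as in the averages quoted above (0.714 versus 0.587), has more heterozygotes than Hardy-Weinberg proportions would predict.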

  18. Development of Simple Sequence Repeats (SSR) Markers in Setaria italica (Poaceae) and Cross-Amplification in Related Species

    PubMed Central

    Lin, Heng-Sheng; Chiang, Chih-Yun; Chang, Song-Bin; Kuoh, Chang-Sheng

    2011-01-01

    Foxtail millet is one of the world’s oldest cultivated crops. It has been adopted as a model organism for providing a deeper understanding of plant biology. In this study, 45 simple sequence repeat (SSR) markers of Setaria italica were developed. These polymorphic markers were screened in 223 samples from 12 foxtail millet populations around Taiwan. The most common dinucleotide and trinucleotide repeat motifs are AC/TG (84.21%) and CAT (46.15%). The average number of alleles (Na) and the average observed (Ho) and expected (He) heterozygosities are 3.73, 0.714 and 0.587, respectively. In addition, 24 SSR markers showed transferability to six related Poaceae species. These new markers provide tools for examining genetic relatedness among foxtail millet populations and other related species. They are suitable for germplasm management and protection in Poaceae. PMID:22174636

  19. Touch imprint cytology with massively parallel sequencing (TIC-seq): a simple and rapid method to snapshot genetic alterations in tumors.

    PubMed

    Amemiya, Kenji; Hirotsu, Yosuke; Goto, Taichiro; Nakagomi, Hiroshi; Mochizuki, Hitoshi; Oyama, Toshio; Omata, Masao

    2016-12-01

    Identifying genetic alterations in tumors is critical for molecular targeting of therapy. In the clinical setting, formalin-fixed paraffin-embedded (FFPE) tissue is usually employed for genetic analysis. However, DNA extracted from FFPE tissue is often not suitable for analysis because of its low levels and poor quality. Additionally, FFPE sample preparation is time-consuming. To provide early treatment for cancer patients, a more rapid and robust method is required for precision medicine. We present a simple method for genetic analysis, called touch imprint cytology combined with massively paralleled sequencing (touch imprint cytology [TIC]-seq), to detect somatic mutations in tumors. We prepared FFPE tissues and TIC specimens from tumors in nine lung cancer patients and one patient with breast cancer. We found that the quality and quantity of TIC DNA was higher than that of FFPE DNA, which requires microdissection to enrich DNA from target tissues. Targeted sequencing using a next-generation sequencer obtained sufficient sequence data using TIC DNA. Most (92%) somatic mutations in lung primary tumors were found to be consistent between TIC and FFPE DNA. We also applied TIC DNA to primary and metastatic tumor tissues to analyze tumor heterogeneity in a breast cancer patient, and showed that common and distinct mutations among primary and metastatic sites could be classified into two distinct histological subtypes. TIC-seq is an alternative and feasible method to analyze genomic alterations in tumors by simply touching the cut surface of specimens to slides. © 2016 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

  20. Molecular Analysis of Date Palm Genetic Diversity Using Random Amplified Polymorphic DNA (RAPD) and Inter-Simple Sequence Repeats (ISSRs).

    PubMed

    El Sharabasy, Sherif F; Soliman, Khaled A

    2017-01-01

    The date palm is an ancient domesticated plant with great diversity and has been cultivated in the Middle East and North Africa for at least 5000 years. Date palm cultivars are classified based on the fruit moisture content as dry, semidry, and soft dates. There are a number of biochemical and molecular techniques available for characterization of date palm variation. This chapter focuses on the DNA-based marker techniques random amplified polymorphic DNA (RAPD) and inter-simple sequence repeats (ISSR), in addition to biochemical markers based on isozyme analysis. These techniques, coupled with appropriate statistical tools, proved useful for determining phylogenetic relationships among date palm cultivars and provide information resources for date palm gene banks.

  1. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum.

    PubMed

    Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F; Li, Shuaicheng; Hu, Kailin

    2016-01-07

    The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
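
    The polymorphic information content (PIC) quoted per marker in this record is usually computed from allele frequencies with the formula of Botstein et al. (1980). The sketch below assumes that formula applies here, since the abstract does not state which definition was used; the function name `pic` is hypothetical.

```python
def pic(freqs):
    """Polymorphic information content for one marker.

    freqs: allele frequencies that sum to 1.
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross_term
```

    For a marker with two equally frequent alleles this gives 0.375, which is the maximum for two alleles; values above that, such as the upper end of the range quoted above, therefore require three or more alleles.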

  2. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum

    PubMed Central

    Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L.; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F.; Li, Shuaicheng; Hu, Kailin

    2016-01-01

    The sequences of the full set of pepper genomes including nuclear, mitochondrial and chloroplast are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum. PMID:26739748

  3. Winnowing DNA for Rare Sequences: Highly Specific Sequence and Methylation Based Enrichment

    PubMed Central

    Thompson, Jason D.; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue. PMID:22355378

  4. Winnowing DNA for rare sequences: highly specific sequence and methylation based enrichment.

    PubMed

    Thompson, Jason D; Shibahara, Gosuke; Rajan, Sweta; Pel, Joel; Marziali, Andre

    2012-01-01

    Rare mutations in cell populations are known to be hallmarks of many diseases and cancers. Similarly, differential DNA methylation patterns arise in rare cell populations with diagnostic potential such as fetal cells circulating in maternal blood. Unfortunately, the frequency of alleles with diagnostic potential, relative to wild-type background sequence, is often well below the frequency of errors in currently available methods for sequence analysis, including very high throughput DNA sequencing. We demonstrate a DNA preparation and purification method that through non-linear electrophoretic separation in media containing oligonucleotide probes, achieves 10,000 fold enrichment of target DNA with single nucleotide specificity, and 100 fold enrichment of unmodified methylated DNA differing from the background by the methylation of a single cytosine residue.

  5. GIGA: a simple, efficient algorithm for gene tree inference in the genomic age

    PubMed Central

    2010-01-01

    Background Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. Results We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. Conclusions GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. 
We compared trees produced by GIGA to those in
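
    The "very simple distance metric (pairwise sequence differences)" mentioned in this record can be sketched as a p-distance matrix over aligned sequences, from which an agglomerative method picks the closest pair to join first. The code below covers only that distance step. It does not implement GIGA's species-tree rules or its event interpretation, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(aligned):
    """Map each (name_i, name_j) pair, names sorted, to its p-distance."""
    return {
        (i, j): p_distance(aligned[i], aligned[j])
        for i, j in combinations(sorted(aligned), 2)
    }


def closest_pair(aligned):
    """Pair of sequence names with the smallest p-distance."""
    distances = distance_matrix(aligned)
    return min(distances, key=distances.get)


# Toy alignment (made-up sequences, for illustration only).
toy = {"human": "ACGTACGT", "mouse": "ACGTACGA", "fly": "TCGAACGA"}
first_join = closest_pair(toy)
```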

  6. Construction of an Integrated High Density Simple Sequence Repeat Linkage Map in Cultivated Strawberry (Fragaria × ananassa) and its Applicability

    PubMed Central

    Isobe, Sachiko N.; Hirakawa, Hideki; Sato, Shusei; Maeda, Fumi; Ishikawa, Masami; Mori, Toshiki; Yamamoto, Yuko; Shirasawa, Kenta; Kimura, Mitsuhiro; Fukami, Masanobu; Hashizume, Fujio; Tsuji, Tomoko; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Tsuruoka, Hisano; Minami, Chiharu; Takahashi, Chika; Wada, Tsuyuko; Ono, Akiko; Kawashima, Kumiko; Nakazaki, Naomi; Kishida, Yoshie; Kohara, Mitsuyo; Nakayama, Shinobu; Yamada, Manabu; Fujishiro, Tsunakazu; Watanabe, Akiko; Tabata, Satoshi

    2013-01-01

    The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers. PMID:23248204

  7. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV)

    PubMed Central

    Martin, Andrew C. R.

    2014-01-01

    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and ’dotifying’ repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/. PMID:25653836

  8. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

    PubMed

    Martin, Andrew C R

    2014-01-01

    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.

  9. Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)

    PubMed Central

    Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping

    2014-01-01

    Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition data and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid was increasing continuously with a parallel trend to the amount of tung oil accumulation while the ratios of other fatty acids were decreasing in different stages of the seeds and that α-eleostearic acid (18∶3) constituted 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from a tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. A GenBank database search identified 10 of them putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. The dendrogram revealed that most of the 41 accessions were clustered according to the geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054

  10. Discrete sequence prediction and its applications

    NASA Technical Reports Server (NTRS)

    Laird, Philip

    1992-01-01

    Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We apply sequence prediction using a simple and practical sequence-prediction algorithm, called TDAG. The TDAG algorithm is first tested by comparing its performance with some common data compression algorithms. Then it is adapted to the detailed requirements of dynamic program optimization, with excellent results.

  11. Biophysical and structural considerations for protein sequence evolution

    PubMed Central

    2011-01-01

    Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolutions do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS < 1 and gamma-distributed rates across sites. Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model. PMID:22171550

  12. A Simple Method for the Extraction, PCR-amplification, Cloning, and Sequencing of Pasteuria 16S rDNA from Small Numbers of Endospores.

    PubMed

    Atibalentja, N; Noel, G R; Ciancio, A

    2004-03-01

    For many years the taxonomy of the genus Pasteuria has been marred with confusion because the bacterium could not be cultured in vitro and, therefore, descriptions were based solely on morphological, developmental, and pathological characteristics. The current study sought to devise a simple method for PCR-amplification, cloning, and sequencing of Pasteuria 16S rDNA from small numbers of endospores, with no need for prior DNA purification. Results show that DNA extracts from plain glass bead-beating of crude suspensions containing 10,000 endospores at 0.2 x 10 endospores ml(-1) were sufficient for PCR-amplification of Pasteuria 16S rDNA, when used in conjunction with specific primers. These results imply that for P. penetrans and P. nishizawae only one parasitized female of Meloidogyne spp. and Heterodera glycines, respectively, should be sufficient, and as few as eight cadavers of Belonolaimus longicaudatus with an average number of 1,250 endospores of "Candidatus Pasteuria usgae" are needed for PCR-amplification of Pasteuria 16S rDNA. The method described in this paper should facilitate the sequencing of the 16S rDNA of the many Pasteuria isolates that have been reported on nematodes and, consequently, expedite the classification of those isolates through comparative sequence analysis.

  13. Simulations Using Random-Generated DNA and RNA Sequences

    ERIC Educational Resources Information Center

    Bryce, C. F. A.

    1977-01-01

    Using a very simple computer program written in BASIC, a very large number of random-generated DNA or RNA sequences are obtained. Students use these sequences to predict complementary sequences and translational products, evaluate base compositions, determine frequencies of particular triplet codons, and suggest possible secondary structures.…
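
    The exercises listed in this record (complementary sequences, base composition, triplet codon frequencies) can be sketched in a few lines. The original program was written in BASIC; the Python below is only an illustrative stand-in, and all function names are hypothetical.

```python
import random
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def random_dna(length, seed=None):
    """Generate a random DNA string; the seed only makes runs repeatable."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))

def reverse_complement(dna):
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def transcribe(dna):
    """RNA with the same sequence as the given DNA strand (T replaced by U)."""
    return dna.replace("T", "U")

def base_composition(seq):
    """Fraction of each base present in the sequence."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in sorted(counts)}

def codon_counts(seq):
    """Count non-overlapping triplets in reading frame 0; a trailing partial codon is dropped."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))

dna = random_dna(30, seed=1)
print(dna, reverse_complement(dna), base_composition(dna), codon_counts(transcribe(dna)))
```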

  14. Genetic diversity among Puccinia melanocephala isolates from Brazil assessed using simple sequence repeat markers.

    PubMed

    Peixoto-Junior, R F; Creste, S; Landell, M G A; Nunes, D S; Sanguino, A; Campos, M F; Vencovsky, R; Tambarussi, E V; Figueira, A

    2014-09-26

    Brown rust (causal agent Puccinia melanocephala) is an important sugarcane disease that is responsible for large losses in yield worldwide. Despite its importance, little is known regarding the genetic diversity of this pathogen in the main Brazilian sugarcane cultivation areas. In this study, we characterized the genetic diversity of 34 P. melanocephala isolates from 4 Brazilian states using loci identified from an enriched simple sequence repeat (SSR) library. The aggressiveness of 3 isolates from major sugarcane cultivation areas was evaluated by inoculating an intermediately resistant and a susceptible cultivar. From the enriched library, 16 SSR-specific primers were developed, which produced scorable alleles. Of these, 4 loci were polymorphic and 12 were monomorphic for all isolates evaluated. The molecular characterization of the 34 isolates of P. melanocephala conducted using 16 SSR loci revealed the existence of low genetic variability among the isolates. The average estimated genetic distance was 0.12. Phenetic analysis based on Nei's genetic distance clustered the isolates into 2 major groups. Groups I and II included 18 and 14 isolates, respectively, and both groups contained isolates from all 4 geographic regions studied. Two isolates did not cluster with these groups. It was not possible to obtain clusters according to location or state of origin. Analysis of disease severity data revealed that the isolates did not show significant differences in aggressiveness between regions.

  15. Probabilistic simple sticker systems

    NASA Astrophysics Data System (ADS)

    Selvarajoo, Mathuri; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod

    2017-04-01

    A model for DNA computing using the recombination behavior of DNA molecules, known as a sticker system, was introduced by L. Kari, G. Paun, G. Rozenberg, A. Salomaa, and S. Yu in the paper "DNA computing, sticker systems and universality" (Acta Informatica, vol. 35, pp. 401-420, 1998). A sticker system uses the Watson-Crick complementarity of DNA molecules: it starts from incomplete double-stranded sequences and iteratively applies sticking operations until a complete double-stranded sequence is obtained. It is known that sticker systems with finite sets of axioms and sticker rules generate only regular languages. Hence, different types of restrictions have been considered to increase the computational power of sticker systems. Recently, a variant of restricted sticker systems, called probabilistic sticker systems, has been introduced [4]. In this variant, probabilities are initially associated with the axioms, and the probability of a generated string is computed by multiplying the probabilities of all occurrences of the initial strings in the computation of the string. Strings for the language are selected according to some probabilistic requirements. In this paper, we study fundamental properties of probabilistic simple sticker systems. We prove that the probabilistic enhancement increases the computational power of simple sticker systems.

  16. Characterization of EST-derived and non-EST simple sequence repeats in an F₁ hybrid population of Vitis vinifera L.

    PubMed

    Kayesh, E; Bilkish, N; Liu, G S; Chen, W; Leng, X P; Fang, J G

    2014-03-31

    Among different classes of molecular markers, expressed sequence tags (ESTs) are a new resource for developing simple sequence repeat (SSR) functional markers for genotyping and genetic mapping in F1 hybrid populations of Vitis vinifera L. Recently, because of the availability of an enormous amount of data for ESTs in the public domain, the emphasis has shifted from genomic SSRs to EST-SSRs, which belong to transcribed regions of the genome and may have a role in gene expression or function. The objective of this study was to assess the polymorphisms among 94 F1 hybrids from "Early Rose" and "Red Globe" using 25 EST-derived and 25 non-EST SSR markers. A total of 362,375 grape ESTs were retrieved from the National Center for Biotechnology Information (NCBI), and 2522 EST-SSR sequences were identified. From them, 205 primer pairs were randomly selected, including 176 pairs that were EST-derived and 29 non-EST SSR primer pairs, for polymerase chain reaction amplification. A total of 131 alleles were amplified using 50 pairs of primers; 78 alleles were amplified using EST-derived SSR primers and 53 were from non-EST SSR primers. At most, 6 and 5 alleles were amplified by EST-derived and non-EST SSR primers, respectively. The EST-derived SSR markers showed a maximum polymorphic information content (PIC) value of 1 and a minimum of 0.33, while non-EST SSR markers had maximum and minimum PIC values of 1 and 0.25, respectively. The average PIC value was 0.56 for EST-derived SSR markers and 0.45 for non-EST SSR markers.
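
    The record reports polymorphic information content (PIC) values without stating how they were computed. A commonly used definition (Botstein et al., 1980) is PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at one locus. The sketch below assumes that definition; the study may have used a different estimator, and the allele frequencies shown are made up for illustration.

```python
from itertools import combinations

def pic(freqs):
    """Polymorphic information content for one locus (Botstein-style formula, assumed)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    # Correction term for matings between identical heterozygotes.
    cross = sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross

# Hypothetical allele frequencies at one SSR locus.
print(pic([0.5, 0.5]))        # two equally frequent alleles
print(pic([0.4, 0.3, 0.3]))   # three alleles
```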

  17. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons

    PubMed Central

    2011-01-01

    Background Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. Results BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. Conclusions

  18. Microcomputer-Assisted Mathematics: From Simple Interest to e.

    ERIC Educational Resources Information Center

    Kimberling, Clark

    1985-01-01

    The progression from simple interest to compound interest leads naturally and quickly to the number e, involving mathematical discovery learning through writing programs. Several programs are given, with suggestions for a teaching sequence. (MNS)
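
    The step from compound interest to e can be reproduced numerically: with principal 1 and a 100% annual rate compounded n times a year, the balance after one year is (1 + 1/n)^n, which approaches e as n grows. The programs in the article itself are not reproduced here; this is a minimal Python sketch with hypothetical function names.

```python
import math

def simple_interest(principal, rate, years):
    """Balance under simple interest."""
    return principal * (1 + rate * years)

def compound_interest(principal, rate, years, periods_per_year):
    """Balance when interest is compounded periods_per_year times a year."""
    return principal * (1 + rate / periods_per_year) ** (periods_per_year * years)

for n in (1, 2, 12, 365, 10**6):
    print(n, compound_interest(1.0, 1.0, 1, n))
print("e =", math.e)
```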

  19. Genetic variation and DNA fingerprinting of durian types in Malaysia using simple sequence repeat (SSR) markers.

    PubMed

    Siew, Ging Yang; Ng, Wei Lun; Tan, Sheau Wei; Alitheen, Noorjahan Banu; Tan, Soon Guan; Yeap, Swee Keong

    2018-01-01

    Durian (Durio zibethinus) is one of the most popular tropical fruits in Asia. To date, 126 durian types have been registered with the Department of Agriculture in Malaysia based on phenotypic characteristics. Classification based on morphology is convenient, easy, and fast but it suffers from phenotypic plasticity as a direct result of environmental factors and age. To overcome the limitation of morphological classification, there is a need to carry out genetic characterization of the various durian types. Such data is important for the evaluation and management of durian genetic resources in producing countries. In this study, simple sequence repeat (SSR) markers were used to study the genetic variation in 27 durian types from the germplasm collection of Universiti Putra Malaysia. Based on DNA sequences deposited in GenBank, seven pairs of primers were successfully designed to amplify SSR regions in the durian DNA samples. High levels of variation among the 27 durian types were observed (expected heterozygosity, HE = 0.35). The DNA fingerprinting power of SSR markers revealed by the combined probability of identity (PI) of all loci was 2.3×10⁻³. Unique DNA fingerprints were generated for 21 out of 27 durian types using five polymorphic SSR markers (the other two SSR markers were monomorphic). We further tested the utility of these markers by evaluating the clonal status of shared durian types from different germplasm collection sites, and found that some were not clones. The findings in this preliminary study not only show the feasibility of using SSR markers for DNA fingerprinting of durian types, but also challenge the current classification of durian types, e.g., on whether the different types should be called "clones", "varieties", or "cultivars". Such matters have a direct impact on the regulation and management of durian genetic resources in the region.

  20. Genetic variation and DNA fingerprinting of durian types in Malaysia using simple sequence repeat (SSR) markers

    PubMed Central

    Siew, Ging Yang; Tan, Sheau Wei; Tan, Soon Guan; Yeap, Swee Keong

    2018-01-01

    Durian (Durio zibethinus) is one of the most popular tropical fruits in Asia. To date, 126 durian types have been registered with the Department of Agriculture in Malaysia based on phenotypic characteristics. Classification based on morphology is convenient, easy, and fast but it suffers from phenotypic plasticity as a direct result of environmental factors and age. To overcome the limitation of morphological classification, there is a need to carry out genetic characterization of the various durian types. Such data is important for the evaluation and management of durian genetic resources in producing countries. In this study, simple sequence repeat (SSR) markers were used to study the genetic variation in 27 durian types from the germplasm collection of Universiti Putra Malaysia. Based on DNA sequences deposited in GenBank, seven pairs of primers were successfully designed to amplify SSR regions in the durian DNA samples. High levels of variation among the 27 durian types were observed (expected heterozygosity, HE = 0.35). The DNA fingerprinting power of SSR markers revealed by the combined probability of identity (PI) of all loci was 2.3×10⁻³. Unique DNA fingerprints were generated for 21 out of 27 durian types using five polymorphic SSR markers (the other two SSR markers were monomorphic). We further tested the utility of these markers by evaluating the clonal status of shared durian types from different germplasm collection sites, and found that some were not clones. The findings in this preliminary study not only show the feasibility of using SSR markers for DNA fingerprinting of durian types, but also challenge the current classification of durian types, e.g., on whether the different types should be called “clones”, “varieties”, or “cultivars”. Such matters have a direct impact on the regulation and management of durian genetic resources in the region. PMID:29511604
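
    The two summary statistics quoted in this record can be sketched from allele frequencies under the usual textbook definitions: expected heterozygosity HE = 1 - sum(p_i^2) per locus, and probability of identity PI = sum(p_i^4) + sum over i<j of (2 p_i p_j)^2 per locus, with the combined PI taken as the product over loci (which assumes independent loci). The study's exact estimators are not given in the abstract, and the frequencies below are hypothetical.

```python
import math
from itertools import combinations

def expected_heterozygosity(freqs):
    """HE = 1 - sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def probability_of_identity(freqs):
    """Chance that two unrelated individuals share the same genotype at one locus."""
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes

def combined_pi(loci):
    """Multiply per-locus PI values; valid only if loci are independent (assumption)."""
    return math.prod(probability_of_identity(freqs) for freqs in loci)

# Hypothetical allele frequencies for three loci.
loci = [[0.5, 0.5], [0.6, 0.3, 0.1], [0.25, 0.25, 0.25, 0.25]]
print([round(expected_heterozygosity(f), 3) for f in loci])
print(combined_pi(loci))
```

    A monomorphic locus has HE = 0 and PI = 1, so it adds nothing to fingerprinting power.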

  1. Recursive sequences in first-year calculus

    NASA Astrophysics Data System (ADS)

    Krainer, Thomas

    2016-02-01

    This article provides ready-to-use supplementary material on recursive sequences for a second-semester calculus class. It equips first-year calculus students with a basic methodical procedure based on which they can conduct a rigorous convergence or divergence analysis of many simple recursive sequences on their own without the need to invoke inductive arguments as is typically required in calculus textbooks. The sequences that are accessible to this kind of analysis are predominantly (eventually) monotonic, but also certain recursive sequences that alternate around their limit point as they converge can be considered.
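
    The two kinds of behaviour mentioned in this record, monotonic convergence and convergence that alternates around the limit, can be observed numerically before any proof is attempted. The two recursions below are standard textbook examples chosen for illustration; they are not taken from the article.

```python
import math

def iterate(f, a0, n):
    """Return [a_0, a_1, ..., a_n] for the recursion a_{k+1} = f(a_k)."""
    terms = [a0]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms

# Monotonic example: a_{k+1} = sqrt(2 + a_k), a_0 = 0, increasing with limit 2.
monotone = iterate(lambda a: math.sqrt(2 + a), 0.0, 40)

# Alternating example: a_{k+1} = 1 / (1 + a_k), a_0 = 1,
# oscillating around the limit (sqrt(5) - 1) / 2.
alternating = iterate(lambda a: 1 / (1 + a), 1.0, 40)

print(monotone[:5], monotone[-1])
print(alternating[:5], alternating[-1])
```

    Numerical output of this kind only suggests a limit; the convergence argument itself still has to be carried out analytically.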

  2. A Simple View of Writing in Chinese

    ERIC Educational Resources Information Center

    Yeung, Pui-sze; Ho, Connie Suk-han; Chan, David Wai-ock; Chung, Kevin Kien-hoa

    2017-01-01

    This study examined the Chinese written composition development of elementary-grade students in relation to the simple view of writing. Measures of nonverbal reasoning ability, component skills of transcription (stroke sequence knowledge, word spelling, and handwriting fluency), oral language (definitional skill, oral narrative skills, and…

  3. Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features.

    PubMed

    García-Jiménez, Beatriz; Pons, Tirso; Sanchis, Araceli; Valencia, Alfonso

    2014-01-01

    Biological pathways are important elements of systems biology and in the past decade, an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of it remains functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that mainly relies on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins to 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance and Integrin cell surface interactions pathways are discussed in detail. All the human predicted proteins in the 2009 and 2012 releases 30 and 40 of Reactome are available at http://rle.bioinfo.cnio.es.

  4. Evaluation of genetic diversity amongst Descurainia sophia L. genotypes by inter-simple sequence repeat (ISSR) marker.

    PubMed

    Saki, Sahar; Bagheri, Hedayat; Deljou, Ali; Zeinalabedini, Mehrshad

    2016-01-01

    Descurainia sophia is a valuable medicinal plant in the family Brassicaceae. To determine the range of diversity amongst D. sophia in Iran, 32 naturally distributed plants belonging to six natural populations of the Iranian plateau were investigated by inter-simple sequence repeat (ISSR) markers. The average percentage of polymorphism produced by 12 ISSR primers was 86%. The PIC values for primers ranged from 0.22 to 0.40 and Rp values ranged between 6.5 and 19.9. The relative genetic diversity of the populations was not high (Gst = 0.32). However, the value of gene flow revealed by the ISSR marker was high (Nm = 1.03). The UPGMA clustering method based on the Jaccard similarity coefficient grouped the genotypes into two major clusters. Graph results from a Neighbor-Net network generated after a 1000-replicate bootstrap test using the Jaccard coefficient, and STRUCTURE analysis, confirmed the UPGMA clustering. The first three PCAs represented 57.31% of the total variation. High levels of genetic diversity were observed within populations, which is useful in breeding and conservation programs. ISSR was found to be a suitable marker to study genetic diversity of D. sophia.
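
    The Jaccard similarity coefficient used for the clustering in this record is simple to state for dominant markers such as ISSR: for two genotypes scored as band present (1) or absent (0), it is the number of bands present in both divided by the number present in at least one. The sketch below uses made-up band profiles; returning 1.0 when neither genotype has any band is an assumed convention.

```python
def jaccard(profile_a, profile_b):
    """Jaccard similarity between two 0/1 band profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    present = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Shared absences (0, 0) are ignored, which suits dominant markers.
    return shared / present if present else 1.0  # empty-profile convention is an assumption

# Hypothetical band scores for two genotypes across five ISSR bands.
g1 = [1, 1, 0, 0, 1]
g2 = [1, 0, 0, 1, 1]
print(jaccard(g1, g2), 1 - jaccard(g1, g2))  # similarity and the corresponding distance
```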

  5. Compression-based distance (CBD): a simple, rapid, and accurate method for microbiota composition comparison

    PubMed Central

    2013-01-01

    Background Perturbations in intestinal microbiota composition have been associated with a variety of gastrointestinal tract-related diseases. The alleviation of symptoms has been achieved using treatments that alter the gastrointestinal tract microbiota toward that of healthy individuals. Identifying differences in microbiota composition through the use of 16S rRNA gene hypervariable tag sequencing has profound health implications. Current computational methods for comparing microbial communities are usually based on multiple alignments and phylogenetic inference, making them time consuming and requiring exceptional expertise and computational resources. As sequencing data rapidly grows in size, simpler analysis methods are needed to meet the growing computational burdens of microbiota comparisons. Thus, we have developed a simple, rapid, and accurate method, independent of multiple alignments and phylogenetic inference, to support microbiota comparisons. Results We create a metric, called compression-based distance (CBD) for quantifying the degree of similarity between microbial communities. CBD uses the repetitive nature of hypervariable tag datasets and well-established compression algorithms to approximate the total information shared between two datasets. Three published microbiota datasets were used as test cases for CBD as an applicable tool. Our study revealed that CBD recaptured 100% of the statistically significant conclusions reported in the previous studies, while achieving a decrease in computational time required when compared to similar tools without expert user intervention. Conclusion CBD provides a simple, rapid, and accurate method for assessing distances between gastrointestinal tract microbiota 16S hypervariable tag datasets. PMID:23617892
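
    The idea of using a compressor to approximate the information shared between two datasets can be illustrated with the widely known normalized compression distance (NCD). This is a related general-purpose measure swapped in for illustration, not necessarily the exact CBD formula of the paper; the choice of zlib and the toy sequences are assumptions.

```python
import random
import zlib

def compressed_size(data):
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))

def ncd(x, y):
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy, cxy = compressed_size(x), compressed_size(y), compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def random_tags(n, seed):
    """Toy stand-in for a tag dataset: n random bases as bytes."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(n)).encode()

a = random_tags(2000, seed=1)
b = bytearray(a)
for pos in range(0, len(b), 200):   # introduce a few point changes into a copy of a
    b[pos] = ord("N")
b = bytes(b)
c = random_tags(2000, seed=2)       # unrelated dataset

print(ncd(a, b), ncd(a, c))
```

    No alignment or phylogeny is needed: the compressor finds the shared substrings when the two inputs are concatenated, which is what makes this family of measures fast.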

  6. Integer sequence discovery from small graphs

    PubMed Central

    Hoppe, Travis; Petrone, Anna

    2015-01-01

    We have exhaustively enumerated all simple, connected graphs of a finite order and have computed a selection of invariants over this set. Integer sequences were constructed from these invariants and checked against the Online Encyclopedia of Integer Sequences (OEIS). 141 new sequences were added and six sequences were extended. From the graph database, we were able to programmatically suggest relationships among the invariants. It will be shown that we can readily visualize any sequence of graphs with a given criteria. The code has been released as an open-source framework for further analysis and the database was constructed to be extensible to invariants not considered in this work. PMID:27034526
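
    For very small orders, the enumeration described in this record can be reproduced by brute force: generate every edge set on n labelled vertices, keep the connected ones, and collapse isomorphic copies by taking the smallest relabelled edge list over all vertex permutations. This is only a toy sketch, not the authors' released framework, and it is practical only up to about n = 5.

```python
from itertools import combinations, permutations

def is_connected(n, edges):
    """Depth-first search from vertex 0 reaches every vertex."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

def canonical_form(n, edges):
    """Smallest sorted edge list over all relabellings; equal for isomorphic graphs."""
    best = None
    for perm in permutations(range(n)):
        relabelled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabelled < best:
            best = relabelled
    return best

def count_connected_graphs(n):
    """Number of simple connected graphs on n unlabelled vertices."""
    pairs = list(combinations(range(n), 2))
    classes = set()
    for mask in range(1 << len(pairs)):
        edges = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        if is_connected(n, edges):
            classes.add(canonical_form(n, edges))
    return len(classes)

print([count_connected_graphs(n) for n in range(1, 6)])
```

    The resulting counts form exactly the kind of integer sequence that can be looked up in the OEIS; other invariants would be computed per isomorphism class in the same loop.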

  7. A Simple and Efficient Methodology To Improve Geometric Accuracy in Gamma Knife Radiation Surgery: Implementation in Multiple Brain Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Karaiskos, Pantelis, E-mail: pkaraisk@med.uoa.gr; Gamma Knife Department, Hygeia Hospital, Athens; Moutsatsos, Argyris

    Purpose: To propose, verify, and implement a simple and efficient methodology for the improvement of total geometric accuracy in multiple brain metastases gamma knife (GK) radiation surgery. Methods and Materials: The proposed methodology exploits the directional dependence of magnetic resonance imaging (MRI)-related spatial distortions stemming from background field inhomogeneities, also known as sequence-dependent distortions, with respect to the read-gradient polarity during MRI acquisition. First, an extra MRI pulse sequence is acquired with the same imaging parameters as those used for routine patient imaging, aside from a reversal in the read-gradient polarity. Then, “average” image data are compounded from data acquired from the 2 MRI sequences and are used for treatment planning purposes. The method was applied and verified in a polymer gel phantom irradiated with multiple shots in an extended region of the GK stereotactic space. Its clinical impact in dose delivery accuracy was assessed in 15 patients with a total of 96 relatively small (<2 cm) metastases treated with GK radiation surgery. Results: Phantom study results showed that use of average MR images eliminates the effect of sequence-dependent distortions, leading to a total spatial uncertainty of less than 0.3 mm, attributed mainly to gradient nonlinearities. In brain metastases patients, non-eliminated sequence-dependent distortions lead to target localization uncertainties of up to 1.3 mm (mean: 0.51 ± 0.37 mm) with respect to the corresponding target locations in the “average” MRI series. Due to these uncertainties, a considerable underdosage (5%-32% of the prescription dose) was found in 33% of the studied targets. Conclusions: The proposed methodology is simple and straightforward in its implementation. Regarding multiple brain metastases applications, the suggested approach may substantially improve total GK dose delivery accuracy in smaller, outlying targets.

  8. Next-Generation Sequencing of the Chrysanthemum nankingense (Asteraceae) Transcriptome Permits Large-Scale Unigene Assembly and SSR Marker Discovery

    PubMed Central

    Wang, Haibin; Jiang, Jiafu; Chen, Sumei; Qi, Xiangyu; Peng, Hui; Li, Pirui; Song, Aiping; Guan, Zhiyong; Fang, Weimin; Liao, Yuan; Chen, Fadi

    2013-01-01

    Background Simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Chrysanthemum is one of the largest genera in the Asteraceae family. Only a few Chrysanthemum expressed sequence tag (EST) sequences have been acquired to date, so the number of available EST-SSR markers is very low. Methodology/Principal Findings Illumina paired-end sequencing technology produced over 53 million sequencing reads from C. nankingense mRNA. The subsequent de novo assembly yielded 70,895 unigenes, of which 45,789 (64.59%) unigenes showed similarity to the sequences in NCBI database. Out of 45,789 sequences, 107 have hits to the Chrysanthemum Nr protein database; 679 and 277 sequences have hits to the database of Helianthus and Lactuca species, respectively. MISA software identified a large number of putative EST-SSRs, allowing 1,788 primer pairs to be designed from the de novo transcriptome sequence and a further 363 from archival EST sequence. Among 100 primer pairs randomly chosen, 81 markers have amplicons and 20 are polymorphic for genotype analysis in Chrysanthemum. The results showed that most (but not all) of the assays were transferable across species and that they exposed a significant amount of allelic diversity. Conclusions/Significance SSR markers acquired by transcriptome sequencing are potentially useful for marker-assisted breeding and genetic analysis in the genus Chrysanthemum and its related genera. PMID:23626799

  9. Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

    PubMed

    Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

    2018-03-01

    Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.

  10. Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes

    PubMed Central

    Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
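
    The counting principle in this record, measuring abundance by the number of distinct barcode pairs rather than by the number of reads, can be sketched as follows. The read tuples, gene names and barcodes are made up, and the error-tolerant barcode decoding described in the abstract is omitted; barcodes are assumed to be already decoded.

```python
from collections import Counter, defaultdict

def digital_counts(reads):
    """Count distinct (5' barcode, 3' barcode) pairs seen for each cDNA.

    reads: iterable of (cdna_id, barcode_5p, barcode_3p) tuples, one per sequenced read.
    """
    pairs_seen = defaultdict(set)
    for cdna_id, barcode_5p, barcode_3p in reads:
        pairs_seen[cdna_id].add((barcode_5p, barcode_3p))
    return {cdna_id: len(pairs) for cdna_id, pairs in pairs_seen.items()}

# Hypothetical reads: geneA has 2 original molecules, one amplified far more than the other.
reads = (
    [("geneA", "B01", "B17")] * 50
    + [("geneA", "B09", "B03")] * 2
    + [("geneB", "B04", "B04")] * 7
)
print(Counter(r[0] for r in reads))  # conventional read counts, distorted by PCR
print(digital_counts(reads))         # molecule counts from unique barcode pairs
```

    Because PCR duplicates carry the same barcode pair as their template, uneven amplification changes the read counts but not the digital counts.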

  11. Organelle Simple Sequence Repeat Markers Help to Distinguish Carpelloid Stamen and Normal Cytoplasmic Male Sterile Sources in Broccoli

    PubMed Central

    Shu, Jinshuai; Liu, Yumei; Li, Zhansheng; Zhang, Lili; Fang, Zhiyuan; Yang, Limei; Zhuang, Mu; Zhang, Yangyong; Lv, Honghao

    2015-01-01

    We previously discovered carpelloid stamens when breeding cytoplasmic male sterile lines in broccoli (Brassica oleracea var. italica). In this study, hybrids and multiple backcrosses were produced from different cytoplasmic male sterile carpelloid stamen sources and maintainer lines. Carpelloid stamens caused dysplasia of the flower structure and led to hooked or coiled siliques with poor seed setting, which were inherited in a maternal fashion. Using four distinct carpelloid stamens and twelve distinct normal stamens from cytoplasmic male sterile sources and one maintainer, we used 21 mitochondrial simple sequence repeat (mtSSR) primers and 32 chloroplast SSR primers to identify a mitochondrial marker, mtSSR2, that can differentiate between the cytoplasm of carpelloid and normal stamens. Thereafter, mtSSR2 was used to identify another 34 broccoli accessions, with an accuracy rate of 100%. Analysis of the polymorphic sequences revealed that the mtSSR2 open reading frame of carpelloid stamen sterile sources had a deletion of 51 bases (encoding 18 amino acids) compared with normal stamen materials. The open reading frame is located in the coding region of orf125 and orf108 of the mitochondrial genomes in Brassica crops and had the highest similarity with Raphanus sativus and Brassica carinata. The current study has not only identified a useful molecular marker to detect the cytoplasm of carpelloid stamens during broccoli breeding, but it also provides evidence that the mitochondrial genome is maternally inherited and provides a basis for studying the effect of the cytoplasm on flower organ development in plants. PMID:26407159

  12. Ultrafast Pulse Sequencing for Fast Projective Measurements of Atomic Hyperfine Qubits

    NASA Astrophysics Data System (ADS)

    Ip, Michael; Ransford, Anthony; Campbell, Wesley

    2015-05-01

    Projective readout of quantum information stored in atomic hyperfine structure typically uses state-dependent CW laser-induced fluorescence. This method requires an often sophisticated imaging system to spatially filter out the background CW laser light. We present an alternative approach that instead uses simple pulse sequences from a mode-locked laser to effect the same state-dependent excitations in less than 1 ns. The resulting atomic fluorescence occurs in the dark, allowing the placement of non-imaging detectors right next to the atom to improve the qubit state detection efficiency and speed. We also discuss methods of Doppler cooling with mode-locked lasers for trapped ions, where the creation of the necessary UV light is often difficult with CW lasers.

  13. Principles of protein folding--a perspective from simple exact models.

    PubMed Central

    Dill, K. A.; Bromberg, S.; Yue, K.; Fiebig, K. M.; Yee, D. P.; Thomas, P. D.; Chan, H. S.

    1995-01-01

    General principles of protein structure, stability, and folding kinetics have recently been explored in computer simulations of simple exact lattice models. These models represent protein chains at a rudimentary level, but they involve few parameters, approximations, or implicit biases, and they allow complete explorations of conformational and sequence spaces. Such simulations have resulted in testable predictions that are sometimes unanticipated: The folding code is mainly binary and delocalized throughout the amino acid sequence. The secondary and tertiary structures of a protein are specified mainly by the sequence of polar and nonpolar monomers. More specific interactions may refine the structure, rather than dominate the folding code. Simple exact models can account for the properties that characterize protein folding: two-state cooperativity, secondary and tertiary structures, and multistage folding kinetics--fast hydrophobic collapse followed by slower annealing. These studies suggest the possibility of creating "foldable" chain molecules other than proteins. The encoding of a unique compact chain conformation may not require amino acids; it may require only the ability to synthesize specific monomer sequences in which at least one monomer type is solvent-averse. PMID:7613459

  14. A simple protocol for combinatorial cyclic depsipeptide libraries sequencing by matrix-assisted laser desorption/ionisation mass spectrometry.

    PubMed

    Gurevich-Messina, Juan M; Giudicessi, Silvana L; Martínez-Ceron, María C; Acosta, Gerardo; Erra-Balsells, Rosa; Cascone, Osvaldo; Albericio, Fernando; Camperi, Silvia A

    2015-01-01

    Short cyclic peptides are of great interest for therapeutic, diagnostic and affinity chromatography applications. The screening of 'one-bead-one-peptide' combinatorial libraries combined with mass spectrometry (MS) is an excellent tool to find peptides with affinity for any target protein. The fragmentation patterns of cyclic peptides are considerably more complex than those of their linear counterparts, and the elucidation of the resulting tandem mass spectra is rather more difficult. Here, we propose a simple protocol for combinatorial cyclic libraries synthesis and ring opening before MS analysis. In this strategy, 4-hydroxymethylbenzoic acid, which forms a benzyl ester with the first amino acid, was used as the linker. A glycolamidic ester group was incorporated after the combinatorial positions by adding glycolic acid. The library synthesis protocol consisted of the following: (i) incorporation of Fmoc-Asp[2-phenylisopropyl (OPp)]-OH to Ala-Gly-oxymethylbenzamide-ChemMatrix, (ii) synthesis of the combinatorial library, (iii) assembly of a glycolic acid, (iv) coupling of an Ala residue at the N-terminus, (v) removal of OPp, (vi) peptide cyclisation through side chain Asp and N-Ala amino terminus and (vii) removal of side chain protecting groups. In order to simultaneously open the ring and release each peptide, benzyl and glycolamidic esters were cleaved with ammonia. Peptide sequences could be deduced from the tandem mass spectra of each single bead evaluated. The strategy herein proposed is suitable for the preparation of one-bead-one-cyclic depsipeptide libraries that can be easily opened for sequencing by matrix-assisted laser desorption/ionisation MS. It employs techniques and reagents frequently used in a broad range of laboratories without special expertise in organic synthesis. Copyright © 2014 European Peptide Society and John Wiley & Sons, Ltd.

  15. Vibrio vulnificus typing based on simple sequence repeats: insights into the biotype 3 group.

    PubMed

    Broza, Yoav Y; Danin-Poleg, Yael; Lerner, Larisa; Broza, Meir; Kashi, Yechezkel

    2007-09-01

    Vibrio vulnificus is an opportunistic, highly invasive human pathogen with worldwide distribution. V. vulnificus strains are commonly divided into three biochemical groups (biotypes), most members of which are pathogenic. Simple sequence repeats (SSR) provide a source of high-level genomic polymorphism used in bacterial typing. Here, we describe the use of variations in mutable SSR loci for accurate and rapid genotyping of V. vulnificus. An in silico screen of the genomes of two V. vulnificus strains revealed thousands of SSR tracts. Twelve SSR with core motifs longer than 5 bp in a panel of 32 characterized and 56 other V. vulnificus isolates, including both clinical and environmental isolates from all three biotypes, were tested for polymorphism. All tested SSR were polymorphic, and diversity indices ranged from 0.17 to 0.90, allowing a high degree of discrimination among isolates (27 of 32 characterized isolates). Genetic analysis of the SSR data resulted in the clear distinction of isolates that belong to the highly virulent biotype 3 group. Despite the clonal nature of this new group, SSR analysis demonstrated high-level discriminatory power within the biotype 3 group, as opposed to other molecular methods that failed to differentiate these isolates. Thus, SSR are suitable for rapid typing and classification of V. vulnificus strains by high-throughput capillary electrophoresis methods. SSR (≥5 bp) by their nature enable the identification of variations occurring on a small scale and, therefore, may provide new insights into the newly emerged biotype 3 group of V. vulnificus and may be used as an efficient tool in epidemiological studies.
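
    An in silico screen for SSR tracts of the kind mentioned here can be sketched with a back-referencing regular expression. This is not the authors' pipeline: the function name find_ssrs, the motif-length bounds, and the minimum repeat count are illustrative assumptions, and the sketch does not filter low-complexity motifs such as homopolymer runs.

```python
import re


def find_ssrs(seq, min_motif=5, max_motif=10, min_repeats=3):
    """Return (start, motif, repeat_count) for tandem repeats in `seq`.

    The thresholds are illustrative defaults, not values from the study.
    """
    # Lazy motif group: the shortest motif in range that tiles the tract wins.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Toy sequence: four copies of a 5-bp motif between short flanks.
toy = "TTT" + "ACGTA" * 4 + "GGG"
ssrs = find_ssrs(toy)
```

    Comparing the repeat count at the same locus across isolates is what makes such tracts usable for typing; the sketch only covers the discovery step.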

  16. Analysis of genetic relationships and identification of lily cultivars based on inter-simple sequence repeat markers.

    PubMed

    Cui, G F; Wu, L F; Wang, X N; Jia, W J; Duan, Q; Ma, L L; Jiang, Y L; Wang, J H

    2014-07-29

    Inter-simple sequence repeat (ISSR) markers were used to discriminate 62 lily cultivars of 5 hybrid series. Eight ISSR primers generated 104 bands in total, which all showed 100% polymorphism, and an average of 13 bands were amplified by each primer. Two software packages, POPGENE 1.32 and NTSYSpc 2.1, were used to analyze the data matrix. Our results showed that the observed number of alleles (NA), effective number of alleles (NE), Nei's genetic diversity (H), and Shannon's information index (I) were 1.9630, 1.4179, 0.2606, and 0.4080, respectively. The highest genetic similarity (0.9601) was observed between the Oriental x Trumpet and Oriental lilies, which indicated that the two hybrids had a close genetic relationship. An unweighted pair-group method with arithmetic means dendrogram showed that the 62 lily cultivars clustered into two discrete groups. The first group included the Oriental and OT cultivars, while the Asiatic, LA, and Longiflorum lilies were placed in the second cluster. The distribution of individuals in the principal component analysis was consistent with the clustering of the dendrogram. Fingerprints of all lily cultivars built from 8 primers could be separated completely. This study confirmed the effect and efficiency of ISSR identification in lily cultivars.
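
    The summary statistics quoted in this abstract (NE, H, I) follow from allele frequencies at each locus. The sketch below uses the textbook definitions and assumes the frequencies are already known; for dominant markers such as ISSR, software has to estimate them from band presence or absence, which is not shown. The function names are hypothetical and this is not a reimplementation of POPGENE.

```python
import math


def locus_stats(freqs):
    """Return (effective alleles, Nei's gene diversity, Shannon index)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    n_effective = 1.0 / homozygosity           # NE = 1 / sum(p_i^2)
    nei_h = 1.0 - homozygosity                 # H = 1 - sum(p_i^2)
    shannon_i = -sum(p * math.log(p) for p in freqs if p > 0)
    return n_effective, nei_h, shannon_i


def mean_stats(loci):
    """Average each statistic over a list of per-locus frequency lists."""
    per_locus = [locus_stats(freqs) for freqs in loci]
    return tuple(sum(col) / len(per_locus) for col in zip(*per_locus))


# Toy data: one evenly split locus and one monomorphic locus.
ne, h, i = locus_stats([0.5, 0.5])
mean_ne, mean_h, mean_i = mean_stats([[0.5, 0.5], [1.0]])
```

    A two-allele locus can contribute at most H = 0.5 and I = ln 2, which gives a sense of scale for the averages reported above.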

  17. "First generation" automated DNA sequencing technology.

    PubMed

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  18. Mapping QTL for popping expansion volume in popcorn with simple sequence repeat markers.

    PubMed

    Lu, H-J; Bernardo, R; Ohm, H W

    2003-02-01

    Popping expansion volume is the most important quality trait in popcorn (Zea mays L.), but its genetics is not well understood. The objectives of this study were to map quantitative trait loci (QTLs) responsible for popping expansion volume in a popcorn x dent corn cross, and to compare the predicted efficiencies of phenotypic selection, marker-based selection, and marker-assisted selection for popping expansion volume. Of 259 simple sequence repeat (SSR) primer pairs screened, 83 pairs were polymorphic between the H123 (dent corn) and AG19 (popcorn) parental inbreds. Popping test data were obtained for 160 S(1) families developed from the [AG19(H123 x AG19)] BC(1) population. The heritability (h(2)) for popping expansion volume on an S(1) family mean basis was 0.73. The presence of the gametophyte factor Ga1(s) in popcorn complicates the analysis of popcorn x dent corn crosses. But, from a practical perspective, the linkage between a favorable QTL allele and Ga1(s) in popcorn will lead to selection for the favorable QTL allele. Four QTLs, on chromosomes 1S, 3S, 5S and 5L, jointly explained 45% of the phenotypic variation. Marker-based selection for popping expansion volume would require less time and work than phenotypic selection. But due to the high h(2) of popping expansion volume, marker-based selection was predicted to be only 92% as efficient as phenotypic selection. Marker-assisted selection, which comprises index selection on phenotypic and marker scores, was predicted to be 106% as efficient as phenotypic selection. Overall, our results suggest that phenotypic selection will remain the preferred method for selection in popcorn x dent corn crosses.

  19. The characterization of a new set of EST-derived simple sequence repeat (SSR) markers as a resource for the genetic analysis of Phaseolus vulgaris

    PubMed Central

    2011-01-01

    Background: Over recent years, a growing effort has been made to develop microsatellite markers for the genomic analysis of the common bean (Phaseolus vulgaris) to broaden the knowledge of the molecular genetic basis of this species. The availability of large sets of expressed sequence tags (ESTs) in public databases has given rise to an expedient approach for the identification of SSRs (Simple Sequence Repeats), specifically EST-derived SSRs. In the present work, a battery of new microsatellite markers was obtained from a search of the Phaseolus vulgaris EST database. The diversity, degree of transferability and polymorphism of these markers were tested. Results: From 9,583 valid ESTs, 4,764 had microsatellite motifs, from which 377 were used to design primers, and 302 (80.11%) showed good amplification quality. To analyze transferability, a group of 167 SSRs were tested, and the results showed that they were 82% transferable across at least one species. The highest amplification rates were observed between the species from the Phaseolus (63.7%), Vigna (25.9%), Glycine (19.8%), Medicago (10.2%), Dipterix (6%) and Arachis (1.8%) genera. The average PIC (Polymorphism Information Content) varied from 0.53 for genomic SSRs to 0.47 for EST-SSRs, and the average number of alleles per locus was 4 and 3, respectively. Among the 315 newly tested SSRs in the BJ (BAT93 X Jalo EEP558) population, 24% (76) were polymorphic. The integration of these segregant loci into a framework map composed of 123 previously obtained SSR markers yielded a total of 199 segregant loci, of which 182 (91.5%) were mapped to 14 linkage groups, resulting in a map length of 1,157 cM. Conclusions: A total of 302 newly developed EST-SSR markers, showing good amplification quality, are available for the genetic analysis of Phaseolus vulgaris. These markers showed satisfactory rates of transferability, especially between species that have great economic and genomic values. Their diversity was comparable to

  20. Molecular beacon sequence design algorithm.

    PubMed

    Monroe, W Todd; Haselton, Frederick R

    2003-01-01

    A method based on Web-based tools is presented to design optimally functioning molecular beacons. Molecular beacons, fluorogenic hybridization probes, are a powerful tool for the rapid and specific detection of a particular nucleic acid sequence. However, their synthesis costs can be considerable. Since molecular beacon performance is based on its sequence, it is imperative to rationally design an optimal sequence before synthesis. The algorithm presented here uses simple Microsoft Excel formulas and macros to rank candidate sequences. This analysis is carried out using mfold structural predictions along with other free Web-based tools. For smaller laboratories where molecular beacons are not the focus of research, the public domain algorithm described here may be usefully employed to aid in molecular beacon design.

  1. Adenine specific DNA chemical sequencing reaction.

    PubMed Central

    Iverson, B L; Dervan, P B

    1987-01-01

    Reaction of DNA with K2PdCl4 at pH 2.0 followed by a piperidine workup produces specific cleavage at adenine (A) residues. Product analysis revealed the K2PdCl4 reaction involves selective depurination at adenine, affording an excision reaction analogous to the other chemical DNA sequencing reactions. Adenine residues methylated at the exocyclic amine (N6) react with lower efficiency than unmethylated adenine in an identical sequence. This simple protocol specific for A may be a useful addition to current chemical sequencing reactions. PMID:3671067

  2. Rapid evaluation and quality control of next generation sequencing data with FaQCs

    DOE PAGES

    Lo, Chien-Chi; Chain, Patrick S. G.

    2014-12-01

    Background: Next generation sequencing (NGS) technologies that parallelize the sequencing process and produce thousands to millions, or even hundreds of millions of sequences in a single sequencing run, have revolutionized genomic and genetic research. Because of the vagaries of any platform's sequencing chemistry, the experimental processing, machine failure, and so on, the quality of sequencing reads is never perfect, and often declines as the read is extended. These errors invariably affect downstream analysis/application and should therefore be identified early on to mitigate any unforeseen effects. Results: Here we present a novel FastQ Quality Control Software (FaQCs) that can rapidly process large volumes of data, and which improves upon previous solutions to monitor the quality and remove poor quality data from sequencing runs. Both the speed of processing and the memory footprint of storing all required information have been optimized via algorithmic and parallel processing solutions. The trimmed output compared side-by-side with the original data is part of the automated PDF output. We show how this tool can help data analysis by providing a few examples, including an increased percentage of reads recruited to references, improved single nucleotide polymorphism identification as well as de novo sequence assembly metrics. Conclusion: FaQCs combines several features of currently available applications into a single, user-friendly process, and includes additional unique capabilities such as filtering the PhiX control sequences, conversion of FASTQ formats, and multi-threading. The original data and trimmed summaries are reported within a variety of graphics and reports, providing a simple way to do data quality control and assurance.

  3. Parkin dosage mutations have greater pathogenicity in familial PD than simple sequence mutations

    PubMed Central

    Pankratz, N; Kissell, D K.; Pauciulo, M W.; Halter, C A.; Rudolph, A; Pfeiffer, R F.; Marder, K S.; Foroud, T; Nichols, W C.

    2009-01-01

    Objective: Mutations in both alleles of parkin have been shown to result in Parkinson disease (PD). However, it is unclear whether haploinsufficiency (presence of a mutation in only 1 of the 2 parkin alleles) increases the risk for PD. Methods: We performed comprehensive dosage and sequence analysis of all 12 exons of parkin in a sample of 520 independent patients with familial PD and 263 controls. We evaluated whether presence of a single parkin mutation, either a sequence (point mutation or small insertion/deletion) or dosage (whole exon deletion or duplication) mutation, was found at increased frequency in cases as compared with controls. We then compared the clinical characteristics of cases with 0, 1, or 2 parkin mutations. Results: We identified 55 independent patients with PD with at least 1 parkin mutation and 9 controls with a single sequence mutation. Cases and controls had a similar frequency of single sequence mutations (3.1% vs 3.4%, p = 0.83); however, the cases had a significantly higher rate of dosage mutations (2.6% vs 0%, p = 0.009). Cases with a single dosage mutation were more likely to have an earlier age at onset (50% with onset at ≤45 years) compared with those with no parkin mutations (10%, p = 0.00002); this was not true for cases with only a single sequence mutation (25% with onset at ≤45 years, p = 0.06). Conclusions: Parkin haploinsufficiency, specifically for a dosage mutation rather than a point mutation or small insertion/deletion, is a risk factor for familial PD and may be associated with earlier age at onset. GLOSSARY ADL = Activities of Daily Living; GDS = Geriatric Depression Scale; MLPA = multiplex ligation-dependent probe amplification; MMSE = Mini-Mental State Examination; PD = Parkinson disease; UPDRS = Unified Parkinson’s Disease Rating Scale. PMID:19636047

  4. Genetic diversity of an Azorean endemic and endangered plant species inferred from inter-simple sequence repeat markers.

    PubMed

    Lopes, Maria S; Mendonça, Duarte; Bettencourt, Sílvia X; Borba, Ana R; Melo, Catarina; Baptista, Cláudio; da Câmara Machado, Artur

    2014-06-26

    Knowledge of the levels and distribution of genetic diversity is important for designing conservation strategies for threatened and endangered species so as to guarantee sustainable survival of populations and to preserve their evolutionary potential. Picconia azorica is a valuable Azorean endemic species recently classified as endangered. To contribute information useful for the establishment of conservation programmes, the genetic variability and differentiation among 230 samples from 11 populations collected in three Azorean islands was assessed with eight inter-simple sequence repeat markers. A total of 64 polymorphic loci were detected. The majority of genetic variability was found within populations and no genetic structure was detected between populations and between islands. Also the coefficient of genetic differentiation and the level of gene flow indicate that geographical distances do not act as barriers for gene flow. In order to ensure the survival of populations, in situ and ex situ management practices should be considered, including artificial propagation through the use of plant tissue culture techniques, not only for the restoration of habitat but also for the sustainable use of its valuable wood. Published by Oxford University Press on behalf of the Annals of Botany Company.

  5. A Simple Acronym for Doing Calculus: CAL

    ERIC Educational Resources Information Center

    Hathaway, Richard J.

    2008-01-01

    An acronym is presented that provides students a potentially useful, unifying view of the major topics covered in an elementary calculus sequence. The acronym (CAL) is based on viewing the calculus procedure for solving a calculus problem P* in three steps: (1) recognizing that the problem cannot be solved using simple (non-calculus) techniques;…

  6. Label-free fluorescence strategy for sensitive detection of adenosine triphosphate using a loop DNA probe with low background noise.

    PubMed

    Lin, Chunshui; Cai, Zhixiong; Wang, Yiru; Zhu, Zhi; Yang, Chaoyong James; Chen, Xi

    2014-07-15

    A simple, rapid, label-free, and ultrasensitive fluorescence strategy for adenosine triphosphate (ATP) detection was developed using a loop DNA probe with low background noise. In this strategy, a loop DNA probe, which is the substrate for both ligation and digestion enzyme reaction, was designed. SYBR green I (SG I), a double-stranded specific dye, was applied for the readout fluorescence signal. Exonuclease I (Exo I) and exonuclease III (Exo III), sequence-independent nucleases, were selected to digest the loop DNA probe in order to minimize the background fluorescence signal. As a result, in the absence of ATP, the loop DNA was completely digested by Exo I and Exo III, leading to low background fluorescence owing to the weak electrostatic interaction between SG I and mononucleotides. On the other hand, ATP induced the ligation of the nicking site, and the sealed loop DNA resisted the digestion of Exo I and Exo III, resulting in a remarkable increase of fluorescence response. Upon background noise reduction, the sensitivity of the ATP determination was improved significantly, and the detection limit was found to be 1.2 pM, which is much lower than that in almost all the previously reported methods. This strategy has promise for wide application in the determination of ATP.

  7. Research Techniques Made Simple: Single-Cell RNA Sequencing and its Applications in Dermatology.

    PubMed

    Wu, Xiaojun; Yang, Bin; Udo-Inyang, Imo; Ji, Suyun; Ozog, David; Zhou, Li; Mi, Qing-Sheng

    2018-05-01

    RNA sequencing is one of the most highly reliable and reproducible methods of assessing the cell transcriptome. As high-throughput RNA sequencing libraries at the single cell level have recently developed, single cell RNA sequencing has become more feasible and popular in biology research. Single cell RNA sequencing allows investigators to evaluate cell transcriptional profiles at the single cell level. It has become a very useful tool to perform investigations that could not be addressed by other methodologies, such as the assessment of cell-to-cell variation, the identification of rare populations, and the determination of heterogeneity within a cell population. So far, the single cell RNA sequencing technique has been widely applied to embryonic development, immune cell development, and human disease progress and treatment. Here, we describe the history of single cell technology development and its potential application in the field of dermatology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

    PubMed Central

    2010-01-01

    Background: Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods: A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results: A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84 were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion: The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644

  9. Complex Sequencing Rules of Birdsong Can be Explained by Simple Hidden Markov Processes

    PubMed Central

    Katahira, Kentaro; Suzuki, Kenta; Okanoya, Kazuo; Okada, Masato

    2011-01-01

    Complex sequencing rules observed in birdsongs provide an opportunity to investigate the neural mechanism for generating complex sequential behaviors. To relate the findings from studying birdsongs to other sequential behaviors such as human speech and musical performance, it is crucial to characterize the statistical properties of the sequencing rules in birdsongs. However, the properties of the sequencing rules in birdsongs have not yet been fully addressed. In this study, we investigate the statistical properties of the complex birdsong of the Bengalese finch (Lonchura striata var. domestica). Based on manually annotated syllable labels, we first show that there are significant higher-order context dependencies in Bengalese finch songs, that is, which syllable appears next depends on more than one previous syllable. We then analyze acoustic features of the song and show that higher-order context dependencies can be explained using first-order hidden state transition dynamics with redundant hidden states. This model corresponds to hidden Markov models (HMMs), well-known statistical models with a wide range of applications for time series modeling. The song annotation with these models with first-order hidden state dynamics agreed well with manual annotation, the score was comparable to that of a second-order HMM, and surpassed the zeroth-order model (the Gaussian mixture model; GMM), which does not use context information. Our results imply that the hierarchical representation with hidden state dynamics may underlie the neural implementation for generating complex behavioral sequences with higher-order dependencies. PMID:21915345
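
    The central claim, that first-order dynamics over redundant hidden states can produce higher-order dependencies among observed syllables, can be shown with a toy example. The four-state chain below is a deterministic illustration invented for this sketch, not a model fitted to finch song: two hidden states (a1, a2) emit the same syllable "a", so the syllable following "a" looks ambiguous at first order but is fully determined at second order.

```python
from collections import Counter, defaultdict

# Hypothetical toy chain: each hidden state has exactly one successor.
TRANSITIONS = {"a1": "b", "b": "a2", "a2": "c", "c": "a1"}
# Redundant hidden states a1 and a2 emit the same observable syllable.
EMISSIONS = {"a1": "a", "a2": "a", "b": "b", "c": "c"}


def generate(n, state="a1"):
    """Emit n syllables by walking the first-order hidden chain."""
    syllables = []
    for _ in range(n):
        syllables.append(EMISSIONS[state])
        state = TRANSITIONS[state]
    return syllables


def next_syllable_counts(seq, order):
    """Count which syllable follows each context of `order` syllables."""
    counts = defaultdict(Counter)
    for i in range(order, len(seq)):
        counts[tuple(seq[i - order:i])][seq[i]] += 1
    return counts


song = generate(400)                      # a b a c a b a c ...
first_order = next_syllable_counts(song, 1)
second_order = next_syllable_counts(song, 2)
```

    Here first_order[("a",)] contains both "b" and "c", while second_order[("b", "a")] contains only "c" and second_order[("c", "a")] only "b". A real HMM would use transition and emission probabilities rather than fixed successors.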

  10. Noise correlations in cosmic microwave background experiments

    NASA Technical Reports Server (NTRS)

    Dodelson, Scott; Kosowsky, Arthur; Myers, Steven T.

    1995-01-01

    Many analyses of microwave background experiments neglect the correlation of noise in different frequency or polarization channels. We show that these correlations, should they be present, can lead to severe misinterpretation of an experiment. In particular, correlated noise arising from either electronics or atmosphere may mimic a cosmic signal. We quantify how the likelihood function for a given experiment varies with noise correlation, using both simple analytic models and actual data. For a typical microwave background anisotropy experiment, noise correlations at the level of 1% of the overall noise can seriously reduce the significance of a given detection.

  11. Object tracking via background subtraction for monitoring illegal activity in crossroad

    NASA Astrophysics Data System (ADS)

    Ghimire, Deepak; Jeong, Sunghwan; Park, Sang Hyun; Lee, Joonwhoan

    2016-07-01

    In the field of intelligent transportation systems, a great number of vision-based techniques have been proposed to prevent pedestrians from being hit by vehicles. This paper presents a system that can perform pedestrian and vehicle detection and monitoring of illegal activity at zebra crossings. At a zebra crossing, to fully avoid a collision, a driver or pedestrian should be warned early if they make any illegal move with respect to the traffic light status. In this research, we first detect the status of the pedestrian traffic light and monitor the crossroad for vehicle and pedestrian movements. Background-subtraction-based object detection and tracking is performed to detect pedestrians and vehicles at crossroads. Shadow removal, blob segmentation, trajectory analysis, etc. are used to improve the object detection and classification performance. We run the experiment on several video sequences recorded at different times and in different environments, such as daytime and nighttime, and sunny and rainy conditions. Our experimental results show that such a simple and efficient technique can be used successfully as a traffic surveillance system to prevent accidents at zebra crossings.

  12. SEXCMD: Development and validation of sex marker sequences for whole-exome/genome and RNA sequencing.

    PubMed

    Jeong, Seongmun; Kim, Jiwoong; Park, Won; Jeon, Hongmin; Kim, Namshin

    2017-01-01

    Over the last decade, a large number of nucleotide sequences have been generated by next-generation sequencing technologies and deposited to public databases. However, most of these datasets do not specify the sex of individuals sampled because researchers typically ignore or hide this information. Male and female genomes in many species have distinctive sex chromosomes, XX/XY and ZW/ZZ, and expression levels of many sex-related genes differ between the sexes. Herein, we describe how to develop sex marker sequences from syntenic regions of sex chromosomes and use them to quickly identify the sex of individuals being analyzed. Array-based technologies routinely use either known sex markers or the B-allele frequency of X or Z chromosomes to deduce the sex of an individual. The same strategy has been used with whole-exome/genome sequence data; however, all reads must be aligned onto a reference genome to determine the B-allele frequency of the X or Z chromosomes. SEXCMD is a pipeline that can extract sex marker sequences from reference sex chromosomes and rapidly identify the sex of individuals from whole-exome/genome and RNA sequencing after training with a known dataset through a simple machine learning approach. The pipeline counts total numbers of hits from sex-specific marker sequences and identifies the sex of the individuals sampled based on the fact that XX/ZZ samples do not have Y or W chromosome hits. We have successfully validated our pipeline with mammalian (Homo sapiens; XY) and avian (Gallus gallus; ZW) genomes. Typical calculation time when applying SEXCMD to human whole-exome or RNA sequencing datasets is a few minutes, and analyzing human whole-genome datasets takes about 10 minutes. Another important application of SEXCMD is as a quality control measure to avoid mixing samples before bioinformatics analysis. SEXCMD comprises simple Python and R scripts and is freely available at https://github.com/lovemun/SEXCMD.
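
    The decision rule described here, counting hits to sex-specific marker sequences and relying on XX samples having no Y hits, can be sketched as follows. This is not SEXCMD's code: the function names, the exact-substring matching, and the min_y_fraction threshold are assumptions made for illustration, whereas the actual pipeline derives its markers from syntenic regions and is trained on a known dataset.

```python
def count_marker_hits(reads, markers):
    """Count reads that contain at least one marker as an exact substring."""
    return sum(1 for read in reads if any(marker in read for marker in markers))


def call_sex(x_hits, y_hits, min_y_fraction=0.05):
    """Call XY when Y-marker hits form a non-trivial share of all hits.

    `min_y_fraction` is a hypothetical tolerance for stray Y hits in XX samples.
    """
    total = x_hits + y_hits
    if total == 0:
        return "undetermined"
    return "XY" if y_hits / total >= min_y_fraction else "XX"


# Toy markers and reads, invented for this sketch.
x_markers = ["ACGTTGCA"]
y_markers = ["TTGACCGG"]
reads = ["GGACGTTGCATT", "CCTTGACCGGAA", "AAAAACCCCC", "TTACGTTGCAGG"]

x_hits = count_marker_hits(reads, x_markers)
y_hits = count_marker_hits(reads, y_markers)
sex = call_sex(x_hits, y_hits)
```

    For ZW/ZZ species the same logic applies with W markers in place of Y markers and the labels swapped accordingly.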

  13. Accounting for orphaned aftershocks in the earthquake background rate

    USGS Publications Warehouse

    Van Der Elst, Nicholas

    2017-01-01

    Aftershocks often occur within cascades of triggered seismicity in which each generation of aftershocks triggers an additional generation, and so on. The rate of earthquakes in any particular generation follows Omori's law, going approximately as 1/t. This function decays rapidly, but is heavy-tailed, and aftershock sequences may persist for long times at a rate that is difficult to discriminate from background. It is likely that some apparently spontaneous earthquakes in the observational catalogue are orphaned aftershocks of long-past main shocks. To assess the relative proportion of orphaned aftershocks in the apparent background rate, I develop an extension of the ETAS model that explicitly includes the expected contribution of orphaned aftershocks to the apparent background rate. Applying this model to California, I find that the apparent background rate can be almost entirely attributed to orphaned aftershocks, depending on the assumed duration of an aftershock sequence. This implies an earthquake cascade with a branching ratio (the average number of directly triggered aftershocks per main shock) of nearly unity. In physical terms, this implies that very few earthquakes are completely isolated from the perturbing effects of other earthquakes within the fault system. Accounting for orphaned aftershocks in the ETAS model gives more accurate estimates of the true background rate, and more realistic expectations for long-term seismicity patterns.
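
    Two quantities in this abstract lend themselves to a short numerical sketch: the slow decay of a 1/t rate and the meaning of a branching ratio near unity. The Python below is an illustration only. The constant `k`, the time units, and the function names are hypothetical, and it uses the plain 1/t form with a simple branching-process argument rather than the author's extended ETAS model.

```python
import math

def omori_count(t_start, t_end, k=1.0):
    """Expected number of aftershocks between t_start and t_end
    for a rate k / t (the integral of k / t is k * ln(t_end / t_start))."""
    return k * math.log(t_end / t_start)

def triggered_fraction(branching_ratio):
    """Long-run fraction of events that are triggered rather than spontaneous.

    Each spontaneous event has n + n^2 + ... = n / (1 - n) descendants
    on average, where n is the branching ratio (must satisfy 0 <= n < 1).
    """
    n = branching_ratio
    if not 0 <= n < 1:
        raise ValueError("branching ratio must be in [0, 1)")
    descendants = n / (1 - n)
    return descendants / (1 + descendants)

# Every decade of elapsed time contributes the same expected count.
print(omori_count(1, 10), omori_count(10, 100), omori_count(100, 1000))
# A branching ratio close to one leaves little room for true background.
print(triggered_fraction(0.95))
```

    The equal contribution of each decade of time is what makes the tail of a sequence hard to separate from background, and a branching ratio of 0.95 in this toy calculation corresponds to about 95% of events being triggered.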

  14. Accounting for orphaned aftershocks in the earthquake background rate

    NASA Astrophysics Data System (ADS)

    van der Elst, Nicholas J.

    2017-11-01

    Aftershocks often occur within cascades of triggered seismicity in which each generation of aftershocks triggers an additional generation, and so on. The rate of earthquakes in any particular generation follows Omori's law, going approximately as 1/t. This function decays rapidly, but is heavy-tailed, and aftershock sequences may persist for long times at a rate that is difficult to discriminate from background. It is likely that some apparently spontaneous earthquakes in the observational catalogue are orphaned aftershocks of long-past main shocks. To assess the relative proportion of orphaned aftershocks in the apparent background rate, I develop an extension of the ETAS model that explicitly includes the expected contribution of orphaned aftershocks to the apparent background rate. Applying this model to California, I find that the apparent background rate can be almost entirely attributed to orphaned aftershocks, depending on the assumed duration of an aftershock sequence. This implies an earthquake cascade with a branching ratio (the average number of directly triggered aftershocks per main shock) of nearly unity. In physical terms, this implies that very few earthquakes are completely isolated from the perturbing effects of other earthquakes within the fault system. Accounting for orphaned aftershocks in the ETAS model gives more accurate estimates of the true background rate, and more realistic expectations for long-term seismicity patterns.

  15. Hippocampal Replay is Not a Simple Function of Experience

    PubMed Central

    Gupta, Anoopum S.; van der Meer, Matthijs A. A.; Touretzky, David S.; Redish, A. David

    2015-01-01

    Summary Replay of behavioral sequences in the hippocampus during sharp-wave-ripple-complexes (SWRs) provides a potential mechanism for memory consolidation and the learning of knowledge structures. Current hypotheses imply that replay should straightforwardly reflect recent experience. However, we find these hypotheses to be incompatible with the content of replay on a task with two distinct behavioral sequences (A&B). We observed forward and backward replay of B even when rats had been performing A for >10 minutes. Furthermore, replay of non-local sequence B occurred more often when B was infrequently experienced. Neither forward nor backward sequences preferentially represented highly-experienced trajectories within a session. Additionally, we observed the construction of never-experienced novel-path sequences. These observations challenge the idea that sequence activation during SWRs is a simple replay of recent experience. Instead, replay reflected all physically available trajectories within the environment, suggesting a potential role in active learning and maintenance of the cognitive map. PMID:20223204

  16. Generation of non-genomic oligonucleotide tag sequences for RNA template-specific PCR

    PubMed Central

    Pinto, Fernando Lopes; Svensson, Håkan; Lindblad, Peter

    2006-01-01

    Background In order to overcome genomic DNA contamination in transcriptional studies, reverse template-specific polymerase chain reaction, a modification of reverse transcriptase polymerase chain reaction, is used. The possibility of using tags whose sequences are not found in the genome further improves reverse specific polymerase chain reaction experiments. Given the absence of software available to produce genome suitable tags, a simple tool to fulfill such need was developed. Results The program was developed in Perl, with separate use of the basic local alignment search tool, making the tool platform independent (known to run on Windows XP and Linux). In order to test the performance of the generated tags, several molecular experiments were performed. The results show that Tagenerator is capable of generating tags with good priming properties, which will deliberately not result in PCR amplification of genomic DNA. Conclusion The program Tagenerator is capable of generating tag sequences that combine genome absence with good priming properties for RT-PCR based experiments, circumventing the effects of genomic DNA contamination in an RNA sample. PMID:16820068

  17. A Glance at Microsatellite Motifs from 454 Sequencing Reads of Watermelon Genomic DNA

    USDA-ARS?s Scientific Manuscript database

    A single 454 (Life Sciences Sequencing Technology) run of Charleston Gray watermelon (Citrullus lanatus var. lanatus) genomic DNA was performed and sequence data were assembled. A large scale identification of simple sequence repeat (SSR) was performed and SSR sequence data were used for the develo...

  18. A rapid and simple method of detection of Blepharisma japonicum using PCR and immobilisation on FTA paper

    PubMed Central

    Hide, Geoff; Hughes, Jacqueline M; McNuff, Robert

    2003-01-01

    Background The rapid expansion in the availability of genome and DNA sequence information has opened up new possibilities for the development of methods for detecting free-living protozoa in environmental samples. The protozoan Blepharisma japonicum was used to investigate a rapid and simple detection system based on polymerase chain reaction amplification (PCR) from organisms immobilised on FTA paper. Results Using primers designed from the α-tubulin genes of Blepharisma, specific and sensitive detection to the equivalent of a single Blepharisma cell could be achieved. Similar detection levels were found using water samples, containing Blepharisma, which were dried onto Whatman FTA paper. Conclusion This system has potential as a sensitive convenient detection system for Blepharisma and could be applied to other protozoan organisms. PMID:14516472

  19. A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.

    PubMed

    Razvi, F; Gargiulo, G; Worcel, A

    1983-08-01

    Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestions of the circular DNA with the original enzyme and a second restriction enzyme cleavage near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. Similar double digestions, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.

  20. "Sequence space soup" of proteins and copolymers

    NASA Astrophysics Data System (ADS)

    Chan, Hue Sun; Dill, Ken A.

    1991-09-01

    To study the protein folding problem, we use exhaustive computer enumeration to explore "sequence space soup," an imaginary solution containing the "native" conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
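
    The exhaustive enumeration described above can be sketched in a few lines of Python for very short chains. This is a minimal illustration, not the authors' program: the function names are hypothetical, and the energy rule (each contact between two H monomers that are lattice neighbours but not chain neighbours lowers the energy by one unit) is an assumption of the sketch.

```python
def conformations(n):
    """Yield self-avoiding walks of n monomers on the square lattice.

    The first step is fixed to the right and the first vertical step must
    go up, which removes rotated and mirrored copies of each shape.
    """
    if n < 2:
        raise ValueError("need at least two monomers")

    def extend(path, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if dy == -1 and not turned:
                continue  # skip mirror images
            nxt = (x + dx, y + dy)
            if nxt in path:
                continue  # self-avoidance
            path.append(nxt)
            yield from extend(path, turned or dy != 0)
            path.pop()

    yield from extend([(0, 0), (1, 0)], False)

def hh_contacts(seq, conf):
    """Count H-H pairs that touch on the lattice but are not bonded."""
    index_at = {pos: i for i, pos in enumerate(conf)}
    count = 0
    for i, (x, y) in enumerate(conf):
        if seq[i] != "H":
            continue
        # Looking only right and up counts each pair once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count

def native_states(seq):
    """Return (max H-H contacts, list of conformations achieving it)."""
    best, natives = -1, []
    for conf in conformations(len(seq)):
        contacts = hh_contacts(seq, conf)
        if contacts > best:
            best, natives = contacts, [conf]
        elif contacts == best:
            natives.append(conf)
    return best, natives

contacts, natives = native_states("HPPH")
print(contacts, len(natives))
```

    The length of the returned list is the number of lowest-energy conformations for that sequence, the quantity the abstract reports as being of order one for most sequences. The cost grows exponentially with chain length, so this sketch is only practical for short chains.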

  1. Application of Inter-Simple Sequence Repeat Markers in the Analysis of Populations of the Chagas Disease Vector Triatoma infestans (Hemiptera, Reduviidae)

    PubMed Central

    Pérez de Rosas, Alicia R.; Restelli, María F.; Fernández, Cintia J.; Blariza, María J.; García, Beatriz A.

    2017-01-01

    Here we apply inter-simple sequence repeat (ISSR) markers to explore the fine-scale genetic structure and dispersal in populations of Triatoma infestans. Five selected primers from 30 primers were used to amplify ISSRs by polymerase chain reaction. A total of 90 polymorphic bands were detected across 134 individuals captured from 11 peridomestic sites from the locality of San Martín (Capayán Department, Catamarca Province, Argentina). Significant levels of genetic differentiation suggest limited gene flow among sampling sites. Spatial autocorrelation analysis confirms that dispersal occurs on the scale of ∼469 m, suggesting that insecticide spraying should be extended at least within a radius of ∼500 m around the infested area. Moreover, Bayesian clustering algorithms indicated genetic exchange among different sites analyzed, supporting the hypothesis of an important role of peridomestic structures in the process of reinfestation. PMID:28115670

  2. RAD tag sequencing as a source of SNP markers in Cynara cardunculus L

    PubMed Central

    2012-01-01

    Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach generated a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349

  3. Individual sequences in large sets of gene sequences may be distinguished efficiently by combinations of shared sub-sequences

    PubMed Central

    Gibbs, Mark J; Armstrong, John S; Gibbs, Adrian J

    2005-01-01

    Background Most current DNA diagnostic tests for identifying organisms use specific oligonucleotide probes that are complementary in sequence to, and hence only hybridise with the DNA of one target species. By contrast, in traditional taxonomy, specimens are usually identified by 'dichotomous keys' that use combinations of characters shared by different members of the target set. Using one specific character for each target is the least efficient strategy for identification. Using combinations of shared bisectionally-distributed characters is much more efficient, and this strategy is most efficient when they separate the targets in a progressively binary way. Results We have developed a practical method for finding minimal sets of sub-sequences that identify individual sequences, and could be targeted by combinations of probes, so that the efficient strategy of traditional taxonomic identification could be used in DNA diagnosis. The sizes of minimal sub-sequence sets depended mostly on sequence diversity and sub-sequence length and interactions between these parameters. We found that 201 distinct cytochrome oxidase subunit-1 (CO1) genes from moths (Lepidoptera) were distinguished using only 15 sub-sequences 20 nucleotides long, whereas only 8–10 sub-sequences 6–10 nucleotides long were required to distinguish the CO1 genes of 92 species from the 9 largest orders of insects. Conclusion The presence/absence of sub-sequences in a set of gene sequences can be used like the questions in a traditional dichotomous taxonomic key; hybridisation probes complementary to such sub-sequences should provide a very efficient means for identifying individual species, subtypes or genotypes. Sequence diversity and sub-sequence length are the major factors that determine the numbers of distinguishing sub-sequences in any set of sequences. PMID:15817134
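
    The idea of identifying sequences by the presence or absence of shared sub-sequences can be sketched as follows. The abstract does not say how the minimal sets were found, so the greedy selection below is a stand-in heuristic chosen for brevity, not the authors' method, and it is not guaranteed to return a minimal set. All names and the toy sequences are hypothetical.

```python
from itertools import combinations

def kmers(seq, k):
    """Return the set of all sub-sequences of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def distinguishing_kmers(seqs, k):
    """Greedily pick k-mers whose presence/absence separates all sequences.

    `seqs` maps a name to a sequence. At each step the k-mer that splits
    the largest number of still-indistinguishable pairs is chosen.
    """
    present = {name: kmers(s, k) for name, s in seqs.items()}
    candidates = sorted(set().union(*present.values()))
    unresolved = set(combinations(sorted(seqs), 2))
    chosen = []
    while unresolved:
        def pairs_split(kmer):
            return sum((kmer in present[a]) != (kmer in present[b])
                       for a, b in unresolved)
        if not candidates or pairs_split(max(candidates, key=pairs_split)) == 0:
            raise ValueError("sequences cannot be separated with this k")
        best = max(candidates, key=pairs_split)
        chosen.append(best)
        unresolved = {(a, b) for a, b in unresolved
                      if (best in present[a]) == (best in present[b])}
    return chosen

# Toy input: four made-up sequences.
seqs = {"a": "AAAA", "b": "AACC", "c": "CCCC", "d": "CCGG"}
probes = distinguishing_kmers(seqs, 2)
for name, s in seqs.items():
    print(name, [p in s for p in probes])
```

    Each printed row is the answer pattern a sequence gives to the chosen probes, which plays the role of the path through a dichotomous key.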

  4. Mining and validation of pyrosequenced simple sequence repeats (SSRs) from American cranberry (Vaccinium macrocarpon Ait.).

    PubMed

    Zhu, H; Senalik, D; McCown, B H; Zeldin, E L; Speers, J; Hyman, J; Bassil, N; Hummer, K; Simon, P W; Zalapa, J E

    2012-01-01

    The American cranberry (Vaccinium macrocarpon Ait.) is a major commercial fruit crop in North America, but limited genetic resources have been developed for the species. Furthermore, the paucity of codominant DNA markers has hampered the advance of genetic research in cranberry and the Ericaceae family in general. Therefore, we used Roche 454 sequencing technology to perform low-coverage whole genome shotgun sequencing of the cranberry cultivar 'HyRed'. After de novo assembly, the obtained sequence covered 266.3 Mb of the estimated 540-590 Mb in the cranberry genome. A total of 107,244 SSR loci were detected with an overall density across the genome of 403 SSR/Mb. The AG repeat was the most frequent motif in cranberry accounting for 35% of all SSRs and together with AAG and AAAT accounted for 46% of all loci discovered. To validate the SSR loci, we designed 96 primer-pairs using contig sequence data containing perfect SSRs, and studied the genetic diversity of 25 cranberry genotypes. We identified 48 polymorphic SSR loci with 2-15 alleles per locus for a total of 323 alleles in the 25 cranberry genotypes. Genetic clustering by principal coordinates and genetic structure analyses confirmed the heterogeneous nature of cranberries. The parentage composition of several hybrid cultivars was evident from the structure analyses. Whole genome shotgun 454 sequencing was a cost-effective and efficient way to identify numerous SSRs in the cranberry sequence for marker development.
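
    Detecting perfect SSRs and computing a density, as reported above, can be sketched with a regular expression. The abstract does not name the software or thresholds used, so the motif lengths (2 to 4 nt), the minimum of five repeats, and the function names below are assumptions for illustration.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=4, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    ssrs = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a single-base run, not a di- to tetranucleotide SSR
        ssrs.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return ssrs

def ssr_density_per_mb(n_ssrs, assembled_mb):
    """SSR loci per megabase of assembled sequence."""
    return n_ssrs / assembled_mb

print(find_ssrs("TTAGAGAGAGAGCC"))
print(round(ssr_density_per_mb(107_244, 266.3)))
```

    With the figures given in the abstract, 107,244 loci over 266.3 Mb works out to about 403 SSR/Mb, which agrees with the reported density.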

  5. Fusion primer and nested integrated PCR (FPNI-PCR): a new high-efficiency strategy for rapid chromosome walking or flanking sequence cloning

    PubMed Central

    2011-01-01

    Background The advent of genomics-based technologies has revolutionized many fields of biological enquiry. However, chromosome walking or flanking sequence cloning is still a necessary and important procedure for determining gene structure. Such methods are used to identify T-DNA insertion sites and so are especially relevant for organisms where large T-DNA insertion libraries have been created, such as rice and Arabidopsis. The currently available methods for flanking sequence cloning, including the popular TAIL-PCR technique, are relatively laborious and slow. Results Here, we report a simple and effective fusion primer and nested integrated PCR method (FPNI-PCR) for the identification and cloning of unknown genomic regions flanking known sequences. In brief, a set of universal primers was designed that consisted of various 15-16 base arbitrary degenerate oligonucleotides. These arbitrary degenerate primers were fused to the 3' end of an adaptor oligonucleotide which provided a known sequence without degenerate nucleotides, thereby forming the fusion primers (FPs). These fusion primers are employed in the first step of an integrated nested PCR strategy which defines the overall FPNI-PCR protocol. In order to demonstrate the efficacy of this novel strategy, we have successfully used it to isolate multiple genomic sequences, namely 21 orthologs of genes in various species of Rosaceae, 4 MYB genes of Rosa rugosa, 3 promoters of transcription factors of Petunia hybrida, 4 flanking sequences of T-DNA insertion sites in transgenic tobacco lines, and 6 specific genes from the sequenced genomes of rice and Arabidopsis. Conclusions The successful amplification of target products through FPNI-PCR verified that this novel strategy is an effective, low-cost and simple procedure. Furthermore, FPNI-PCR represents a more sensitive, rapid and accurate technique than the established TAIL-PCR and hiTAIL-PCR procedures. PMID:22093809

  6. Microsatellite DNA in genomic survey sequences and UniGenes of loblolly pine

    Treesearch

    Craig S Echt; Surya Saha; Dennis L Deemer; C Dana Nelson

    2011-01-01

    Genomic DNA sequence databases are a potential and growing resource for simple sequence repeat (SSR) marker development in loblolly pine (Pinus taeda L.). Loblolly pine also has many expressed sequence tags (ESTs) available for microsatellite (SSR) marker development. We compared loblolly pine SSR densities in genome survey sequences (GSSs) to those in non-redundant...

  7. SOBA: sequence ontology bioinformatics analysis.

    PubMed

    Moore, Barry; Fan, Guozhen; Eilbeck, Karen

    2010-07-01

    The advent of cheaper, faster sequencing technologies has pushed the task of sequence annotation from the exclusive domain of large-scale multi-national sequencing projects to that of research laboratories and small consortia. The bioinformatics burden placed on these laboratories, some with very little programming experience, can be daunting. Fortunately, there exist software libraries and pipelines designed with these groups in mind, to ease the transition from an assembled genome to an annotated and accessible genome resource. We have developed the Sequence Ontology Bioinformatics Analysis (SOBA) tool to provide a simple statistical and graphical summary of an annotated genome. We envisage its use during annotation jamborees, genome comparison and for use by developers for rapid feedback during annotation software development and testing. SOBA also provides annotation consistency feedback to ensure correct use of terminology within annotations, and guides users to add new terms to the Sequence Ontology when required. SOBA is available at http://www.sequenceontology.org/cgi-bin/soba.cgi.

  8. Genetic Diversity of Arabica Coffee (Coffea arabica L.) in Nicaragua as Estimated by Simple Sequence Repeat Markers

    PubMed Central

    Geleta, Mulatu; Herrera, Isabel; Monzón, Arnulfo; Bryngelsson, Tomas

    2012-01-01

    Coffea arabica L. (arabica coffee), the only tetraploid species in the genus Coffea, represents the majority of the world's coffee production and makes a significant contribution to Nicaragua's economy. The present study was conducted to determine the genetic diversity of arabica coffee in Nicaragua for its conservation and breeding values. Twenty-six populations that represent eight varieties in Nicaragua were investigated using simple sequence repeat (SSR) markers. A total of 24 alleles were obtained from the 12 loci investigated across 260 individual plants. The total Nei's gene diversity (H_T) and the within-population gene diversity (H_S) were 0.35 and 0.29, respectively, which is comparable with that previously reported from other countries and regions. Among the varieties, the highest diversity was recorded in the variety Catimor. Analysis of molecular variance (AMOVA) revealed that about 87% of the total genetic variation was found within populations and the remaining 13% differentiate the populations (F_ST = 0.13; P < 0.001). The variation among the varieties was also significant. The genetic variation in Nicaraguan coffee is significant enough to be used in the breeding programs, and most of this variation can be conserved through ex situ conservation of a low number of populations from each variety. PMID:22701376

  9. Sequence Bundles: a novel method for visualising, discovering and exploring sequence motifs

    PubMed Central

    2014-01-01

    Background We introduce Sequence Bundles--a novel data visualisation method for representing multiple sequence alignments (MSAs). We identify and address key limitations of the existing bioinformatics data visualisation methods (i.e. the Sequence Logo) by enabling Sequence Bundles to give salient visual expression to sequence motifs and other data features, which would otherwise remain hidden. Methods For the development of Sequence Bundles we employed research-led information design methodologies. Sequences are encoded as uninterrupted, semi-opaque lines plotted on a 2-dimensional reconfigurable grid. Each line represents a single sequence. The thickness and opacity of the stack at each residue in each position indicates the level of conservation and the lines' curved paths expose patterns in correlation and functionality. Several MSAs can be visualised in a composite image. The Sequence Bundles method is designed to favour a tangible, continuous and intuitive display of information. Results We have developed a software demonstration application for generating a Sequence Bundles visualisation of MSAs provided for the BioVis 2013 redesign contest. A subsequent exploration of the visualised line patterns allowed for the discovery of a number of interesting features in the dataset. Reported features include the extreme conservation of sequences displaying a specific residue and bifurcations of the consensus sequence. Conclusions Sequence Bundles is a novel method for visualisation of MSAs and the discovery of sequence motifs. It can aid in generating new insight and hypothesis making. Sequence Bundles is well disposed for future implementation as an interactive visual analytics software, which can complement existing visualisation tools. PMID:25237395

  10. Genetic Variation and Population Differentiation in a Medical Herb Houttuynia cordata in China Revealed by Inter-Simple Sequence Repeats (ISSRs)

    PubMed Central

    Wei, Lin; Wu, Xian-Jin

    2012-01-01

    Houttuynia cordata is an important traditional Chinese herb with unresolved genetics and taxonomy, which lead to potential problems in the conservation and utilization of the resource. Inter-simple sequence repeat (ISSR) markers were used to assess the level and distribution of genetic diversity in 226 individuals from 15 populations of H. cordata in China. ISSR analysis revealed low genetic variations within populations but high genetic differentiations among populations. This genetic structure probably mainly reflects the historical association among populations. Genetic cluster analysis showed that the basal clade is composed of populations from Southwest China, and the other populations have continuous and eastward distributions. The structure of genetic diversity in H. cordata demonstrated that this species might have survived in Southwest China during the glacial age, and subsequently experienced an eastern postglacial expansion. Based on the results of genetic analysis, it was proposed that as many targeted populations as possible be included in conservation. PMID:22942696

  11. Genetic variation and population differentiation in a medical herb Houttuynia cordata in China revealed by inter-simple sequence repeats (ISSRs).

    PubMed

    Wei, Lin; Wu, Xian-Jin

    2012-01-01

    Houttuynia cordata is an important traditional Chinese herb with unresolved genetics and taxonomy, which lead to potential problems in the conservation and utilization of the resource. Inter-simple sequence repeat (ISSR) markers were used to assess the level and distribution of genetic diversity in 226 individuals from 15 populations of H. cordata in China. ISSR analysis revealed low genetic variations within populations but high genetic differentiations among populations. This genetic structure probably mainly reflects the historical association among populations. Genetic cluster analysis showed that the basal clade is composed of populations from Southwest China, and the other populations have continuous and eastward distributions. The structure of genetic diversity in H. cordata demonstrated that this species might have survived in Southwest China during the glacial age, and subsequently experienced an eastern postglacial expansion. Based on the results of genetic analysis, it was proposed that as many targeted populations as possible be included in conservation.

  12. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    USDA-ARS?s Scientific Manuscript database

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...

  13. Learning predictive statistics from temporal sequences: Dynamics and strategies

    PubMed Central

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe

    2017-01-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
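
    The two kinds of sequence statistics and the two decision strategies mentioned above can be made concrete with a small Python sketch. The symbol set, the probabilities, and the function names are hypothetical and are not taken from the study; the sketch only shows how frequency-based and context-based Markov sequences differ and why maximizing yields higher expected accuracy than probability matching.

```python
import random

def generate_sequence(transitions, length, rng, start):
    """Sample a symbol sequence from a first-order Markov chain.

    `transitions[s]` maps each possible next symbol to its probability
    given that the current symbol is s.
    """
    seq = [start]
    while len(seq) < length:
        options = transitions[seq[-1]]
        symbols, weights = list(options), list(options.values())
        seq.append(rng.choices(symbols, weights=weights)[0])
    return seq

def expected_accuracy(options, strategy):
    """Expected hit rate of a predictor for one context.

    'maximize' always picks the most probable symbol; 'match' picks each
    symbol with the probability at which it occurs.
    """
    probs = list(options.values())
    if strategy == "maximize":
        return max(probs)
    if strategy == "match":
        return sum(p * p for p in probs)
    raise ValueError(strategy)

# Frequency statistics: the next symbol ignores the current one.
frequency_model = {s: {"A": 0.7, "B": 0.2, "C": 0.1} for s in "ABC"}
# Context-based statistics: the next symbol depends on the current one.
context_model = {
    "A": {"A": 0.1, "B": 0.8, "C": 0.1},
    "B": {"A": 0.1, "B": 0.1, "C": 0.8},
    "C": {"A": 0.8, "B": 0.1, "C": 0.1},
}

rng = random.Random(0)
print("".join(generate_sequence(frequency_model, 30, rng, "A")))
print("".join(generate_sequence(context_model, 30, rng, "A")))
print(expected_accuracy(context_model["A"], "maximize"),
      expected_accuracy(context_model["A"], "match"))
```

    For the context "A" in this toy model, maximizing is correct 80% of the time while matching is correct about 66% of the time.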

  14. Learning predictive statistics from temporal sequences: Dynamics and strategies.

    PubMed

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

    2017-10-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics; that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.

  15. Deciphering mRNA Sequence Determinants of Protein Production Rate

    NASA Astrophysics Data System (ADS)

    Szavits-Nossan, Juraj; Ciandrini, Luca; Romano, M. Carmen

    2018-03-01

    One of the greatest challenges in biophysical models of translation is to identify coding sequence features that affect the rate of translation and therefore the overall protein production in the cell. We propose an analytic method to solve a translation model based on the inhomogeneous totally asymmetric simple exclusion process, which allows us to unveil simple design principles of nucleotide sequences determining protein production rates. Our solution shows an excellent agreement when compared to numerical genome-wide simulations of S. cerevisiae transcript sequences and predicts that the first 10 codons, which is the ribosome footprint length on the mRNA, together with the value of the initiation rate, are the main determinants of protein production rate under physiological conditions. Finally, we interpret the obtained analytic results based on the evolutionary role of the codons' choice for regulating translation rates and ribosome densities.
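
    The abstract compares its analytic solution with numerical simulations. A minimal sketch of what such a simulation might look like is given below, using a Gillespie-style update for an exclusion process with extended particles. This is not the authors' code or their analytic method: the function name, the rate values, and the stopping rule are assumptions, and only the 10-codon footprint comes from the abstract.

```python
import random

def simulate_tasep(hop_rates, alpha, footprint=10, n_events=20_000, seed=0):
    """Simulate ribosomes moving along one mRNA and return the production rate.

    hop_rates[i] is the rate of leaving codon i (the last entry acts as the
    termination rate); alpha is the initiation rate. Two ribosomes must stay
    at least `footprint` codons apart.
    """
    rng = random.Random(seed)
    last_codon = len(hop_rates) - 1
    ribosomes = []  # codon positions, leading ribosome first
    elapsed, finished = 0.0, 0
    for _ in range(n_events):
        # Collect every move the exclusion rule currently allows.
        moves = []
        if not ribosomes or ribosomes[-1] >= footprint:
            moves.append((alpha, None))  # None marks initiation
        for idx, pos in enumerate(ribosomes):
            ahead = ribosomes[idx - 1] if idx > 0 else None
            if ahead is None or ahead - pos > footprint:
                moves.append((hop_rates[pos], idx))
        rates = [rate for rate, _ in moves]
        elapsed += rng.expovariate(sum(rates))
        _, move = rng.choices(moves, weights=rates)[0]
        if move is None:
            ribosomes.append(0)
        elif ribosomes[move] == last_codon:
            ribosomes.pop(move)  # termination: one protein completed
            finished += 1
        else:
            ribosomes[move] += 1
    return finished / elapsed

# Hypothetical transcript: 50 codons with uniform elongation rate.
uniform = [10.0] * 50
print(simulate_tasep(uniform, alpha=0.1))
print(simulate_tasep(uniform, alpha=1.0))
```

    Slowing the first few entries of `hop_rates` in this sketch delays clearance of the initiation site, which is one way to explore the abstract's claim that the first 10 codons and the initiation rate dominate the production rate.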

  16. High-Throughput Sequencing and De Novo Assembly of the Isatis indigotica Transcriptome

    PubMed Central

    Tang, Xiaoqing; Xiao, Yunhua; Lv, Tingting; Wang, Fangquan; Zhu, QianHao; Zheng, Tianqing; Yang, Jie

    2014-01-01

    Background Isatis indigotica, the source of the traditional Chinese medicine Radix isatidis (Ban-Lan-Gen), is an extremely important economic crop in China. To facilitate biological, biochemical and molecular research on the medicinal chemicals in I. indigotica, here we report the first I. indigotica transcriptome generated by RNA sequencing (RNA-seq). Results An RNA-seq library was created using RNA extracted from a mixed sample including leaf and root. A total of 33,238 unigenes were assembled from more than 28 million high-quality short reads. The quality of the assembly was experimentally examined by cDNA sequencing of seven randomly selected unigenes. Based on a BLAST search, 28,184 unigenes had a hit in at least one of the protein and nucleotide databases used in this study, and 8 unigenes were found to be associated with biosynthesis of indole and its derivatives. According to Gene Ontology classification, 22,365 unigenes were categorized into 48 functional groups. Furthermore, Clusters of Orthologous Group and Swiss-Prot annotation were assigned for 7,707 and 18,679 unigenes, respectively. Analysis of repeat motifs identified 6,400 simple sequence repeat markers in 4,509 unigenes. Conclusion Our data provide a comprehensive sequence resource for molecular study of I. indigotica. Our results will facilitate studies on the functions of genes involved in the indole alkaloid biosynthesis pathway and on metabolism of nitrogen and indole alkaloids in I. indigotica and its related species. PMID:25259890

  17. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    PubMed Central

    2012-01-01

    Background Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results We constructed a normalized cDNA library and a 3’-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant. PMID:22747974

  18. In silico mining and characterization of simple sequence repeats from gilthead sea bream (Sparus aurata) expressed sequence tags (EST-SSRs); PCR amplification, polymorphism evaluation and multiplexing and cross-species assays.

    PubMed

    Vogiatzi, Emmanouella; Lagnel, Jacques; Pakaki, Victoria; Louro, Bruno; Canario, Adelino V M; Reinhardt, Richard; Kotoulas, Georgios; Magoulas, Antonios; Tsigenopoulos, Costas S

    2011-06-01

    We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs. Copyright © 2011 Elsevier B.V. All rights reserved.
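
    In silico SSR mining of this kind can be approximated with a regular-expression scan for perfect tandem repeats. The sketch below is illustrative only: the minimum repeat counts in `MIN_REPEATS` are assumptions (the abstract does not state the thresholds used), `find_ssrs` is a hypothetical helper, and dedicated SSR-mining tools handle imperfect and compound repeats that this version ignores.

```python
import re

# Assumed minimum repeat counts for di- to hexanucleotide motifs.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # Capture one motif of the given size, then require further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ATAT, AAA).
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)

# Toy sequence with one dinucleotide and one trinucleotide repeat.
est = "GGTT" + "AC" * 7 + "TTAA" + "CAG" * 5 + "TT"
ssrs = find_ssrs(est)
for start, motif, count in ssrs:
    print(f"({motif}){count} at position {start}")
```

    Dividing total sequence length by the number of hits gives a density figure comparable in form to the "one SSR per 2.95 kb" statistic reported above.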

  19. Next-generation sequencing library preparation method for identification of RNA viruses on the Ion Torrent Sequencing Platform.

    PubMed

    Chen, Guiqian; Qiu, Yuan; Zhuang, Qingye; Wang, Suchun; Wang, Tong; Chen, Jiming; Wang, Kaicheng

    2018-05-09

    Next generation sequencing (NGS) is a powerful tool for the characterization, discovery, and molecular identification of RNA viruses. Multiple NGS library preparation methods have been published for strand-specific RNA-seq, but some are not suitable for identifying and characterizing RNA viruses. In this study, we report an NGS library preparation method to identify RNA viruses using the Ion Torrent PGM platform. The NGS sequencing adapters were directly inserted into the sequencing library through reverse transcription and polymerase chain reaction, without fragmentation and ligation of nucleic acids. The results show that this method is simple to perform and able to identify multiple species of RNA viruses in clinical samples.

  20. Comprehensive analysis of Arabidopsis expression level polymorphisms with simple inheritance

    PubMed Central

    Plantegenet, Stephanie; Weber, Johann; Goldstein, Darlene R; Zeller, Georg; Nussbaumer, Cindy; Thomas, Jérôme; Weigel, Detlef; Harshman, Keith; Hardtke, Christian S

    2009-01-01

    In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5′ regulatory sequence variation in the corresponding genes is indeed increased. However, ∼42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL. PMID:19225455

  1. RSAT: regulatory sequence analysis tools.

    PubMed

    Thomas-Chollier, Morgane; Sand, Olivier; Turatsinze, Jean-Valéry; Janky, Rekin's; Defrance, Matthieu; Vervisch, Eric; Brohée, Sylvain; van Helden, Jacques

    2008-07-01

    The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.

  2. A simple method for MR elastography: a gradient-echo type multi-echo sequence.

    PubMed

    Numano, Tomokazu; Mizuhara, Kazuyuki; Hata, Junichi; Washio, Toshikatsu; Homma, Kazuhiro

    2015-01-01

    To demonstrate the feasibility of a novel MR elastography (MRE) technique based on a conventional gradient-echo type multi-echo MR sequence which does not need additional bipolar magnetic field gradients (motion encoding gradient: MEG), yet is sensitive to vibration. In a gradient-echo type multi-echo MR sequence, several images are produced from each echo of the train with different echo times (TEs). If these echoes are synchronized with the vibration, each readout's gradient lobes achieve a MEG-like effect, and the later generated echo causes a greater MEG-like effect. The sequence was tested for the tissue-mimicking agarose gel phantoms and the psoas major muscles of healthy volunteers. It was confirmed that the readout gradient lobes caused an MEG-like effect and the later TE images had higher sensitivity to vibrations. The magnitude image of later generated echo suffered the T2 decay and the susceptibility artifacts, but the wave image and elastogram of later generated echo were unaffected by these effects. In in vivo experiments, this method was able to measure the mean shear modulus of the psoas major muscle. From the results of phantom experiments and volunteer studies, it was shown that this method has clinical application potential. Copyright © 2014 Elsevier Inc. All rights reserved.

  3. Simple Experiments on Magnetism and Electricity...from Edison.

    ERIC Educational Resources Information Center

    Schultz, Robert F.

    Background information, lists of materials needed and procedures used are provided for 16 simple experiments on electricity and magnetism. These experiments are organized into sections dealing with: (1) Edison's carbon experiments (building a galvanometer, investigating the variable conductivity of carbon, and examining the carbon transmitter…

  4. Identification, characterization, and utilization of genome-wide simple sequence repeats to identify a QTL for acidity in apple

    PubMed Central

    2012-01-01

    Background Apple is an economically important fruit crop worldwide. Developing a genetic linkage map is a critical step towards mapping and cloning of genes responsible for important horticultural traits in apple. To facilitate linkage map construction, we surveyed and characterized the distribution and frequency of perfect microsatellites in assembled contig sequences of the apple genome. Results A total of 28,538 SSRs have been identified in the apple genome, with an overall density of 40.8 SSRs per Mb. Di-nucleotide repeats are the most frequent microsatellites in the apple genome, accounting for 71.9% of all microsatellites. AT/TA repeats are the most frequent in genomic regions, accounting for 38.3% of all the G-SSRs, while AG/GA dimers prevail in transcribed sequences, and account for 59.4% of all EST-SSRs. A total set of 310 SSRs is selected to amplify eight apple genotypes. Of these, 245 (79.0%) are found to be polymorphic among cultivars and wild species tested. AG/GA motifs in genomic regions have detected more alleles and higher PIC values than AT/TA or AC/CA motifs. Moreover, AG/GA repeats are more variable than any other dimers in apple, and should be preferentially selected for studies, such as genetic diversity and linkage map construction. A total of 54 newly developed apple SSRs have been genetically mapped. Interestingly, clustering of markers with distorted segregation is observed on linkage groups 1, 2, 10, 15, and 16. A QTL responsible for malic acid content of apple fruits is detected on linkage group 8, and accounts for ~13.5% of the observed phenotypic variation. Conclusions This study demonstrates that di-nucleotide repeats are prevalent in the apple genome and that AT/TA and AG/GA repeats are the most frequent in genomic and transcribed sequences of apple, respectively. All SSR motifs identified in this study as well as those newly mapped SSRs will serve as valuable resources for pursuing apple genetic studies, aiding the apple breeding

  5. Analysis of genetic diversity and population structure of oil palm (Elaeis guineensis) from China and Malaysia based on species-specific simple sequence repeat markers.

    PubMed

    Zhou, L X; Xiao, Y; Xia, W; Yang, Y D

    2015-12-08

    Genetic diversity and patterns of population structure of the 94 oil palm lines were investigated using species-specific simple sequence repeat (SSR) markers. We designed primers for 63 SSR loci based on their flanking sequences and conducted amplification in 94 oil palm DNA samples. The amplification result showed that a relatively high level of genetic diversity was observed between oil palm individuals according to a set of 21 polymorphic microsatellite loci. The observed heterozygosity (Ho) was 0.3683 and 0.4035, with an average of 0.3859. The Ho value was a reliable determinant of the discriminatory power of the SSR primer combinations. The principal component analysis and unweighted pair-group method with arithmetic averaging cluster analysis showed the 94 oil palm lines were grouped into one cluster. These results demonstrated that the oil palm in Hainan Province of China and the germplasm introduced from Malaysia may be from the same source. The SSR protocol was effective and reliable for assessing the genetic diversity of oil palm. Knowledge of the genetic diversity and population structure will be crucial for establishing appropriate management stocks for this species.
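
    For readers unfamiliar with the statistic, observed heterozygosity at a locus is the fraction of individuals carrying two different alleles, and expected heterozygosity is one minus the sum of squared allele frequencies. The sketch below uses hypothetical genotypes and function names; it is not the study's data or software, and it omits the small-sample correction that some packages apply.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals with two different alleles at one locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies p_i (no sample-size correction)."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

# Hypothetical diploid genotypes (allele sizes in bp) at one SSR locus.
locus = [(152, 152), (152, 156), (156, 156), (152, 160), (152, 152)]
ho = observed_heterozygosity(locus)
he = expected_heterozygosity(locus)
print(f"Ho = {ho:.2f}, He = {he:.2f}")
```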

  6. ProfileGrids: a sequence alignment visualization paradigm that avoids the limitations of Sequence Logos

    PubMed Central

    2014-01-01

    Background The 2013 BioVis Contest provided an opportunity to evaluate different paradigms for visualizing protein multiple sequence alignments. Such data sets are becoming extremely large and thus taxing current visualization paradigms. Sequence Logos represent consensus sequences but have limitations for protein alignments. As an alternative, ProfileGrids are a new protein sequence alignment visualization paradigm that represents an alignment as a color-coded matrix of the residue frequency occurring at every homologous position in the aligned protein family. Results The JProfileGrid software program was used to analyze the BioVis contest data sets to generate figures for comparison with the Sequence Logo reference images. Conclusions The ProfileGrid representation allows for the clear and effective analysis of protein multiple sequence alignments. This includes both a general overview of the conservation and diversity sequence patterns as well as the interactive ability to query the details of the protein residue distributions in the alignment. The JProfileGrid software is free and available from http://www.ProfileGrid.org. PMID:25237393
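
    The matrix underlying a ProfileGrid, the residue frequency at every aligned position, can be computed in a few lines. This is a rough sketch of that idea only, with a hypothetical function name and toy alignment; it does not reproduce JProfileGrid's file handling, colour coding, or interactive features.

```python
from collections import Counter

def residue_frequencies(alignment):
    """Return one {residue: frequency} dict per alignment column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    grid = []
    for column in zip(*alignment):
        counts = Counter(column)
        grid.append({res: n / len(column) for res, n in counts.items()})
    return grid

# Toy protein alignment; '-' marks a gap.
alignment = ["MKV-L", "MKI-L", "MRVAL", "MKVAL"]
grid = residue_frequencies(alignment)
for position, freqs in enumerate(grid, start=1):
    print(position, freqs)
```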

  7. Cosmic microwave background radiation anisotropies in brane worlds.

    PubMed

    Koyama, Kazuya

    2003-11-28

    We propose a new formulation to calculate the cosmic microwave background (CMB) spectrum in the Randall-Sundrum two-brane model based on recent progress in solving the bulk geometry using a low energy approximation. The evolution of the anisotropic stress imprinted on the brane by the 5D Weyl tensor is calculated. An impact of the dark radiation perturbation on the CMB spectrum is investigated in a simple model assuming an initially scale-invariant adiabatic perturbation. The dark radiation perturbation induces isocurvature perturbations, but the resultant spectrum can be quite different from the prediction of simple mixtures of adiabatic and isocurvature perturbations due to Weyl anisotropic stress.

  8. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    PubMed Central

    2002-01-01

    Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit. PMID:12401134

  9. Dim target trajectory-associated detection in bright earth limb background

    NASA Astrophysics Data System (ADS)

    Chen, Penghui; Xu, Xiaojian; He, Xiaoyu; Jiang, Yuesong

    2015-09-01

    The intensive emission of earth limb in the field of view of sensors contributes much to the observation images. Due to the low signal-to-noise ratio (SNR), it is a challenge to detect small targets in earth limb background, especially for the detection of point-like targets from a single frame. To improve the target detection, track before detection (TBD) based on the frame sequence is performed. In this paper, a new technique is proposed to determine the target associated trajectories, which combines background removal, maximum value projection (MVP), and the Hough transform. The background of the bright earth limb in the observation images is removed according to the profile characteristics. For a moving target, the corresponding pixels in the MVP image shift approximately regularly over the time sequence, and the target trajectory is then determined by the Hough transform according to the pixel characteristics of the target and the clutter and noise. Compared with traditional frame-by-frame methods, determining associated trajectories from MVP reduces the computation load. Numerical simulations are presented to demonstrate the effectiveness of the approach proposed.
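
    The maximum value projection step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: frames are plain nested lists that have already had the background removed, the function name is hypothetical, and the subsequent Hough transform is not shown.

```python
def max_value_projection(frames):
    """Collapse a frame sequence into per-pixel maxima.

    Returns (mvp, when): mvp holds the maximum value seen at each pixel,
    and when holds the index of the frame in which that maximum occurred.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    mvp = [[frames[0][r][c] for c in range(cols)] for r in range(rows)]
    when = [[0] * cols for _ in range(rows)]
    for t, frame in enumerate(frames[1:], start=1):
        for r in range(rows):
            for c in range(cols):
                if frame[r][c] > mvp[r][c]:
                    mvp[r][c] = frame[r][c]
                    when[r][c] = t
    return mvp, when

# Toy sequence: a point target of intensity 9 moving along the diagonal.
size = 4
frames = []
for t in range(size):
    frame = [[0] * size for _ in range(size)]
    frame[t][t] = 9
    frames.append(frame)

mvp, when = max_value_projection(frames)
for row in mvp:
    print(row)
```

    In the projected image the moving target appears as a line of bright pixels whose frame indices increase along the track, which is the regularity a line-detection step such as the Hough transform can then exploit.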

  10. Quantification of the cerebrospinal fluid from a new whole body MRI sequence

    NASA Astrophysics Data System (ADS)

    Lebret, Alain; Petit, Eric; Durning, Bruno; Hodel, Jérôme; Rahmouni, Alain; Decq, Philippe

    2012-03-01

    Our work aims to develop a biomechanical model of hydrocephalus intended both to support clinical research and to assist the neurosurgeon in diagnostic decisions. Recently, we have defined a new MR imaging sequence based on SPACE (Sampling Perfection with Application optimized Contrast using different flip-angle Evolution). On these images, the cerebrospinal fluid (CSF) appears as a homogeneous hypersignal. Therefore such images are suitable for segmentation and for volume assessment of the CSF. In this paper we present a fully automatic 3D segmentation of such SPACE MRI sequences. We choose a topological approach considering that CSF can be modeled as a simply connected object (i.e. a filled sphere). First an initial object, which must be strictly included in the CSF and homotopic to a filled sphere, is determined by using a moment-preserving thresholding. Then a priority function based on a Euclidean distance map is computed in order to control the thickening process that adds "simple points" to the initial thresholded object. A point is called simple if adding or removing it changes the topology of neither the object nor the background. The method is validated by measuring fluid volume of brain phantoms and by comparing our volume assessments on clinical data to those derived from a segmentation controlled by expert physicians. Then we show that a distinction between pathological cases and healthy adult people can be achieved by a linear discriminant analysis on volumes of the ventricular and intracranial subarachnoid spaces.

  11. The glycan-specific sulfotransferase (R77W)GalNAc-4-ST1 putatively responsible for peeling skin syndrome has normal properties consistent with a simple sequence polymorphism.

    PubMed

    Fiete, Dorothy; Mi, Yiling; Beranek, Mary; Baenziger, Nancy L; Baenziger, Jacques U

    2017-05-01

    Expanded access to DNA sequencing now fosters ready detection of site-specific human genome alterations whose actual significance requires in-depth functional study to rule in or out disease-causing mutations. This is a particular concern for genomic sequence differences in glycosyltransferases, whose implications are often difficult to assess. A recent whole-exome sequencing study identifies (c.229 C > T) in the GalNAc-4-ST1 glycosyltransferase (CHST8) as a disease-causing missense R77W mutation yielding the genodermatosis peeling skin syndrome (PSS) when homozygous. Cabral et al. (Genomics. 2012;99:202-208) cite this sequence change as reducing keratinocyte GalNAc-4-ST1 activity, thus decreasing glycosaminoglycan sulfation, as the mechanism for this blistering disorder. Such an identification could point toward potential clinical and/or prenatal diagnosis of a harmful medical condition. However, GalNAc-4-ST1 has minimal activity toward glycosaminoglycans, instead modifying terminal β1,4-linked GalNAc on N- and O-linked oligosaccharides on specific glycoproteins. We find expression, processing and catalytic activity of GalNAc-4-ST1 completely equivalent between wild type and (R77W) sulfotransferases. Moreover, keratinocytes have little or no GalNAc-4-ST1 mRNA, indicating that they do not express GalNAc-4-ST1. In addition, loss-of-function of GalNAc-4-ST1 primarily presents as reproductive system aberrations rather than skin effects. These findings, an allele frequency of 0.004357, and a 10-fold difference in prevalence of CHST8 (c.229 C > T, R77W) across different ethnic groups, suggest that this sequence represents a "passenger" distributed polymorphism, a simple sequence variant form of the enzyme having normal activity, rather than a "driver" disease-causing mutation that accounts for PSS. This study presents an example for guiding biomedical research initiatives, as well as medical and personal/family perspectives, regarding newly-identified genomic sequence

  12. Observable tensor-to-scalar ratio and secondary gravitational wave background

    NASA Astrophysics Data System (ADS)

    Chatterjee, Arindam; Mazumdar, Anupam

    2018-03-01

    In this paper we will highlight how a simple vacuum energy dominated inflection-point inflation can match the current data from cosmic microwave background radiation, and predict a large primordial tensor-to-scalar ratio, $r \sim \mathcal{O}(10^{-3}-10^{-2})$, with an observable second-order gravitational wave background, which is potentially detectable by future experiments, such as DECi-hertz Interferometer Gravitational wave Observatory (DECIGO), Laser Interferometer Space Antenna (eLISA), cosmic explorer (CE), and big bang observatory (BBO).

  13. Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold

    PubMed Central

    Li, Weizhong; Lopez, Rodrigo

    2017-01-01

    Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic of a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999
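
    The query-seeding step described above can be sketched on a single pairwise alignment. This is a minimal illustration, not the psisearch code: `seed_with_query` is a hypothetical helper, the alignment strings are invented, and dropping columns where the query itself has a gap is an assumption about how a query-based MSA row is formed.

```python
def seed_with_query(aligned_query, aligned_subject, gap="-"):
    """Fill gaps in the aligned subject with the query residue at that column.

    Returns the subject's row in a query-based MSA, one character per
    query residue.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    seeded = []
    for q, s in zip(aligned_query, aligned_subject):
        if q == gap:
            # Insertion relative to the query: assumed not to occupy a
            # column in the query-based MSA, so it is dropped.
            continue
        seeded.append(q if s == gap else s)
    return "".join(seeded)

# Invented pairwise alignment between a query and one library sequence.
query = "MKTA-YIAK"
subject = "MR--GYLAK"
row = seed_with_query(query, subject)
print(row)
```

    Every column of the resulting row carries a residue, so gapped regions contribute the query's own residues to the model instead of leaving those positions to be shaped by other, possibly unrelated, sequences.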

  14. Neural representations and mechanisms for the performance of simple speech sequences

    PubMed Central

    Bohland, Jason W.; Bullock, Daniel; Guenther, Frank H.

    2010-01-01

    Speakers plan the phonological content of their utterances prior to their release as speech motor acts. Using a finite alphabet of learned phonemes and a relatively small number of syllable structures, speakers are able to rapidly plan and produce arbitrary syllable sequences that fall within the rules of their language. The class of computational models of sequence planning and performance termed competitive queuing (CQ) models have followed Lashley (1951) in assuming that inherently parallel neural representations underlie serial action, and this idea is increasingly supported by experimental evidence. In this paper we develop a neural model that extends the existing DIVA model of speech production in two complementary ways. The new model includes paired structure and content subsystems (cf. MacNeilage, 1998) that provide parallel representations of a forthcoming speech plan, as well as mechanisms for interfacing these phonological planning representations with learned sensorimotor programs to enable stepping through multi-syllabic speech plans. On the basis of previous reports, the model’s components are hypothesized to be localized to specific cortical and subcortical structures, including the left inferior frontal sulcus, the medial premotor cortex, the basal ganglia and thalamus. The new model, called GODIVA (Gradient Order DIVA), thus fills a void in current speech research by providing formal mechanistic hypotheses about both phonological and phonetic processes that are grounded by neuroanatomy and physiology. This framework also generates predictions that can be tested in future neuroimaging and clinical case studies. PMID:19583476

  15. Double-modulation spectroscopy of molecular ions - Eliminating the background in velocity-modulation spectroscopy

    NASA Technical Reports Server (NTRS)

    Lan, Guang; Tholl, Hans Dieter; Farley, John W.

    1991-01-01

    Velocity-modulation spectroscopy is an established technique for performing laser absorption spectroscopy of molecular ions in a discharge. However, such experiments are often plagued by a coherent background signal arising from emission from the discharge or from electronic pickup. Fluctuations in the background can obscure the desired signal. A simple technique using amplitude modulation of the laser and two lock-in amplifiers in series to detect the signal is demonstrated. The background and background fluctuations are thereby eliminated, facilitating the detection of molecular ions.

  16. Massively Parallel Sequencing Reveals the Complex Structure of an Irradiated Human Chromosome on a Mouse Background in the Tc1 Model of Down Syndrome

    PubMed Central

    Clayton, Stephen; Prigmore, Elena; Langley, Elizabeth; Yang, Fengtang; Maguire, Sean; Fu, Beiyuan; Rajan, Diana; Sheppard, Olivia; Scott, Carol; Hauser, Heidi; Stephens, Philip J.; Stebbings, Lucy A.; Ng, Bee Ling; Fitzgerald, Tomas; Quail, Michael A.; Banerjee, Ruby; Rothkamm, Kai; Tybulewicz, Victor L. J.; Fisher, Elizabeth M. C.; Carter, Nigel P.

    2013-01-01

    Down syndrome (DS) is caused by trisomy of chromosome 21 (Hsa21) and presents a complex phenotype that arises from abnormal dosage of genes on this chromosome. However, the individual dosage-sensitive genes underlying each phenotype remain largely unknown. To help dissect genotype – phenotype correlations in this complex syndrome, the first fully transchromosomic mouse model, the Tc1 mouse, which carries a copy of human chromosome 21 was produced in 2005. The Tc1 strain is trisomic for the majority of genes that cause phenotypes associated with DS, and this freely available mouse strain has become used widely to study DS, the effects of gene dosage abnormalities, and the effect on the basic biology of cells when a mouse carries a freely segregating human chromosome. Tc1 mice were created by a process that included irradiation microcell-mediated chromosome transfer of Hsa21 into recipient mouse embryonic stem cells. Here, the combination of next generation sequencing, array-CGH and fluorescence in situ hybridization technologies has enabled us to identify unsuspected rearrangements of Hsa21 in this mouse model; revealing one deletion, six duplications and more than 25 de novo structural rearrangements. Our study is not only essential for informing functional studies of the Tc1 mouse but also (1) presents for the first time a detailed sequence analysis of the effects of gamma radiation on an entire human chromosome, which gives some mechanistic insight into the effects of radiation damage on DNA, and (2) overcomes specific technical difficulties of assaying a human chromosome on a mouse background where highly conserved sequences may confound the analysis. Sequence data generated in this study is deposited in the ENA database, Study Accession number: ERP000439. PMID:23596509

  17. A Practical Workshop for Generating Simple DNA Fingerprints of Plants

    ERIC Educational Resources Information Center

    Rouziere, A.-S.; Redman, J. E.

    2011-01-01

    Gel electrophoresis DNA fingerprints offer a graphical and visually appealing illumination of the similarities and differences between DNA sequences of different species and individuals. A polymerase chain reaction (PCR) and restriction digest protocol was designed to give high-school students the opportunity to generate simple fingerprints of…

  18. Optimal Background Estimators in Single-Molecule FRET Microscopy.

    PubMed

    Preus, Søren; Hildebrandt, Lasse L; Birkedal, Victoria

    2016-09-20

    Single-molecule total internal reflection fluorescence (TIRF) microscopy constitutes an umbrella of powerful tools that facilitate direct observation of the biophysical properties, population heterogeneities, and interactions of single biomolecules without the need for ensemble synchronization. Due to the low signal/noise ratio in single-molecule TIRF microscopy experiments, it is important to determine the local background intensity, especially when the fluorescence intensity of the molecule is used quantitatively. Here we compare and evaluate the performance of different aperture-based background estimators used particularly in single-molecule Förster resonance energy transfer. We introduce the general concept of multiaperture signatures and use this technique to demonstrate how the choice of background can affect the measured fluorescence signal considerably. A new, to our knowledge, and simple background estimator is proposed, called the local statistical percentile (LSP). We show that the LSP background estimator performs as well as current background estimators at low molecular densities and significantly better in regions of high molecular densities. The LSP background estimator is thus suited for single-particle TIRF microscopy of dense biological samples in which the intensity itself is an observable of the technique. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.
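
    The general idea of a percentile-based local background estimate can be sketched as follows. This is a hedged illustration only and not the LSP estimator as published: the square window, its half-width, the percentile value, and the function name `percentile_background` are all assumptions made for the example.

```python
def percentile_background(image, row, col, half_width=3, percentile=50.0):
    """Estimate local background as a percentile of pixel values in a window.

    image is a list of rows (lists of numeric pixel intensities).
    The window shape and default percentile are illustrative assumptions,
    not parameters taken from the abstract.
    """
    n_rows, n_cols = len(image), len(image[0])
    # Clip the square window to the image borders.
    r0, r1 = max(0, row - half_width), min(n_rows, row + half_width + 1)
    c0, c1 = max(0, col - half_width), min(n_cols, col + half_width + 1)
    values = sorted(image[r][c] for r in range(r0, r1) for c in range(c0, c1))
    # Percentile by linear interpolation between the two closest ranks.
    pos = (len(values) - 1) * percentile / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return values[lo] * (1 - frac) + values[hi] * frac
```

    A percentile (rather than a mean) is used in this sketch so that a few bright neighbouring spots inside the window pull the estimate up less, which is the kind of situation the abstract describes for dense samples.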

  19. Low-pass sequencing for microbial comparative genomics

    PubMed Central

    Goo, Young Ah; Roach, Jared; Glusman, Gustavo; Baliga, Nitin S; Deutsch, Kerry; Pan, Min; Kennedy, Sean; DasSarma, Shiladitya; Victor Ng, Wailap; Hood, Leroy

    2004-01-01

    Background We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggest that genome structure and dynamic genome reorganization might be similar to that previously observed in the IS-element rich
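
    The GC content statistic reported above can be computed from any set of reads or contigs. The following is a minimal sketch, assuming plain nucleotide strings as input; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions for illustration.

```python
def gc_content(sequence):
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        return 0.0  # no unambiguous bases to count
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)
```

    Applied to each genome's pooled single-pass reads, a value above 60 would correspond to the "greater than sixty percent GC content" noted in the abstract.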

  20. Arbitrarily accurate twin composite π-pulse sequences

    NASA Astrophysics Data System (ADS)

    Torosov, Boyan T.; Vitanov, Nikolay V.

    2018-04-01

    We present three classes of symmetric broadband composite pulse sequences. The composite phases are given by analytic formulas (rational fractions of π) valid for any number of constituent pulses. The transition probability is expressed by simple analytic formulas and the order of pulse area error compensation grows linearly with the number of pulses. Therefore, any desired compensation order can be produced by an appropriate composite sequence; in this sense, they are arbitrarily accurate. These composite pulses perform equally well as or better than previously published ones. Moreover, the current sequences are more flexible as they allow total pulse areas of arbitrary integer multiples of π.

  1. Rapid Whole-Genome Sequencing for Investigation of a Neonatal MRSA Outbreak

    PubMed Central

    Köser, Claudio U.; Holden, Matthew T.G.; Ellington, Matthew J.; Cartwright, Edward J.P.; Brown, Nicholas M.; Ogilvy-Stuart, Amanda L.; Hsu, Li Yang; Chewapreecha, Claire; Croucher, Nicholas J.; Harris, Simon R.; Sanders, Mandy; Enright, Mark C.; Dougan, Gordon; Bentley, Stephen D.; Parkhill, Julian; Fraser, Louise J.; Betley, Jason R.; Schulz-Trieglaff, Ole B.; Smith, Geoffrey P.; Peacock, Sharon J.

    2013-01-01

    Background Isolates of methicillin-resistant Staphylococcus aureus (MRSA) belonging to a single lineage are often indistinguishable by means of current typing techniques. Whole-genome sequencing may provide improved resolution to define transmission pathways and characterize outbreaks. Methods We investigated a putative MRSA outbreak in a neonatal intensive care unit. By using rapid high-throughput sequencing technology with a clinically relevant turnaround time, we retrospectively sequenced the DNA from seven isolates associated with the outbreak and another seven MRSA isolates associated with carriage of MRSA or bacteremia in the same hospital. Results We constructed a phylogenetic tree by comparing single-nucleotide polymorphisms (SNPs) in the core genome to a reference genome (an epidemic MRSA clone, EMRSA-15 [sequence type 22]). This revealed a distinct cluster of outbreak isolates and clear separation between these and the nonoutbreak isolates. A previously missed transmission event was detected between two patients with bacteremia who were not part of the outbreak. We created an artificial “resistome” of antibiotic-resistance genes and demonstrated concordance between it and the results of phenotypic susceptibility testing; we also created a “toxome” consisting of toxin genes. One outbreak isolate had a hypermutator phenotype with a higher number of SNPs than the other outbreak isolates, highlighting the difficulty of imposing a simple threshold for the number of SNPs between isolates to decide whether they are part of a recent transmission chain. Conclusions Whole-genome sequencing can provide clinically relevant data within a time frame that can influence patient care. The need for automated data interpretation and the provision of clinically meaningful reports represent hurdles to clinical implementation. (Funded by the U.K. Clinical Research Collaboration Translational Infection Research Initiative and others.) PMID:22693998
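
    The pairwise SNP comparison that underlies such a tree can be sketched as below. This is an illustrative sketch, assuming each isolate is already represented as an equal-length string of core-genome bases aligned to the reference; the function names and the toy isolate labels are hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ at unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def pairwise_snp_distances(alignment):
    """Return {(name1, name2): SNP count} for every pair of isolates."""
    return {
        (n1, n2): snp_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }
```

    As the abstract notes for the hypermutator isolate, a fixed cut-off on these counts is hard to justify, so the distances are better read alongside the tree than thresholded on their own.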

  2. Development and transferability of black and red raspberry microsatellite markers from short-read sequences

    USDA-ARS's Scientific Manuscript database

    The advent of next-generation sequencing technologies has been a boon to the cost-effective development of molecular markers, particularly in non-model species. Here, we demonstrate the efficiency of microsatellite or simple sequence repeat (SSR) marker development from short-read sequences using th...

  3. Background Checks on School Personnel. ERIC Digest Series EA 55.

    ERIC Educational Resources Information Center

    Baas, Alan

    Although it is relatively simple to check on applicants' basic professional competency, ensuring the moral competency of potential school employees is much more difficult. This digest examines major legal issues, district liabilities and responsibilities, suggested guidelines, and information sources involving employee background checks. Of more…

  4. Development of highly polymorphic simple sequence repeat markers using genome-wide microsatellite variant analysis in Foxtail millet [Setaria italica (L.) P. Beauv.]

    PubMed Central

    2014-01-01

    Background Foxtail millet (Setaria italica (L.) Beauv.) is an important gramineous grain-food and forage crop. It is grown worldwide for human and livestock consumption. Its small genome and diploid nature have led to foxtail millet fast becoming a novel model for investigating plant architecture, drought tolerance and C4 photosynthesis of grain and bioenergy crops. Therefore, cost-effective, reliable and highly polymorphic molecular markers covering the entire genome are required for diversity, mapping and functional genomics studies in this model species. Result A total of 5,020 highly repetitive microsatellite motifs were isolated from the released genome of the genotype ‘Yugu1’ by sequence scanning. Based on sequence comparison between S. italica and S. viridis, a set of 788 SSR primer pairs were designed. Of these primers, 733 produced reproducible amplicons and were polymorphic among 28 Setaria genotypes selected from diverse geographical locations. The number of alleles detected by these SSR markers ranged from 2 to 16, with an average polymorphism information content of 0.67. The result obtained by neighbor-joining cluster analysis of 28 Setaria genotypes, based on Nei’s genetic distance of the SSR data, showed that these SSR markers are highly polymorphic and effective. Conclusions A large set of highly polymorphic SSR markers were successfully and efficiently developed based on genomic sequence comparison between different genotypes of the genus Setaria. The large number of new SSR markers and their placement on the physical map represent a valuable resource for studying diversity, constructing genetic maps, functional gene mapping, QTL exploration and molecular breeding in foxtail millet and its closely related species. PMID:24472631
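
    For readers unfamiliar with polymorphism information content (PIC), the sketch below computes it for a single marker from its allele frequencies. It assumes the commonly used Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the abstract does not state which formula the authors applied, so treat this as an assumption.

```python
def polymorphism_information_content(freqs):
    """PIC for one marker from allele frequencies that sum to 1.

    Uses the Botstein-style formula (an assumption; the source abstract
    does not say which PIC definition was used).
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    n = len(freqs)
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - homozygosity - cross_term
```

    Under this definition a marker with two equally frequent alleles scores 0.375, so an average of 0.67 implies that many markers carry several alleles at appreciable frequency.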

  5. Genome Annotation Generator: a simple tool for generating and correcting WGS annotation tables for NCBI submission

    PubMed Central

    Hall, Brian; Derego, Theodore; Bremer, Forest T; Cannoles, Kyle

    2018-01-01

    Background One of the most overlooked, yet critical, components of a whole genome sequencing (WGS) project is the submission and curation of the data to a genomic repository, most commonly the National Center for Biotechnology Information (NCBI). While large genome centers or genome groups have developed software tools for post-annotation assembly filtering, annotation, and conversion into the NCBI’s annotation table format, these tools typically require back-end setup and connection to a Structured Query Language (SQL) database and/or some knowledge of programming (Perl, Python) to implement. With WGS becoming commonplace, genome sequencing projects are moving away from the genome centers and into the ecology or biology lab, where fewer resources are present to support the process of genome assembly curation. To fill this gap, we developed software to assess, filter, and transfer annotation and convert a draft genome assembly and annotation set into the NCBI annotation table (.tbl) format, facilitating submission to the NCBI Genome Assembly database. This software has no dependencies, is compatible across platforms, and utilizes a simple command to perform a variety of simple and complex post-analysis, pre-NCBI submission WGS project tasks. Findings The Genome Annotation Generator is a consistent and user-friendly bioinformatics tool that can be used to generate a .tbl file that is consistent with the NCBI submission pipeline. Conclusions The Genome Annotation Generator achieves the goal of providing a publicly available tool that will facilitate the submission of annotated genome assemblies to the NCBI. It is useful for any individual researcher or research group that wishes to submit a genome assembly of their study system to the NCBI. PMID:29635297

  6. Sequence Composition and Gene Content of the Short Arm of Rye (Secale cereale) Chromosome 1

    PubMed Central

    Fluch, Silvia; Kopecky, Dieter; Burg, Kornel; Šimková, Hana; Taudien, Stefan; Petzold, Andreas; Kubaláková, Marie; Platzer, Matthias; Berenyi, Maria; Krainer, Siegfried; Doležel, Jaroslav; Lelley, Tamas

    2012-01-01

    Background The purpose of the study is to elucidate the sequence composition of the short arm of rye chromosome 1 (Secale cereale) with special focus on its gene content, because this portion of the rye genome is an integrated part of several hundreds of bread wheat varieties worldwide. Methodology/Principal Findings Multiple Displacement Amplification of 1RS DNA, obtained from flow sorted 1RS chromosomes, using 1RS ditelosomic wheat-rye addition line, and subsequent Roche 454FLX sequencing of this DNA yielded 195,313,589 bp sequence information. This quantity of sequence information resulted in 0.43× sequence coverage of the 1RS chromosome arm, permitting the identification of genes with estimated probability of 95%. A detailed analysis revealed that more than 5% of the 1RS sequence consisted of gene space, identifying at least 3,121 gene loci representing 1,882 different gene functions. Repetitive elements comprised about 72% of the 1RS sequence, Gypsy/Sabrina (13.3%) being the most abundant. More than four thousand simple sequence repeat (SSR) sites mostly located in gene related sequence reads were identified for possible marker development. The existence of chloroplast insertions in 1RS has been verified by identifying chimeric chloroplast-genomic sequence reads. Synteny analysis of 1RS to the full genomes of Oryza sativa and Brachypodium distachyon revealed that about half of the genes of 1RS correspond to the distal end of the short arm of rice chromosome 5 and the proximal region of the long arm of Brachypodium distachyon chromosome 2. Comparison of the gene content of 1RS to 1HS barley chromosome arm revealed high conservation of genes related to chromosome 5 of rice. Conclusions The present study revealed the gene content and potential gene functions on this chromosome arm and demonstrated numerous sequence elements like SSRs and gene-related sequences, which can be utilised for future research as well as in breeding of wheat and rye. PMID:22328922
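
    The relation between sequence yield and coverage quoted above can be checked with a short calculation. The sketch assumes the usual definition, coverage = total sequenced bases / target size; the function name is hypothetical, and the resulting arm size is derived here from the rounded figures in the abstract rather than stated by the authors.

```python
def implied_target_size(total_bases, coverage):
    """Target size in bp implied by coverage = total_bases / target_size."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return total_bases / coverage


# Figures taken from the abstract: 195,313,589 bp at 0.43x coverage.
size_bp = implied_target_size(195_313_589, 0.43)
size_mb = round(size_bp / 1e6, 1)  # size in megabases, one decimal place
```

    With these inputs the implied size of the 1RS arm is on the order of 450 Mb; because 0.43 is itself rounded, the last digits of the result carry no real precision.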

  7. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    PubMed Central

    2011-01-01

    Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  8. Probabilistic BPRRC: Robust Change Detection against Illumination Changes and Background Movements

    NASA Astrophysics Data System (ADS)

    Yokoi, Kentaro

    This paper presents Probabilistic Bi-polar Radial Reach Correlation (PrBPRRC), a change detection method that is robust against illumination changes and background movements. Most of the traditional change detection methods are robust against either illumination changes or background movements; BPRRC is one of the illumination-robust change detection methods. We introduce a probabilistic background texture model into BPRRC and add robustness against background movements, including foreground invasions such as moving cars, walking people, swaying trees, and falling snow. We show the superiority of PrBPRRC in environments with illumination changes and background movements by using three public datasets and one private dataset: ATON Highway data, Karlsruhe traffic sequence data, PETS 2007 data, and Walking-in-a-room data.

  9. Correlated perturbations from inflation and the cosmic microwave background.

    PubMed

    Amendola, Luca; Gordon, Christopher; Wands, David; Sasaki, Misao

    2002-05-27

    We compare the latest cosmic microwave background data with theoretical predictions including correlated adiabatic and cold dark matter (CDM) isocurvature perturbations with a simple power-law dependence. We find that there is a degeneracy between the amplitude of correlated isocurvature perturbations and the spectral tilt. A negative (red) tilt is found to be compatible with a larger isocurvature contribution. Estimates of the baryon and CDM densities are found to be almost independent of the isocurvature amplitude. The main result is that current microwave background data do not exclude a dominant contribution from CDM isocurvature fluctuations on large scales.

  10. The Joint Effects of Background Selection and Genetic Recombination on Local Gene Genealogies

    PubMed Central

    Zeng, Kai; Charlesworth, Brian

    2011-01-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data. PMID:21705759

  11. The joint effects of background selection and genetic recombination on local gene genealogies.

    PubMed

    Zeng, Kai; Charlesworth, Brian

    2011-09-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

  12. High-throughput sequencing of black pepper root transcriptome

    PubMed Central

    2012-01-01

    Background Black pepper (Piper nigrum L.) is one of the most popular spices in the world. It is used in cooking and the preservation of food and even has medicinal properties. Losses in production from disease are a major limitation in the culture of this crop. The major diseases are root rot and foot rot, which are results of root infection by Fusarium solani and Phytophthora capsici, respectively. Understanding the molecular interaction between the pathogens and the host’s root region is important for obtaining resistant cultivars by biotechnological breeding. Genetic and molecular data for this species, though, are limited. In this paper, RNA-Seq technology has been employed, for the first time, to describe the root transcriptome of black pepper. Results The root transcriptome of black pepper was sequenced by the NGS SOLiD platform and assembled using the multiple-k method. Blast2Go and orthoMCL methods were used to annotate 10338 unigenes. The 4472 predicted proteins showed about 52% homology with the Arabidopsis proteome. Two root proteomes identified 615 proteins, which seem to define the plant’s root pattern. Simple-sequence repeats were identified that may be useful in studies of genetic diversity and may have applications in biotechnology and ecology. Conclusions This dataset of 10338 unigenes is crucially important for the biotechnological breeding of black pepper and the ecogenomics of the Magnoliids, a major group of basal angiosperms. PMID:22984782
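
    How simple-sequence repeats can be located in assembled sequences is sketched below with a regular expression. This is a toy illustration, not the method used in the study: the motif length range (2 to 6 bp), the minimum of four tandem copies, and the function name `find_ssrs` are assumptions, and dedicated SSR-mining tools apply more careful, motif-length-specific criteria.

```python
import re


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats.

    Thresholds are illustrative assumptions. Homopolymer runs are skipped
    because they would otherwise be reported as repeats of 'AA', 'CC', etc.
    """
    seq = sequence.upper()
    # Lazy motif match so the shortest repeating unit is reported;
    # \1{k,} requires at least k further copies of the captured motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a di- to hexanucleotide SSR
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits
```

    Each hit gives the 0-based start position, the repeat unit, and the number of tandem copies, which is the information needed to pick flanking regions for primer design.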

  13. Effects of different preservation methods on inter simple sequence repeat (ISSR) and random amplified polymorphic DNA (RAPD) molecular markers in botanic samples.

    PubMed

    Wang, Xiaolong; Li, Lin; Zhao, Jiaxin; Li, Fangliang; Guo, Wei; Chen, Xia

    2017-04-01

    To evaluate the effects of different preservation methods (stored in a -20°C ice chest, preserved in liquid nitrogen and dried in silica gel) on inter simple sequence repeat (ISSR) or random amplified polymorphic DNA (RAPD) analyses in various botanical specimens (including broad-leaved plants, needle-leaved plants and succulent plants) for different times (three weeks and three years), we used a statistical analysis based on the number of bands, genetic index and cluster analysis. The results demonstrate that methods used to preserve samples can provide sufficient amounts of genomic DNA for ISSR and RAPD analyses; however, the effect of different preservation methods on these analyses vary significantly, and the preservation time has little effect on these analyses. Our results provide a reference for researchers to select the most suitable preservation method depending on their study subject for the analysis of molecular markers based on genomic DNA. Copyright © 2017 Académie des sciences. Published by Elsevier Masson SAS. All rights reserved.

  14. Molecular characterization of three common olive (Olea europaea L.) cultivars in Palestine, using simple sequence repeat (SSR) markers.

    PubMed

    Obaid, Ramiz; Abu-Qaoud, Hassan; Arafeh, Rami

    2014-09-03

    Eight accessions of olive trees from three common varieties in Palestine, Nabali Baladi, Nabali Mohassan and Surri, were genetically evaluated using five simple sequence repeat (SSR) markers. A total of 17 alleles from 5 loci were observed in which 15 (88.2%) were polymorphic and 2 (11.8%) were monomorphic. An average of 3.4 alleles per locus was found ranging from 2.0 alleles with the primers GAPU-103 and DCA-9 to 5.0 alleles with U9932 and DCA-16. The smallest amplicon size observed was 50 bp with the primer DCA-16, whereas the largest one (450 bp) with the primer U9932. Cluster analysis with the unweighted pair group method with arithmetic average (UPGMA) showed three clusters: a cluster with four accessions from the 'Nabali Baladi' cultivar, another cluster with three accessions that represents the 'Nabali Mohassen' cultivar and finally the 'Surri' cultivar. The similarity coefficient for the eight olive tree samples ranged from a maximum of 100% between two accessions from Nabali Baladi and also in two other samples from Nabali Mohassan, to a minimum similarity coefficient (0.315) between the Surri and two Nabali Baladi accessions. The results in this investigation clearly highlight the genetic dissimilarity between the three main olive cultivars that have been misidentified and mixed up in the past, based on conventional morphological characters.
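
    The summary statistics in this abstract follow directly from the allele counts, as the short check below shows. The function name and dictionary keys are hypothetical; the inputs (17 alleles, 5 loci, 15 polymorphic alleles) are the figures reported above.

```python
def allele_summary(total_alleles, n_loci, polymorphic_alleles):
    """Mean alleles per locus and percent polymorphic/monomorphic alleles."""
    monomorphic = total_alleles - polymorphic_alleles
    return {
        "mean_alleles_per_locus": total_alleles / n_loci,
        "percent_polymorphic": 100.0 * polymorphic_alleles / total_alleles,
        "percent_monomorphic": 100.0 * monomorphic / total_alleles,
    }


# Counts as reported in the abstract.
summary = allele_summary(total_alleles=17, n_loci=5, polymorphic_alleles=15)
```

    Rounded to one decimal place, the values reproduce the reported 3.4 alleles per locus, 88.2% polymorphic and 11.8% monomorphic.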

  15. Molecular characterization of three common olive (Olea europaea L.) cultivars in Palestine, using simple sequence repeat (SSR) markers

    PubMed Central

    Obaid, Ramiz; Abu-Qaoud, Hassan; Arafeh, Rami

    2014-01-01

    Eight accessions of olive trees from three common varieties in Palestine, Nabali Baladi, Nabali Mohassan and Surri, were genetically evaluated using five simple sequence repeat (SSR) markers. A total of 17 alleles from 5 loci were observed in which 15 (88.2%) were polymorphic and 2 (11.8%) were monomorphic. An average of 3.4 alleles per locus was found ranging from 2.0 alleles with the primers GAPU-103 and DCA-9 to 5.0 alleles with U9932 and DCA-16. The smallest amplicon size observed was 50 bp with the primer DCA-16, whereas the largest one (450 bp) with the primer U9932. Cluster analysis with the unweighted pair group method with arithmetic average (UPGMA) showed three clusters: a cluster with four accessions from the ‘Nabali Baladi’ cultivar, another cluster with three accessions that represents the ‘Nabali Mohassen’ cultivar and finally the ‘Surri’ cultivar. The similarity coefficient for the eight olive tree samples ranged from a maximum of 100% between two accessions from Nabali Baladi and also in two other samples from Nabali Mohassan, to a minimum similarity coefficient (0.315) between the Surri and two Nabali Baladi accessions. The results in this investigation clearly highlight the genetic dissimilarity between the three main olive cultivars that have been misidentified and mixed up in the past, based on conventional morphological characters. PMID:26019564

  16. Diversity and genetic stability in banana genotypes in a breeding program using inter simple sequence repeats (ISSR) markers.

    PubMed

    Silva, A V C; Nascimento, A L S; Vitória, M F; Rabbani, A R C; Soares, A N R; Lédo, A S

    2017-02-23

    Banana (Musa spp.) is a fruit species frequently cultivated and consumed worldwide. Molecular markers are important for estimating genetic diversity in germplasm and between genotypes in breeding programs. The objective of this study was to analyze the genetic diversity of 21 banana genotypes (FHIA 23, PA42-44, Maçã, Pacovan Ken, Bucaneiro, YB42-47, Grand Naine, Tropical, FHIA 18, PA94-01, YB42-17, Enxerto, Japira, Pacovã, Prata-Anã, Maravilha, PV79-34, Caipira, Princesa, Garantida, and Thap Maeo), by using inter-simple sequence repeat (ISSR) markers. Material was generated from the banana breeding program of Embrapa Cassava & Fruits and evaluated at Embrapa Coastal Tablelands. The 12 primers used in this study generated 97.5% polymorphism. Four clusters were identified among the different genotypes studied, and the sum of the first two principal components was 48.91%. From the Unweighted Pair Group Method using Arithmetic averages (UPGMA) dendrogram, it was possible to identify two main clusters and subclusters. Two genotypes (Garantida and Thap Maeo) remained isolated from the others, both in the UPGMA clustering and in the principal coordinate analysis (PCoA). Using ISSR markers, we could analyze the genetic diversity of the studied material and state that these markers were efficient at detecting sufficient polymorphism to estimate the genetic variability in banana genotypes.

  17. Comparative Evaluation of Background Subtraction Algorithms in Remote Scene Videos Captured by MWIR Sensors

    PubMed Central

    Yao, Guangle; Lei, Tao; Zhong, Jiandan; Jiang, Ping; Jia, Wenwu

    2017-01-01

    Background subtraction (BS) is one of the most commonly encountered tasks in video analysis and tracking systems. It distinguishes the foreground (moving objects) from the video sequences captured by static imaging sensors. Background subtraction in remote scene infrared (IR) video is important and common to many fields. This paper provides a Remote Scene IR Dataset captured by our designed medium-wave infrared (MWIR) sensor. Each video sequence in this dataset is identified with specific BS challenges and the pixel-wise ground truth of foreground (FG) for each frame is also provided. A series of experiments were conducted to evaluate BS algorithms on this proposed dataset. The overall performance of BS algorithms and the processor/memory requirements were compared. Proper evaluation metrics or criteria were employed to evaluate the capability of each BS algorithm to handle different kinds of BS challenges represented in this dataset. The results and conclusions in this paper provide valid references for developing new BS algorithms for remote scene IR video sequences, and some of them are not limited to remote scene or IR video sequences but are also generic for background subtraction. The Remote Scene IR dataset and the foreground masks detected by each evaluated BS algorithm are available online: https://github.com/JerryYaoGl/BSEvaluationRemoteSceneIR. PMID:28837112
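
    As a point of reference for what a background subtraction algorithm does, the sketch below implements a generic running-average baseline. It is not claimed to be one of the algorithms evaluated in the paper; the learning rate `alpha`, the intensity `threshold`, and the function name are assumptions, and frames are plain nested lists of grayscale values to keep the example dependency-free.

```python
def running_average_foreground(frames, alpha=0.1, threshold=20.0):
    """Return one binary foreground mask per frame after the first.

    The first frame initialises the background model. A pixel is marked
    foreground (1) when it differs from the model by more than threshold;
    the model is updated only at pixels judged to be background.
    """
    background = [[float(v) for v in row] for row in frames[0]]
    masks = []
    for frame in frames[1:]:
        mask = []
        for r, row in enumerate(frame):
            mask_row = []
            for c, value in enumerate(row):
                is_foreground = abs(value - background[r][c]) > threshold
                mask_row.append(1 if is_foreground else 0)
                if not is_foreground:
                    # Exponential moving average of the background intensity.
                    background[r][c] = (
                        (1 - alpha) * background[r][c] + alpha * value
                    )
            mask.append(mask_row)
        masks.append(mask)
    return masks
```

    Comparing masks like these against pixel-wise ground truth, frame by frame, is the kind of evaluation the dataset described above is meant to support.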

  18. Folding and Stabilization of Native-Sequence-Reversed Proteins

    PubMed Central

    Zhang, Yuanzhao; Weber, Jeffrey K; Zhou, Ruhong

    2016-01-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols. PMID:27113844

  19. Folding and Stabilization of Native-Sequence-Reversed Proteins

    NASA Astrophysics Data System (ADS)

    Zhang, Yuanzhao; Weber, Jeffrey K.; Zhou, Ruhong

    2016-04-01

    Though the problem of sequence-reversed protein folding is largely unexplored, one might speculate that reversed native protein sequences should be significantly more foldable than purely random heteropolymer sequences. In this article, we investigate how the reverse-sequences of native proteins might fold by examining a series of small proteins of increasing structural complexity (α-helix, β-hairpin, α-helix bundle, and α/β-protein). Employing a tandem protein structure prediction algorithmic and molecular dynamics simulation approach, we find that the ability of reverse sequences to adopt native-like folds is strongly influenced by protein size and the flexibility of the native hydrophobic core. For β-hairpins with reverse-sequences that fail to fold, we employ a simple mutational strategy for guiding stable hairpin formation that involves the insertion of amino acids into the β-turn region. This systematic look at reverse sequence duality sheds new light on the problem of protein sequence-structure mapping and may serve to inspire new protein design and protein structure prediction protocols.

  20. Navy LPD-17 Amphibious Ship Procurement: Background, Issues, and Options for Congress

    DTIC Science & Technology

    2010-07-01

    Background, Issues, and Options for Congress ...performed out of sequence and significant rework has been required, disrupting the optimal construction sequence and application of lessons learned...deeply concerned about Northrop Grumman Ship Systems’ (NGSS) ability to recover in the aftermath of Hurricane Katrina, particularly in regard to

  1. Navy LPD-17 Amphibious Ship Procurement: Background, Issues, and Options for Congress

    DTIC Science & Technology

    2010-06-10

    Background, Issues, and Options for Congress ...out of sequence and significant rework has been required, disrupting the optimal construction sequence and application of lessons learned for...concerned about Northrop Grumman Ship Systems’ (NGSS) ability to recover in the aftermath of Hurricane Katrina, particularly in regard to construction

  2. Evaluation of fire recurrence effect on genetic diversity in maritime pine (Pinus pinaster Ait.) stands using Inter-Simple Sequence Repeat profiles.

    PubMed

    Lucas-Borja, M E; Ahrazem, O; Candel-Pérez, D; Moya, D; Fonseca, T; Hernández Tecles, E; De Las Heras, J; Gómez-Gómez, L

    2016-12-01

    The management of maritime pine in fire-prone habitats is a challenging task, and fine-scale population genetic analyses are necessary to check whether different fire recurrences affect genetic variability. The objective of this study was to assess the effect of fire recurrence on maritime pine genetic diversity using inter-simple sequence repeat (ISSR) markers. Three maritime pine (Pinus pinaster Ait.) populations from Northern Portugal were chosen to characterize the genetic variability among populations. In relation to fire recurrence, the Seirós population was affected by fire in both 1990 and 2005, whereas the Vila Seca-2 population was affected by fire only in 2005. The Vila Seca-1 population has never been affected by fire. Our results showed the highest Nei's genetic diversity (He=0.320), Shannon information index (I=0.474) and percentage of polymorphic loci (PPL=87.79%) among samples from the twice-burned population (Seirós site). Thus, fire regime plays an important role in affecting genetic diversity in the short term, although without generating genetic erosion in maritime pine. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).

  4. Multiplexed microsatellite recovery using massively parallel sequencing

    Treesearch

    T.N. Jennings; B.J. Knaus; T.D. Mullins; S.M. Haig; R.C. Cronn

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of...

  5. Comparative study of the effectiveness and limitations of current methods for detecting sequence coevolution.

    PubMed

    Mao, Wenzhi; Kaya, Cihan; Dutta, Anindita; Horovitz, Amnon; Bahar, Ivet

    2015-06-15

    With rapid accumulation of sequence data on several species, extracting rational and systematic information from multiple sequence alignments (MSAs) is becoming increasingly important. Currently, there is a plethora of computational methods for investigating coupled evolutionary changes in pairs of positions along the amino acid sequence, and making inferences on structure and function. Yet, the significance of coevolution signals remains to be established. Also, a large number of false positives (FPs) arise from insufficient MSA size, phylogenetic background and indirect couplings. Here, a set of 16 pairs of non-interacting proteins is thoroughly examined to assess the effectiveness and limitations of different methods. The analysis shows that recent computationally expensive methods designed to remove biases from indirect couplings outperform others in detecting tertiary structural contacts as well as eliminating intermolecular FPs; whereas traditional methods such as mutual information benefit from refinements such as shuffling, while being highly efficient. Computations repeated with 2,330 pairs of protein families from the Negatome database corroborated these results. Finally, using a training dataset of 162 families of proteins, we propose a combined method that outperforms existing individual methods. Overall, the study provides simple guidelines towards the choice of suitable methods and strategies based on available MSA size and computing resources. Software is freely available through the Evol component of ProDy API. © The Author 2015. Published by Oxford University Press.

  6. BioWord: a sequence manipulation suite for Microsoft Word.

    PubMed

    Anzaldi, Laura J; Muñoz-Fernández, Daniel; Erill, Ivan

    2012-06-07

    The ability to manipulate, edit and process DNA and protein sequences has rapidly become a necessary skill for practicing biologists across a wide swath of disciplines. In spite of this, most everyday sequence manipulation tools are distributed across several programs and web servers, sometimes requiring installation and typically involving frequent switching between applications. To address this problem, here we have developed BioWord, a macro-enabled self-installing template for Microsoft Word documents that integrates an extensive suite of DNA and protein sequence manipulation tools. BioWord is distributed as a single macro-enabled template that self-installs with a single click. After installation, BioWord will open as a tab in the Office ribbon. Biologists can then easily manipulate DNA and protein sequences using a familiar interface and minimize the need to switch between applications. Beyond simple sequence manipulation, BioWord integrates functionality ranging from dyad search and consensus logos to motif discovery and pair-wise alignment. Written in Visual Basic for Applications (VBA) as an open source, object-oriented project, BioWord allows users with varying programming experience to expand and customize the program to better meet their own needs. BioWord integrates a powerful set of tools for biological sequence manipulation within a handy, user-friendly tab in a widely used word processing software package. The use of a simple scripting language and an object-oriented scheme facilitates customization by users and provides a very accessible educational platform for introducing students to basic bioinformatics algorithms.

  7. Flexible, Mastery-Oriented Astrophysics Sequence.

    ERIC Educational Resources Information Center

    Zeilik, Michael, II

    1981-01-01

    Describes the implementation and impact of a two-semester mastery-oriented astrophysics sequence for upper-level physics/astrophysics majors designed to handle flexibly a wide range of student backgrounds. A Personalized System of Instruction (PSI) format was used fostering frequent student-instructor interaction and role-modeling behavior in…

  8. Development of novel simple sequence repeat markers in bitter gourd (Momordica charantia L.) through enriched genomic libraries and their utilization in analysis of genetic diversity and cross-species transferability.

    PubMed

    Saxena, Swati; Singh, Archana; Archak, Sunil; Behera, Tushar K; John, Joseph K; Meshram, Sudhir U; Gaikwad, Ambika B

    2015-01-01

    Microsatellite or simple sequence repeat (SSR) markers are the preferred markers for genetic analyses of crop plants. The availability of a limited number of such markers in bitter gourd (Momordica charantia L.) necessitates the development and characterization of more SSR markers. These were developed from genomic libraries enriched for three dinucleotide, five trinucleotide, and two tetranucleotide core repeat motifs. Employing the strategy of polymerase chain reaction-based screening, the number of clones to be sequenced was reduced by 81 %, and 93.7 % of the sequenced clones contained microsatellite repeats. Unique primer-pairs were designed for 160 microsatellite loci, and amplicons of expected length were obtained for 151 loci (94.4 %). Evaluation of diversity in 54 bitter gourd accessions at 51 loci indicated that 20 % of the loci were polymorphic, with the polymorphic information content values ranging from 0.13 to 0.77. Fifteen Indian varieties were clearly distinguished, indicative of the usefulness of the developed markers. Markers at 40 loci (78.4 %) were transferable to six species, viz. Momordica cymbalaria, Momordica subangulata subsp. renigera, Momordica balsamina, Momordica dioica, Momordica cochinchinensis, and Momordica sahyadrica. The microsatellite markers reported will be useful in various genetic and molecular genetic studies in bitter gourd, a cucurbit of immense nutritive, medicinal, and economic importance.
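
    The abstract reports polymorphic information content (PIC) values without stating the formula. A minimal sketch follows, assuming the commonly used definition for codominant markers due to Botstein and colleagues, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at a locus. The function name and the example frequencies are hypothetical.

```python
def polymorphic_information_content(allele_freqs):
    """PIC of one locus from its allele frequencies (assumed to sum to 1)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity: probability of drawing the same allele twice.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for uninformative matings between like heterozygotes.
    cross = 0.0
    n = len(allele_freqs)
    for i in range(n):
        for j in range(i + 1, n):
            cross += 2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
    return 1.0 - homozygosity - cross


# Hypothetical loci: two equally frequent alleles, and four.
pic_two = polymorphic_information_content([0.5, 0.5])
pic_four = polymorphic_information_content([0.25, 0.25, 0.25, 0.25])
```

    Under this definition a monomorphic locus has PIC 0, and PIC rises as alleles become more numerous and more evenly distributed.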

  9. Alignment-free Transcriptomic and Metatranscriptomic Comparison Using Sequencing Signatures with Variable Length Markov Chains.

    PubMed

    Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu

    2016-11-23

    The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
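
    The exponential growth mentioned above can be made concrete. For a four-letter nucleotide alphabet, a fixed-order Markov chain of order k has 4^k contexts, each with three free transition probabilities (the fourth follows from normalization). The helper below is an illustrative sketch, not part of the authors' pipeline, and its name is hypothetical.

```python
def fomc_free_parameters(order, alphabet_size=4):
    """Number of free parameters in a fixed-order Markov chain.

    Each of the alphabet_size**order contexts carries a distribution
    over the next symbol with (alphabet_size - 1) free probabilities.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return alphabet_size ** order * (alphabet_size - 1)


# Parameter counts for a few orders over the nucleotide alphabet.
counts = {k: fomc_free_parameters(k) for k in (0, 1, 5, 10)}
```

    At order 10 the model already has more than three million free parameters, which illustrates why limited sequencing depth makes high fixed orders hard to estimate.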

  10. The Contribution of Short Repeats of Low Sequence Complexity to Large Conifer Genomes

    Treesearch

    A. Schmidt; R.L. Doudrick; J.S. Heslop-Harrison; T. Schmidt

    2000-01-01

    Abstract: The abundance and genomic organization of six simple sequence repeats, consisting of di-, tri-, and tetranucleotide sequence motifs, and a minisatellite repeat have been analyzed in different gymnosperms by Southern hybridization. Within the gymnosperm genomes investigated, the abundance and genomic organization of micro- and...

  11. Single molecule targeted sequencing for cancer gene mutation detection.

    PubMed

    Gao, Yan; Deng, Liwei; Yan, Qin; Gao, Yongqian; Wu, Zengding; Cai, Jinsen; Ji, Daorui; Li, Gailing; Wu, Ping; Jin, Huan; Zhao, Luyang; Liu, Song; Ge, Liangjin; Deem, Michael W; He, Jiankui

    2016-05-19

    With the rapid decline in the cost of sequencing, it is now affordable to examine multiple genes in a single disease-targeted clinical test using next generation sequencing. Current targeted sequencing methods require a separate step of targeted capture enrichment during sample preparation before sequencing. Although fast sample preparation methods are available on the market, the library preparation process is still relatively complicated for physicians to use routinely. Here, we introduce an amplification-free Single Molecule Targeted Sequencing (SMTS) technology, which combines targeted capture and sequencing in one step. We demonstrate that this technology can detect low-frequency mutations using an artificially synthesized DNA sample. SMTS has several potential advantages, including simple sample preparation, so that no PCR-induced biases or errors are introduced. SMTS has the potential to be an easy and quick sequencing technology for clinical diagnosis such as cancer gene mutation detection, infectious disease detection, inherited condition screening and noninvasive prenatal diagnosis.

  12. Background subtraction for fluorescence EXAFS data of a very dilute dopant Z in Z + 1 host.

    PubMed

    Medling, Scott; Bridges, Frank

    2011-07-01

    When conducting EXAFS at the Cu K-edge for ZnS:Cu with very low Cu concentration (<0.04% Cu), a large background was present that increased with energy. This background arises from a Zn X-ray Raman peak, which moves through the Cu fluorescence window, plus the tail of the Zn fluorescence peak. This large background distorts the EXAFS and must be removed separately before reducing the data. A simple means to remove this background is described.

  13. Estimate of Cosmic Muon Background for Shallow Underground Neutrino Detectors

    NASA Astrophysics Data System (ADS)

    Casimiro, E.; Simão, F. R. A.; Anjos, J. C.

    One of the severe limitations in detecting neutrino signals from nuclear reactors is that the copious cosmic ray background imposes the use of a time veto upon the passage of the muons to reduce the number of fake signals due to muon-induced spallation neutrons. For this reason neutrino detectors are usually located underground, with a large overburden. However there are practical limitations that do restrain from locating the detectors at large depths underground. In order to decide the depth underground at which the Neutrino Angra Detector (currently in preparation) should be installed, an estimate of the cosmogenic background in the detector as a function of the depth is required. We report here a simple analytical estimation of the muon rates in the detector volume for different plausible depths, assuming a simple plain overburden geometry. We extend the calculation to the case of the San Onofre neutrino detector and to the case of the Double Chooz neutrino detector, where other estimates or measurements have been performed. Our estimated rates are consistent.

  14. Diversity of viruses detected by deep sequencing in pigs from a common background

    USDA-ARS?s Scientific Manuscript database

    The trial was successful in identifying a number of viruses in the feces of the pigs demonstrating the application of this technology to determine the background noise in the animals. The findings in this study are similar to the fecal virome in pigs from a typical commercial swine farm in the Unite...

  15. Simple, reliable, and nondestructive method for the measurement of vacuum pressure without specialized equipment.

    PubMed

    Yuan, Jin-Peng; Ji, Zhong-Hua; Zhao, Yan-Ting; Chang, Xue-Fang; Xiao, Lian-Tuan; Jia, Suo-Tang

    2013-09-01

    We present a simple, reliable, and nondestructive method for the measurement of vacuum pressure in a magneto-optical trap. The vacuum pressure is verified to be proportional to the collision rate constant between cold atoms and the background gas with a coefficient k, which can be calculated by means of the simple ideal gas law. The rate constant for loss due to collisions with all background gases can be derived from the total collision loss rate by a series of loading curves of cold atoms under different trapping laser intensities. The presented method is also applicable for other cold atomic systems and meets the miniaturization requirement of commercial applications.

  16. Comparison of double-locus sequence typing (DLST) and multilocus sequence typing (MLST) for the investigation of Pseudomonas aeruginosa populations.

    PubMed

    Cholley, Pascal; Stojanov, Milos; Hocquet, Didier; Thouverez, Michelle; Bertrand, Xavier; Blanc, Dominique S

    2015-08-01

    Reliable molecular typing methods are necessary to investigate the epidemiology of bacterial pathogens. Reference methods such as multilocus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE) are costly and time consuming. Here, we compared our newly developed double-locus sequence typing (DLST) method for Pseudomonas aeruginosa to MLST and PFGE on a collection of 281 isolates. DLST was as discriminatory as MLST and was able to recognize "high-risk" epidemic clones. Both methods were highly congruent. Not surprisingly, a higher discriminatory power was observed with PFGE. In conclusion, being a simple method (single-strand sequencing of only 2 loci), DLST is valuable as a first-line typing tool for epidemiological investigations of P. aeruginosa. Coupled to a more discriminant method like PFGE or whole genome sequencing, it might represent an efficient typing strategy to investigate or prevent outbreaks. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. Novel numerical and graphical representation of DNA sequences and proteins.

    PubMed

    Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D

    2006-12-01

    We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.

  18. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    USDA-ARS?s Scientific Manuscript database

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  19. Corruption of genomic databases with anomalous sequence.

    PubMed

    Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

    1992-06-11

    We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%.

  20. Genetic mapping of ascochyta blight resistance in chickpea (Cicer arietinum L.) using a simple sequence repeat linkage map.

    PubMed

    Tar'an, B; Warkentin, T D; Tullu, A; Vandenberg, A

    2007-01-01

    Ascochyta blight, caused by the fungus Ascochyta rabiei (Pass.) Lab., is one of the most devastating diseases of chickpea (Cicer arietinum L.) worldwide. Research was conducted to map genetic factors for resistance to ascochyta blight using a linkage map constructed with 144 simple sequence repeat markers and 1 morphological marker (fc, flower colour). Stem cutting was used to vegetatively propagate 186 F2 plants derived from a cross between Cicer arietinum L. 'ICCV96029' and 'CDC Frontier'. A total of 556 cutting-derived plants were evaluated for their reaction to ascochyta blight under controlled conditions. Disease reaction of the F1 and F2 plants demonstrated that the resistance was dominantly inherited. A Fain's test based on the means and variances of the ascochyta blight reaction of the F3 families showed that a few genes were segregating in the population. Composite interval mapping identified 3 genomic regions that were associated with the reaction to ascochyta blight. One quantitative trait locus (QTL) on each of LG3, LG4, and LG6 accounted for 13%, 29%, and 12%, respectively, of the total estimated phenotypic variation for the reaction to ascochyta blight. Together, these loci controlled 56% of the total estimated phenotypic variation. The QTL on LG4 and LG6 were in common with the previously reported QTL for ascochyta blight resistance, whereas the QTL on LG3 was unique to the current population.

  1. Genetic diversity of the Andean tuber-bearing species, oca (Oxalis tuberosa Mol.), investigated by inter-simple sequence repeats.

    PubMed

    Pissard, A; Ghislain, M; Bertin, P

    2006-01-01

    The Andean tuber-bearing species, Oxalis tuberosa Mol., is a vegetatively propagated crop cultivated in the uplands of the Andes. Its genetic diversity was investigated in the present study using the inter-simple sequence repeat (ISSR) technique. Thirty-two accessions originating from South America (Argentina, Bolivia, Chile, and Peru) and maintained in vitro were chosen to represent the ecogeographic diversity of its cultivation area. Twenty-two primers were tested and 9 were selected according to fingerprinting quality and reproducibility. Genetic diversity analysis was performed with 90 markers. Jaccard's genetic distance between accessions ranged from 0 to 0.49 with an average of 0.28 +/- 0.08 (mean +/- SD). Dendrogram (UPGMA (unweighted pair-group method with arithmetic averaging)) and factorial correspondence analysis (FCA) showed that the genetic structure was influenced by the collection site. The two most distant clusters contained all of the Peruvian accessions, one from Bolivia, none from Argentina or Chile. Analysis by country revealed that Peru presented the greatest genetic distances from the other countries and possessed the highest intra-country genetic distance (0.30 +/- 0.08). This suggests that the Peruvian oca accessions form a distinct genetic group. The relatively low level of genetic diversity in the oca species may be related to its predominating reproduction strategy, i.e., vegetative propagation. The extent and structure of the genetic diversity of the species detailed here should help the establishment of conservation strategies.
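
    Jaccard's distance, used above to compare accessions, can be computed directly from binary band profiles (1 = band present, 0 = absent). The sketch below assumes the standard definition, one minus the ratio of shared bands to bands present in either profile; the band scores are hypothetical and are not data from the study.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two binary band profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of markers")
    # Bands present in both accessions.
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    # Bands present in at least one accession; joint absences are ignored.
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # no bands in either profile: treat as identical
    return 1.0 - shared / either


# Hypothetical ISSR band scores for three accessions at six markers.
acc_a = [1, 1, 0, 1, 0, 1]
acc_b = [1, 0, 0, 1, 1, 1]
acc_c = [1, 1, 0, 1, 0, 1]

d_ab = jaccard_distance(acc_a, acc_b)
d_ac = jaccard_distance(acc_a, acc_c)
```

    A distance of 0, the lower end of the reported range, corresponds to accessions with identical band profiles, as with the first and third profiles here.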

  2. Background Selection in Partially Selfing Populations

    PubMed Central

    Roze, Denis

    2016-01-01

    Self-fertilizing species often present lower levels of neutral polymorphism than their outcrossing relatives. Indeed, selfing automatically increases the rate of coalescence per generation, but also enhances the effects of background selection and genetic hitchhiking by reducing the efficiency of recombination. Approximations for the effect of background selection in partially selfing populations have been derived previously, assuming tight linkage between deleterious alleles and neutral loci. However, loosely linked deleterious mutations may have important effects on neutral diversity in highly selfing populations. In this article, I use a general method based on multilocus population genetics theory to express the effect of a deleterious allele on diversity at a linked neutral locus in terms of moments of genetic associations between loci. Expressions for these genetic moments at equilibrium are then computed for arbitrary rates of selfing and recombination. An extrapolation of the results to the case where deleterious alleles segregate at multiple loci is checked using individual-based simulations. At high selfing rates, the tight linkage approximation underestimates the effect of background selection in genomes with moderate to high map length; however, another simple approximation can be obtained for this situation and provides accurate predictions as long as the deleterious mutation rate is not too high. PMID:27075726

  3. Simple diazonium chemistry to develop specific gene sensing platforms.

    PubMed

    Revenga-Parra, M; García-Mendiola, T; González-Costas, J; González-Romero, E; Marín, A García; Pau, J L; Pariente, F; Lorenzo, E

    2014-02-27

    A simple strategy for covalently immobilizing DNA sequences, based on the formation of stable diazonized conducting platforms, is described. The electrochemical reduction of 4-nitrobenzenediazonium salt onto screen-printed carbon electrodes (SPCE) in aqueous media gives rise to terminal grafted amino groups. The presence of primary aromatic amines allows the formation of diazonium cations capable of reacting with the amines present in the DNA capture probe. As a comparison, a second strategy based on the binding of aminated DNA capture probes to the developed diazonized conducting platforms through a crosslinking agent was also employed. The resulting DNA sensing platforms were characterized by cyclic voltammetry, electrochemical impedance spectroscopy and spectroscopic ellipsometry. The hybridization event with the complementary sequence was detected using hexaammineruthenium (III) chloride as electrochemical indicator. Finally, they were applied to the analysis of a 145-bp sequence from the human gene MRP3, reaching a detection limit of 210 pg μL(-1). Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Software for pre-processing Illumina next-generation sequencing short read sequences

    PubMed Central

    2014-01-01

    Background: When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods: We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results: Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacteria genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference

  5. Ultra-low background DNA cloning system.

    PubMed

    Goto, Kenta; Nagano, Yukio

    2013-01-01

    Yeast-based in vivo cloning is useful for cloning DNA fragments into plasmid vectors and is based on the ability of yeast to recombine the DNA fragments by homologous recombination. Although this method is efficient, it produces some by-products. We have developed an "ultra-low background DNA cloning system" on the basis of yeast-based in vivo cloning, by almost completely eliminating the generation of by-products and applying the method to commonly used Escherichia coli vectors, particularly those lacking yeast replication origins and carrying an ampicillin resistance gene (Amp(r)). First, we constructed a conversion cassette containing the DNA sequences in the following order: an Amp(r) 5' UTR (untranslated region) and coding region, an autonomous replication sequence and a centromere sequence from yeast, a TRP1 yeast selectable marker, and an Amp(r) 3' UTR. This cassette allowed conversion of the Amp(r)-containing vector into the yeast/E. coli shuttle vector through use of the Amp(r) sequence by homologous recombination. Furthermore, simultaneous transformation of the desired DNA fragment into yeast allowed cloning of this DNA fragment into the same vector. We rescued the plasmid vectors from all yeast transformants, and by-products containing the E. coli replication origin disappeared. Next, the rescued vectors were transformed into E. coli and the by-products containing the yeast replication origin disappeared. Thus, our method used yeast- and E. coli-specific "origins of replication" to eliminate the generation of by-products. Finally, we successfully cloned the DNA fragment into the vector with almost 100% efficiency.

  6. Optical Processing Techniques For Pseudorandom Sequence Prediction

    NASA Astrophysics Data System (ADS)

    Gustafson, Steven C.

    1983-11-01

    Pseudorandom sequences are series of apparently random numbers generated, for example, by linear or nonlinear feedback shift registers. An important application of these sequences is in spread spectrum communication systems, in which, for example, the transmitted carrier phase is digitally modulated rapidly and pseudorandomly and in which the information to be transmitted is incorporated as a slow modulation in the pseudorandom sequence. In this case the transmitted information can be extracted only by a receiver that uses for demodulation the same pseudorandom sequence used by the transmitter, and thus this type of communication system has a very high immunity to third-party interference. However, if a third party can predict in real time the probable future course of the transmitted pseudorandom sequence given past samples of this sequence, then interference immunity can be significantly reduced. In this application effective pseudorandom sequence prediction techniques should be (1) applicable in real time to rapid (e.g., megahertz) sequence generation rates, (2) applicable to both linear and nonlinear pseudorandom sequence generation processes, and (3) applicable to error-prone past sequence samples of limited number and continuity. Certain optical processing techniques that may meet these requirements are discussed in this paper. In particular, techniques based on incoherent optical processors that perform general linear transforms or (more specifically) matrix-vector multiplications are considered. Computer simulation examples are presented which indicate that significant prediction accuracy can be obtained using these transforms for simple pseudorandom sequences. However, the useful prediction of more complex pseudorandom sequences will probably require the application of more sophisticated optical processing techniques.
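
    As a purely digital illustration of why a simple pseudorandom sequence is predictable from past samples, the sketch below generates bits from a 4-stage linear feedback shift register and then predicts each bit from the preceding ones. It does not model the optical processors discussed in the paper; the recurrence s[n] = s[n-1] XOR s[n-4], the seed, and the function names are assumptions made for this example.

```python
# Illustrative 4-stage linear feedback shift register (LFSR) over GF(2).
# Assumed recurrence (not from the record): s[n] = s[n-1] XOR s[n-4],
# which corresponds to the primitive polynomial x^4 + x^3 + 1.

def lfsr_bits(seed, taps, count):
    """Return `count` output bits.

    seed: initial bits, oldest first.
    taps: lags k such that the next bit is the XOR of s[n-k] over all k.
    """
    state = list(seed)
    out = []
    for _ in range(count):
        new_bit = 0
        for k in taps:
            new_bit ^= state[-k]
        out.append(state[0])           # emit the oldest bit
        state = state[1:] + [new_bit]  # shift and append the feedback bit
    return out


def predict_next(history, taps):
    """Predict the next bit from past samples, assuming the taps are known."""
    bit = 0
    for k in taps:
        bit ^= history[-k]
    return bit


taps = (1, 4)
bits = lfsr_bits([1, 0, 0, 0], taps, 30)

# Every bit after the first four is predicted correctly from its past.
hits = sum(predict_next(bits[:n], taps) == bits[n] for n in range(4, 30))
print(bits[:15], hits)
```

    The sequence repeats with period 15, and an observer who knows (or has inferred) the feedback taps predicts every later bit exactly. Nonlinear generators and error-prone samples, which the abstract also mentions, are not covered by this sketch.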

  7. Shielding concepts for low-background proportional counter arrays in surface laboratories

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aalseth, Craig E.; Humble, Paul H.; Mace, Emily K.

    2016-02-01

    Development of ultra low background gas proportional counters has made the contribution from naturally occurring radioactive isotopes – primarily activity in the uranium and thorium decay chains – inconsequential to instrumental sensitivity levels when measurements are performed in above ground surface laboratories. Simple lead shielding is enough to mitigate against gamma rays as gas proportional counters are already relatively insensitive to naturally occurring gamma radiation. The dominant background in these surface laboratory measurements using ultra low background gas proportional counters is due to cosmic ray generated muons, neutrons, and protons. Studies of measurements with ultra low background gas proportional counters in surface and underground laboratories as well as radiation transport Monte Carlo simulations suggest a preferred conceptual design to achieve the highest possible sensitivity from an array of low background gas proportional counters when operated in a surface laboratory. The basis for a low background gas proportional counter array and the preferred shielding configuration is reported, especially in relation to measurements of radioactive gases having low energy decays such as 37Ar.

  8. Label-Free Fluorescent DNA Dendrimers for microRNA Detection Based On Nonlinear Hybridization Chain Reaction-Mediated Multiple G-Quadruplex with Low Background Signal.

    PubMed

    Xue, Qingwang; Liu, Chunxue; Li, Xia; Dai, Li; Wang, Huaisheng

    2018-04-18

    Various fluorescent sensing systems for miRNA detection have been developed, but they mostly contain enzymatic amplification reactions and label procedures. The strict reaction conditions of tool enzymes and the high cost of labeling limit their potential applications, especially in complex biological matrices. Here, we have addressed the difficult problems and report a strategy for label-free fluorescent DNA dendrimers based on enzyme-free nonlinear hybridization chain reaction (HCR)-mediated multiple G-quadruplex for simple, sensitive, and selective detection of miRNAs with low-background signal. In the strategy, a split G-quadruplex (3:1) sequence is ingeniously designed at both ends of two double-stranded DNAs, which is exploited as building blocks for nonlinear HCR assembly, thereby acquiring a low background signal. A hairpin switch probe (HSP) was employed as recognition and transduction element. Upon sensing the target miRNA, the nonlinear HCR assembly of two blocks (blocks-A and blocks-B) was initiated with the help of two single-stranded DNA assistants, resulting in chain-branching growth of DNA dendrimers with multiple G-quadruplex incorporation. With the zinc(II)-protoporphyrin IX (ZnPPIX) selectively intercalated into the multiple G-quadruplexes, fluorescent DNA dendrimers were obtained, leading to an exponential fluorescence intensity increase. Benefiting from excellent performances of nonlinear HCR and low background signal, this strategy possesses the characteristics of a simplified reaction operation process, as well as high sensitivity. Moreover, the proposed fluorescent sensing strategy also shows preferable selectivity, and can be implemented without modified DNA blocks. Importantly, the strategy has also been tested for miRNA quantification with high confidence in breast cancer cells. 
    Thus, this proposed strategy for label-free fluorescent DNA dendrimers based on a nonlinear HCR-mediated multiple G-quadruplex will be turned into an alternative

  9. BASIC: A Simple and Accurate Modular DNA Assembly Method.

    PubMed

    Storch, Marko; Casini, Arturo; Mackrow, Ben; Ellis, Tom; Baldwin, Geoff S

    2017-01-01

    Biopart Assembly Standard for Idempotent Cloning (BASIC) is a simple, accurate, and robust DNA assembly method. The method is based on linker-mediated DNA assembly and provides highly accurate DNA assembly with 99 % correct assemblies for four parts and 90 % correct assemblies for seven parts [1]. The BASIC standard defines a single entry vector for all parts flanked by the same prefix and suffix sequences and its idempotent nature means that the assembled construct is returned in the same format. Once a part has been adapted into the BASIC format it can be placed at any position within a BASIC assembly without the need for reformatting. This allows laboratories to grow comprehensive and universal part libraries and to share them efficiently. The modularity within the BASIC framework is further extended by the possibility of encoding ribosomal binding sites (RBS) and peptide linker sequences directly on the linkers used for assembly. This makes BASIC a highly versatile library construction method for combinatorial part assembly including the construction of promoter, RBS, gene variant, and protein-tag libraries. In comparison with other DNA assembly standards and methods, BASIC offers a simple robust protocol; it relies on a single entry vector, provides for easy hierarchical assembly, and is highly accurate for up to seven parts per assembly round [2].

  10. Background instrumental music and serial recall.

    PubMed

    Nittono, H

    1997-06-01

    Although speech and vocal music are consistently shown to impair serial recall for visually presented items, instrumental music does not always produce a significant disruption. This study investigated the features of instrumental music that would modulate the disruption in serial recall. 24 students were presented sequences of nine digits and required to recall the digits in order of presentation. Instrumental music was played either forward or backward during the task. Forward music caused significantly more disruption than did silence, whereas the reversed music did not. Some higher-order factor may be at work in the effect of background music on serial recall.

  11. A Simple Symmetric Algorithm Using a Likeness with Introns Behavior in RNA Sequences

    NASA Astrophysics Data System (ADS)

    Regoli, Massimo

    2009-02-01

    The RNA-Crypto System (RCS for short) is a symmetric-key algorithm for enciphering data. The idea for this new algorithm comes from the observation of nature, in particular of RNA behavior and some of its properties. RNA sequences have some sections called introns. Introns, derived from the term "intragenic regions", are non-coding sections of precursor mRNA (pre-mRNA) or other RNAs that are removed (spliced out of the RNA) before the mature RNA is formed. Once the introns have been spliced out of a pre-mRNA, the resulting mRNA sequence is ready to be translated into a protein. The corresponding parts of a gene are known as introns as well. The nature and role of introns in the pre-mRNA are not clear and are the subject of extensive research by biologists, but in our case we use the presence of introns in the RNA-Crypto System output as a strong method to add chaotic non-coding information and unnecessary behaviour in the access to the secret key used to code the messages. In the RNA-Crypto System algorithm the introns are sections of the ciphered message with non-coding information, as in the precursor mRNA.

  12. Morphometric analysis of a fresh simple crater on the Moon.

    NASA Astrophysics Data System (ADS)

    Vivaldi, V.; Ninfo, A.; Massironi, M.; Martellato, E.; Cremonese, G.

    In this research we propose an innovative method to determine and quantify the morphology of a simple fresh impact crater. Linné is a well-preserved impact crater 2.2 km in diameter, located at 27.7°N 11.8°E, near the western edge of Mare Serenitatis on the Moon. The crater was photographed by the Lunar Orbiter and the Apollo space missions. Its particular morphology may make Linné the most striking example of a small fresh simple crater. Morphometric analysis, conducted on a recent high-resolution DTM from LROC (NASA), quantitatively confirmed the pristine morphology of the crater, revealing a clear inner layering which highlights a sequence of lava emplacement events.

  13. A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

    USDA-ARS's Scientific Manuscript database

    Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...

  14. Very low luminosity active galaxies and the X-ray background

    NASA Technical Reports Server (NTRS)

    Elvis, M.; Soltan, A.; Keel, W. C.

    1984-01-01

    The properties of very low luminosity active galactic nuclei are not well studied, and, in particular, their possible contribution to the diffuse X-ray background is not known. In the present investigation, an X-ray luminosity function for the range from 10 to the 39th to 10 to the 42.5th ergs/s is constructed. The obtained X-ray luminosity function is integrated to estimate the contribution of these very low luminosity active galaxies to the diffuse X-ray background. The construction of the X-ray luminosity function is based on data obtained by Keel (1983) and some simple assumptions about optical and X-ray properties.

  15. Novel DNA probes with low background and high hybridization-triggered fluorescence.

    PubMed

    Lukhtanov, Eugeny A; Lokhov, Sergey G; Gorn, Vladimir V; Podyminogin, Mikhail A; Mahoney, Walt

    2007-01-01

    Novel fluorogenic DNA probes are described. The probes (called Pleiades) have a minor groove binder (MGB) and a fluorophore at the 5'-end and a non-fluorescent quencher at the 3'-end of the DNA sequence. This configuration provides surprisingly low background and high hybridization-triggered fluorescence. Here, we comparatively study the performance of such probes, MGB-Eclipse probes, and molecular beacons. Unlike the other two probe formats, the Pleiades probes have low, temperature-independent background fluorescence and excellent signal-to-background ratios. The probes possess good mismatch discrimination ability and high rates of hybridization. Based on the analysis of fluorescence and absorption spectra we propose a mechanism of action for the Pleiades probes. First, hydrophobic interactions between the quencher and the MGB bring the ends of the probe and, therefore, the fluorophore and the quencher in close proximity. Second, the MGB interacts with the fluorophore and independent of the quencher is able to provide a modest (2-4-fold) quenching effect. Joint action of the MGB and the quencher is the basis for the unique quenching mechanism. The fluorescence is efficiently restored upon binding of the probe to target sequence due to a disruption in the MGB-quencher interaction and concealment of the MGB moiety inside the minor groove.

  16. Novel DNA probes with low background and high hybridization-triggered fluorescence

    PubMed Central

    Lukhtanov, Eugeny A.; Lokhov, Sergey G.; Gorn, Vladimir V.; Podyminogin, Mikhail A.; Mahoney, Walt

    2007-01-01

    Novel fluorogenic DNA probes are described. The probes (called Pleiades) have a minor groove binder (MGB) and a fluorophore at the 5′-end and a non-fluorescent quencher at the 3′-end of the DNA sequence. This configuration provides surprisingly low background and high hybridization-triggered fluorescence. Here, we comparatively study the performance of such probes, MGB-Eclipse probes, and molecular beacons. Unlike the other two probe formats, the Pleiades probes have low, temperature-independent background fluorescence and excellent signal-to-background ratios. The probes possess good mismatch discrimination ability and high rates of hybridization. Based on the analysis of fluorescence and absorption spectra we propose a mechanism of action for the Pleiades probes. First, hydrophobic interactions between the quencher and the MGB bring the ends of the probe and, therefore, the fluorophore and the quencher in close proximity. Second, the MGB interacts with the fluorophore and independent of the quencher is able to provide a modest (2–4-fold) quenching effect. Joint action of the MGB and the quencher is the basis for the unique quenching mechanism. The fluorescence is efficiently restored upon binding of the probe to target sequence due to a disruption in the MGB–quencher interaction and concealment of the MGB moiety inside the minor groove. PMID:17259212

  17. Detection of Bacillus anthracis DNA in Complex Soil and Air Samples Using Next-Generation Sequencing

    PubMed Central

    Be, Nicholas A.; Thissen, James B.; Gardner, Shea N.; McLoughlin, Kevin S.; Fofanov, Viacheslav Y.; Koshinsky, Heather; Ellingson, Sally R.; Brettin, Thomas S.; Jackson, Paul J.; Jaing, Crystal J.

    2013-01-01

    Bacillus anthracis is the potentially lethal etiologic agent of anthrax disease, and is a significant concern in the realm of biodefense. One of the cornerstones of an effective biodefense strategy is the ability to detect infectious agents with a high degree of sensitivity and specificity in the context of a complex sample background. The nature of the B. anthracis genome, however, renders specific detection difficult, due to close homology with B. cereus and B. thuringiensis. We therefore elected to determine the efficacy of next-generation sequencing analysis and microarrays for detection of B. anthracis in an environmental background. We applied next-generation sequencing to titrated genome copy numbers of B. anthracis in the presence of background nucleic acid extracted from aerosol and soil samples. We found next-generation sequencing to be capable of detecting as few as 10 genomic equivalents of B. anthracis DNA per nanogram of background nucleic acid. Detection was accomplished by mapping reads to either a defined subset of reference genomes or to the full GenBank database. Moreover, sequence data obtained from B. anthracis could be reliably distinguished from sequence data mapping to either B. cereus or B. thuringiensis. We also demonstrated the efficacy of a microbial census microarray in detecting B. anthracis in the same samples, representing a cost-effective and high-throughput approach, complementary to next-generation sequencing. Our results, in combination with the capacity of sequencing for providing insights into the genomic characteristics of complex and novel organisms, suggest that these platforms should be considered important components of a biosurveillance strategy. PMID:24039948

  18. A SIMPLE RADIO-CHROMATOGRAM SCANNER

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McWeeny, D.J.; Burton, H.S.

    1962-07-01

    A sturdy, simple, and reliable radiochromatogram scanner is described. It is constructed from a Panax Universal Castle, a Panax 5054 rate meter, and a recording milliammeter. The castle houses 2 thin end-window G-M tubes, type GE- EHM-2, mounted one above the other with their windows 1/4 in. apart. The 1-in. chromatogram passes continuously through a selection of slits permitting a choice of views by the G-M tubes. The background count is 10.5 counts per minute and the detection limit for 35S as a 3 mm spot on Whatman no. 1 paper is less than 0.2 nc. (T.R.H.)

  19. How Does Sequence Structure Affect the Judgment of Time? Exploring a Weighted Sum of Segments Model

    ERIC Educational Resources Information Center

    Matthews, William J.

    2013-01-01

    This paper examines the judgment of segmented temporal intervals, using short tone sequences as a convenient test case. In four experiments, we investigate how the relative lengths, arrangement, and pitches of the tones in a sequence affect judgments of sequence duration, and ask whether the data can be described by a simple weighted sum of…

  20. The cosmic gamma-ray background from Type Ia supernovae

    NASA Technical Reports Server (NTRS)

    The, Lih-Sin; Leising, Mark D.; Clayton, Donald D.

    1993-01-01

    We present an improved calculation of the cumulative gamma-ray spectrum of Type Ia supernovae during the history of the universe. We follow Clayton & Ward (1975) in using a few Friedmann models and two simple histories of the average galaxian nucleosynthesis rate, but we improve their calculation by modeling the gamma-ray scattering in detailed numerical models of SN Ia's. The results confirm that near 1 MeV the SN Ia background may dominate, and that it is potentially observable, with high scientific importance. A very accurate measurement of the cosmic background spectrum between 0.1 and 1.0 MeV may reveal the turn-on time and the evolution of the rate of Type Ia supernova nucleosynthesis in the universe.

  1. Simple Sequence Repeat and S-locus Genotyping to Explore Genetic Variability in Polyploid Prunus spinosa and P. insititia.

    PubMed

    Halász, Júlia; Makovics-Zsohár, Noémi; Szőke, Ferenc; Ercisli, Sezai; Hegedűs, Attila

    2017-02-01

    Polyploid Prunus spinosa (2n = 4×) and P. insititia (2n = 6×) represent enormous genetic potential in Central Europe, which can be exploited in breeding programmes. In Hungary, 17 cultivar candidates were selected from wild-growing populations including 10 P. spinosa, 4 P. insititia and three P. spinosa × P. domestica hybrids (2n = 5×). Their taxonomic classification was based on their phenotypic characteristics. Six simple sequence repeats (SSRs) and the multiallelic S-locus genotyping were used to characterize genetic variability and reliable identification of the tested accessions. A total of 98 SSR alleles were identified, which presents 19.5 average allele number per locus, and each of the 17 genotypes could be discriminated based on unique SSR fingerprints. A total of 23 S-RNase alleles were identified. The complete and partial S-genotype was determined for 8 and 9 accessions, respectively. The identification of a cross-incompatible pair of cultivar candidates and several semi-compatible combinations help maximize fruit set in commercial orchards. Our results indicate that the S-allele pools of wild-growing P. spinosa and P. insititia are overlapping in Hungary. A phylogenetic and principal component analysis confirmed the high level of diversity and genetic differentiation present within the analysed genotypes and helped clarify doubtful taxonomic identities. Our data confirm that S-locus genotyping is suitable for diversity studies in polyploid Prunus species. The analysed accessions represent huge genetic potential that can be exploited in commercial cultivation.

  2. Morphological and Inter Simple Sequence Repeat (ISSR) markers analyses of Corynespora cassiicola isolates from rubber plantations in Malaysia.

    PubMed

    Nghia, Nguyen Anh; Kadir, Jugah; Sunderasan, E; Puad Abdullah, Mohd; Malik, Adam; Napis, Suhaimi

    2008-10-01

    Morphological features and Inter Simple Sequence Repeat (ISSR) polymorphism were employed to analyse 21 Corynespora cassiicola isolates obtained from a number of Hevea clones grown in rubber plantations in Malaysia. The C. cassiicola isolates used in this study were collected from several states in Malaysia from 1998 to 2005. The morphology of the isolates was characteristic of that previously described for C. cassiicola. Variations in colony and conidial morphology were observed not only among isolates but also within a single isolate with no inclination to either clonal or geographical origin of the isolates. ISSR analysis delineated the isolates into two distinct clusters. The dendrogram created from UPGMA analysis based on Nei and Li's coefficient, calculated from the binary matrix data of 106 amplified DNA bands generated from 8 ISSR primers, showed that cluster 1 encompasses 12 isolates from the states of Johor and Selangor and is further split into 2 sub clusters (1A and 1B), with sub cluster 1B consisting of a unique isolate, CKT05D, while cluster 2 comprises 9 isolates that were obtained from the other states. Detached leaf assay performed on selected Hevea clones showed that the pathogenicity of representative isolates from cluster 1 (with the exception of CKT05D) resembled that of race 1; and isolates in cluster 2 showed pathogenicity similar to race 2 of the fungus that was previously identified in Malaysia. The isolate CKT05D from sub cluster 1B showed pathogenicity dissimilar to either race 1 or race 2.
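
    The clustering step named in this record (UPGMA on Nei and Li's coefficient computed from a binary band matrix) can be sketched in a few lines. The band profiles and isolate names below are invented for illustration and are not data from the study; the helper names are hypothetical, and the distance is taken as 1 minus the similarity, which is a common but here assumed choice.

```python
# Illustrative Nei and Li (Dice) similarity plus a small UPGMA clustering.
# All profiles and names are made up; 1 = band present, 0 = band absent.

def nei_li_similarity(a, b):
    """2 * shared bands / (bands in a + bands in b)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 2 * shared / total if total else 0.0


def upgma(names, profiles):
    """Cluster by UPGMA on distance 1 - similarity; return a nested tuple."""
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (tree, size)
    dist = {}
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            dist[(i, j)] = 1.0 - nei_li_similarity(profiles[i], profiles[j])
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        del dist[(i, j)]
        for k in list(clusters):
            d_ik = dist.pop((min(i, k), max(i, k)))
            d_jk = dist.pop((min(j, k), max(j, k)))
            # Size-weighted average of the distances to the merged members.
            dist[(k, next_id)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


names = ["A", "B", "C", "D"]
profiles = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
]
tree = upgma(names, profiles)
print(tree)
```

    With these toy profiles the isolates fall into two groups, (A, B) and (C, D), analogous in form to the two-cluster dendrogram the abstract describes.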

  3. GABI-Kat SimpleSearch: new features of the Arabidopsis thaliana T-DNA mutant database.

    PubMed

    Kleinboelting, Nils; Huep, Gunnar; Kloetgen, Andreas; Viehoever, Prisca; Weisshaar, Bernd

    2012-01-01

    T-DNA insertion mutants are very valuable for reverse genetics in Arabidopsis thaliana. Several projects have generated large sequence-indexed collections of T-DNA insertion lines, of which GABI-Kat is the second largest resource worldwide. User access to the collection and its Flanking Sequence Tags (FSTs) is provided by the front end SimpleSearch (http://www.GABI-Kat.de). Several significant improvements have been implemented recently. The database now relies on the TAIRv10 genome sequence and annotation dataset. All FSTs have been newly mapped using an optimized procedure that leads to improved accuracy of insertion site predictions. A fraction of the collection with weak FST yield was re-analysed by generating new FSTs. Along with newly found predictions for older sequences about 20,000 new FSTs were included in the database. Information about groups of FSTs pointing to the same insertion site that is found in several lines but is real only in a single line are included, and many problematic FST-to-line links have been corrected using new wet-lab data. SimpleSearch currently contains data from ~71,000 lines with predicted insertions covering 62.5% of the 27,206 nuclear protein coding genes, and offers insertion allele-specific data from 9545 confirmed lines that are available from the Nottingham Arabidopsis Stock Centre.

  4. Non-Abelian Gauge Theory in the Lorentz Violating Background

    NASA Astrophysics Data System (ADS)

    Ganai, Prince A.; Shah, Mushtaq B.; Syed, Masood; Ahmad, Owais

    2018-03-01

    In this paper, we will discuss a simple non-Abelian gauge theory in the broken Lorentz spacetime background. We will study the partial breaking of Lorentz symmetry down to its sub-group. We will use the formalism of very special relativity for analysing this non-Abelian gauge theory. Moreover, we will discuss the quantisation of this theory using the BRST symmetry. Also, we will analyse this theory in the maximal Abelian gauge.

  5. Why are background telephone conversations distracting?

    PubMed

    Marsh, John E; Ljung, Robert; Jahncke, Helena; MacCutcheon, Douglas; Pausch, Florian; Ball, Linden J; Vachon, François

    2018-06-01

    Telephone conversation is ubiquitous within the office setting. Overhearing a telephone conversation, whereby only one of the two speakers is heard, is subjectively more annoying and objectively more distracting than overhearing a full conversation. The present study sought to determine whether this "halfalogue" effect is attributable to unexpected offsets and onsets within the background speech (acoustic unexpectedness) or to the tendency to predict the unheard part of the conversation (semantic [un]predictability), and whether these effects can be shielded against through top-down cognitive control. In Experiment 1, participants performed an office-related task in quiet or in the presence of halfalogue and dialogue background speech. Irrelevant speech was either meaningful or meaningless speech. The halfalogue effect was only present for the meaningful speech condition. Experiment 2 addressed whether higher task-engagement could shield against the halfalogue effect by manipulating the font of the to-be-read material. Although the halfalogue effect was found with an easy-to-read font (fluent text), the use of a difficult-to-read font (disfluent text) eliminated the effect. The halfalogue effect is thus attributable to the semantic (un)predictability, not the acoustic unexpectedness, of background telephone conversation and can be prevented by simple means such as increasing the level of engagement required by the focal task. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  6. Enhanced sequencing coverage with digital droplet multiple displacement amplification

    PubMed Central

    Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.

    2016-01-01

    Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978

  7. Development of genomic resources for the narrow-leafed lupin (Lupinus angustifolius): construction of a bacterial artificial chromosome (BAC) library and BAC-end sequencing

    PubMed Central

    2011-01-01

    Background: Lupinus angustifolius L., also known as narrow-leafed lupin (NLL), is becoming an important grain legume crop that is valuable for sustainable farming and is becoming recognised as a potential human health food. Recent interest is being directed at NLL to improve grain production, disease and pest management and health benefits of the grain. However, studies have been hindered by a lack of extensive genomic resources for the species. Results: A NLL BAC library was constructed consisting of 111,360 clones with an average insert size of 99.7 Kbp from cv Tanjil. The library has approximately 12× genome coverage. Both ends of 9600 randomly selected BAC clones were sequenced to generate 13985 BAC end-sequences (BESs), covering approximately 1% of the NLL genome. These BESs permitted a preliminary characterisation of the NLL genome such as organisation and composition, with the BESs having approximately 39% G:C content, 16.6% repetitive DNA and 5.4% putative gene-encoding regions. From the BESs 9966 simple sequence repeat (SSR) motifs were identified and some of these are shown to be potential markers. Conclusions: The NLL BAC library and BAC-end sequences are powerful resources for genetic and genomic research on lupin. These resources will provide a robust platform for future high-resolution mapping, map-based cloning, comparative genomics and assembly of whole-genome sequencing data for the species. PMID:22014081
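
    The library figures quoted in this record can be related to one another with simple arithmetic. The sketch below uses only the numbers stated in the abstract; the genome size and mean end-sequence length it derives are implied estimates under the stated "approximately 12×" and "approximately 1%" figures, not values reported by the authors, and the variable names are chosen for this example.

```python
# Back-of-the-envelope relations between the BAC library figures.
# Inputs are taken from the abstract; derived values are rough estimates.

clones = 111_360            # clones in the library
mean_insert_kbp = 99.7      # average insert size
stated_coverage = 12        # approximate genome coverage
clones_end_sequenced = 9600
bes_obtained = 13_985       # BAC end-sequences generated
bes_genome_fraction = 0.01  # BESs cover approximately 1% of the genome

# Total cloned DNA, in Gbp.
library_gbp = clones * mean_insert_kbp / 1e6

# Genome size implied by coverage = total cloned DNA / genome size, in Mbp.
implied_genome_mbp = clones * mean_insert_kbp / stated_coverage / 1e3

# Fraction of the attempted end reads (two per clone) that gave a BES.
bes_success = bes_obtained / (2 * clones_end_sequenced)

# Mean BES length implied by the 1% figure, in bp.
mean_bes_bp = bes_genome_fraction * implied_genome_mbp * 1e6 / bes_obtained

print(round(library_gbp, 2), round(implied_genome_mbp),
      round(100 * bes_success, 1), round(mean_bes_bp))
```

    Because both the coverage and the 1% figure are approximate, the derived genome size and read length should be read as order-of-magnitude consistency checks only.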

  8. Olfactory cortical adaptation facilitates detection of odors against background.

    PubMed

    Kadohisa, Mikiko; Wilson, Donald A

    2006-03-01

    Detection and discrimination of odors generally, if not always, occurs against an odorous background. On any given inhalation, olfactory receptor neurons will be activated by features of both the target odorant and features of background stimuli. To identify a target odorant against a background therefore, the olfactory system must be capable of grouping a subset of features into an odor object distinct from the background. Our previous work has suggested that rapid homosynaptic depression of afferents to the anterior piriform cortex (aPCX) contributes to both cortical odor adaptation to prolonged stimulation and habituation of simple odor-evoked behaviors. We hypothesize here that this process may also contribute to figure-ground separation of a target odorant from background stimulation. Single-unit recordings were made from both mitral/tufted cells and aPCX neurons in urethan-anesthetized rats and mice. Single-unit responses to odorant stimuli and their binary mixtures were determined. One of the odorants was randomly selected as the background and presented for 50 s. Forty seconds after the onset of the background stimulus, the second target odorant was presented, producing a binary mixture. The results suggest that mitral/tufted cells continue to respond to the background odorant and, when the target odorant is presented, had response magnitudes similar to that evoked by the binary mixture. In contrast, aPCX neurons filter out the background stimulus while maintaining responses to the target stimulus. Thus the aPCX acts as a filter driven most strongly by changing stimuli, providing a potential mechanism for olfactory figure-ground separation and selective reading of olfactory bulb output.

  9. Structure, Function, Self-Assembly and Origin of Simple Membrane Proteins

    NASA Technical Reports Server (NTRS)

    Pohorille, Andrew

    2003-01-01

    Integral membrane proteins perform such essential cellular functions as transport of ions, nutrients and waste products across cell walls, transduction of environmental signals, regulation of cell fusion, recognition of other cells, energy capture and its conversion into high-energy compounds. In fact, 30-40% of genes in modern organisms code for membrane proteins. Although contemporary membrane proteins or their functional assemblies can be quite complex, their transmembrane fragments are usually remarkably simple. The most common structural motif for these fragments is a bundle of alpha-helices, but occasionally it could be a beta-barrel. In a series of molecular dynamics computer simulations we investigated self-organizing properties of simple membrane proteins based on these structural motifs. Specifically, we studied folding and insertion into membranes of short, nonpolar or amphipathic peptides. We also investigated glycophorin A, a peptide that forms sequence-specific dimers, and a transmembrane aggregate of four identical alpha-helices that forms an efficient and selective voltage-gated proton channel. Many peptides are attracted to water-membrane interfaces. Once at the interface, nonpolar peptides spontaneously fold to alpha-helices. Whenever the sequence permits, peptides that contain both polar and nonpolar amino acids also adopt helical structures, in which polar and nonpolar amino acid side chains are immersed in water and membrane, respectively. Specific identity of side chains is less important. Helical peptides at the interface could insert into the membrane and adopt a transmembrane conformation. However, insertion of a single helix is unfavorable because polar groups in the peptide become completely dehydrated upon insertion. The unfavorable free energy of insertion can be regained by spontaneous association of peptides in the membrane. The first step in this process is the formation of dimers, although the most common are aggregates of 4

  10. An efficient and scalable graph modeling approach for capturing information at different levels in next generation sequencing reads

    PubMed Central

    2013-01-01

    Background Next generation sequencing technologies have greatly advanced many research areas of the biomedical sciences through their capability to generate massive amounts of genetic information at unprecedented rates. The advent of next generation sequencing has led to the development of numerous computational tools to analyze and assemble the millions to billions of short sequencing reads produced by these technologies. While these tools filled an important gap, current approaches for storing, processing, and analyzing short read datasets generally have remained simple and lack the complexity needed to efficiently model the produced reads and assemble them correctly. Results Previously, we presented an overlap graph coarsening scheme for modeling read overlap relationships on multiple levels. Most current read assembly and analysis approaches use a single graph or set of clusters to represent the relationships among a read dataset. Instead, we use a series of graphs to represent the reads and their overlap relationships across a spectrum of information granularity. At each information level our algorithm is capable of generating clusters of reads from the reduced graph, forming an integrated graph modeling and clustering approach for read analysis and assembly. Previously we applied our algorithm to simulated and real 454 datasets to assess its ability to efficiently model and cluster next generation sequencing data. In this paper we extend our algorithm to large simulated and real Illumina datasets to demonstrate that our algorithm is practical for both sequencing technologies. Conclusions Our overlap graph theoretic algorithm is able to model next generation sequencing reads at various levels of granularity through the process of graph coarsening. Additionally, our model allows for efficient representation of the read overlap relationships, is scalable for large datasets, and is practical for both Illumina and 454 sequencing technologies. PMID:24564333
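
    The idea of representing reads at several levels of granularity can be illustrated with a single coarsening step. The sketch below uses heavy-edge matching, a common graph-coarsening heuristic chosen here for illustration; it is not claimed to be the authors' scheme, and the function name `coarsen`, the read names and the overlap weights are all hypothetical.

```python
def coarsen(edges):
    """One coarsening level: greedily merge node pairs joined by the heaviest edges.

    `edges` maps frozenset({u, v}) -> overlap weight (no self-loops assumed).
    Returns (mapping, coarse_edges), where mapping sends each node to its
    super-node and coarse_edges is the reduced graph in the same format.
    """
    mapping = {}
    # Visit the heaviest overlaps first; each node is merged at most once per level.
    ordered = sorted(edges.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for pair, _weight in ordered:
        u, v = sorted(pair)
        if u not in mapping and v not in mapping:
            mapping[u] = mapping[v] = (u, v)
    # Nodes left unmatched become singleton super-nodes.
    for pair in edges:
        for node in pair:
            mapping.setdefault(node, (node,))
    # Rebuild edges between super-nodes, summing the weights of parallel edges.
    coarse_edges = {}
    for pair, weight in edges.items():
        u, v = tuple(pair)
        cu, cv = mapping[u], mapping[v]
        if cu != cv:
            key = frozenset((cu, cv))
            coarse_edges[key] = coarse_edges.get(key, 0) + weight
    return mapping, coarse_edges


# Hypothetical overlap graph: four reads in a chain, weights are overlap lengths.
overlaps = {
    frozenset(("r1", "r2")): 50,
    frozenset(("r2", "r3")): 20,
    frozenset(("r3", "r4")): 40,
}
mapping, level1 = coarsen(overlaps)
print(mapping)
print(level1)
```

    Applying `coarsen` repeatedly to its own output yields a series of progressively smaller graphs, and the super-nodes at each level can be read as clusters of reads.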

  11. Investigating on the Differences between Triggered and Background Seismicity in Italy and Southern California.

    NASA Astrophysics Data System (ADS)

    Stallone, A.; Marzocchi, W.

    2017-12-01

    Earthquake occurrence may be approximated by a multidimensional Poisson clustering process, where each point of the Poisson process is replaced by a cluster of points, the latter corresponding to the well-known aftershock sequence (triggered events). Earthquake clusters and their parents are assumed to occur according to a Poisson process at a constant temporal rate proportional to the tectonic strain rate, while events within a cluster are modeled as generations of dependent events reproduced by a branching process. Although the occurrence of such space-time clusters is a general feature in different tectonic settings, seismic sequences seem to have marked differences from region to region: one example, among many others, is that seismic sequences of moderate magnitude in the Italian Apennines seem to last longer than similar seismic sequences in California. In this work we investigate the existence of possible differences in the earthquake clustering process in these two areas. First, we separate the triggered and background components of seismicity in the Italian and Southern California seismic catalogs. Then we study the space-time domain of the triggered earthquakes with the aim of identifying possible variations in the triggering properties across the two regions. In the second part of the work we focus our attention on the characteristics of the background seismicity in both seismic catalogs. The assumption of time stationarity of the background seismicity (which includes both cluster parents and isolated events) is still under debate. Some authors suggest that the independent component of seismicity could undergo transient perturbations at various time scales due to different physical mechanisms, such as, for example, viscoelastic relaxation, presence of fluids, non-stationary plate motion, etc., whose impact may depend on the tectonic setting. Here we test if the background seismicity in the two regions can be satisfactorily described by the time

  12. GIGA: a simple, efficient algorithm for gene tree inference in the genomic age.

    PubMed

    Thomas, Paul D

    2010-06-09

    Phylogenetic relationships between genes are not only of theoretical interest: they enable us to learn about human genes through the experimental work on their relatives in numerous model organisms from bacteria to fruit flies and mice. Yet the most commonly used computational algorithms for reconstructing gene trees can be inaccurate for numerous reasons, both algorithmic and biological. Additional information beyond gene sequence data has been shown to improve the accuracy of reconstructions, though at great computational cost. We describe a simple, fast algorithm for inferring gene phylogenies, which makes use of information that was not available prior to the genomic age: namely, a reliable species tree spanning much of the tree of life, and knowledge of the complete complement of genes in a species' genome. The algorithm, called GIGA, constructs trees agglomeratively from a distance matrix representation of sequences, using simple rules to incorporate this genomic age information. GIGA makes use of a novel conceptualization of gene trees as being composed of orthologous subtrees (containing only speciation events), which are joined by other evolutionary events such as gene duplication or horizontal gene transfer. An important innovation in GIGA is that, at every step in the agglomeration process, the tree is interpreted/reinterpreted in terms of the evolutionary events that created it. Remarkably, GIGA performs well even when using a very simple distance metric (pairwise sequence differences) and no distance averaging over clades during the tree construction process. GIGA is efficient, allowing phylogenetic reconstruction of very large gene families and determination of orthologs on a large scale. It is exceptionally robust to adding more gene sequences, opening up the possibility of creating stable identifiers for referring to not only extant genes, but also their common ancestors. We compared trees produced by GIGA to those in the TreeFam database, and they

  13. Local alignment of two-base encoded DNA sequence

    PubMed Central

    Homer, Nils; Merriman, Barry; Nelson, Stanley F

    2009-01-01

    Background DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
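
    Why decoding and alignment benefit from being done together can be seen in a small example. The sketch below assumes the di-base colour code used by SOLiD-style sequencing, in which each colour is the XOR of the 2-bit codes of two adjacent bases; the abstract does not spell out the encoding, so this choice, the function names and the test sequence are assumptions made for illustration.

```python
# Assumed 2-bit base codes; each colour is the XOR of two adjacent base codes.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Return the first base plus one colour (0-3) per adjacent base pair."""
    colors = [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]
    return seq[0], colors


def decode(first_base, colors):
    """Rebuild the base sequence; every base depends on all colours before it."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


first, colors = encode("ATGGCA")
print(colors)                      # colour calls for the read
print(decode(first, colors))       # error-free decoding recovers the read

# A single mis-measured colour corrupts every base downstream of it.
bad = list(colors)
bad[1] = 2
print(decode(first, bad))
```

    Because one colour error propagates through the rest of a naively decoded read, decoding first and aligning afterwards discards information that a combined method can use to detect and correct the error.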

  14. GRAVITATIONAL WAVE BACKGROUND FROM BINARY MERGERS AND METALLICITY EVOLUTION OF GALAXIES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakazato, Ken’ichiro; Sago, Norichika; Niino, Yuu, E-mail: nakazato@artsci.kyushu-u.ac.jp

    The cosmological evolution of the binary black hole (BH) merger rate and the energy density of the gravitational wave (GW) background are investigated. To evaluate the redshift dependence of the BH formation rate, BHs are assumed to originate from low-metallicity stars, and the relations between the star formation rate, metallicity and stellar mass of galaxies are combined with the stellar mass function at each redshift. As a result, it is found that when the energy density of the GW background is scaled with the merger rate at the local universe, the scaling factor does not depend on the critical metallicity for the formation of BHs. Also taking into account the merger of binary neutron stars, a simple formula to express the energy spectrum of the GW background is constructed for the inspiral phase. The relation between the local merger rate and the energy density of the GW background will be examined by future GW observations.

  15. Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics

    PubMed Central

    2012-01-01

    Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence

  16. Sedimentary sequence evolution in a Foredeep basin: Eastern Venezuela

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bejarano, C.; Funes, D.; Sarzalho, S.

    1996-08-01

    Well log-seismic sequence stratigraphy analysis in the Eastern Venezuela Foreland Basin leads to study of the evolution of sedimentary sequences onto the Cretaceous-Paleocene passive margin. This basin comprises two different foredeep sub-basins: the older Guarico sub-basin to the west and the younger Maturin sub-basin to the east. A foredeep switching between these two sub-basins is observed at 12.5 m.y. Seismic interpretation and well log sections across the study area show sedimentary sequences with transgressive sands and coastal onlaps to the east-southeast for the Guarico sub-basin, as well as truncations below the switching sequence (12.5 m.y.), and the Maturin sub-basin shows apparent coastal onlaps to the west-northwest, as well as a marine onlap (deeper water) in the west, where it starts to establish. Sequence stratigraphy analysis of these sequences with well logs allowed the study of the evolution of stratigraphic section from Paleocene to middle Miocene (68.0-12.0 m.y.). On the basis of well log patterns, the sequences were divided into regressive-transgressive-regressive sedimentary cycles caused by changes in relative sea level. Facies distributions were analyzed and the sequences were divided into simple sequences or subsequences of a greater frequency than third order depositional sequences.

  17. Colloidal polymers with controlled sequence and branching constructed from magnetic field assembled nanoparticles.

    PubMed

    Bannwarth, Markus B; Utech, Stefanie; Ebert, Sandro; Weitz, David A; Crespy, Daniel; Landfester, Katharina

    2015-03-24

    The assembly of nanoparticles into polymer-like architectures is challenging and usually requires highly defined colloidal building blocks. Here, we show that the broad size-distribution of a simple dispersion of magnetic nanocolloids can be exploited to obtain various polymer-like architectures. The particles are assembled under an external magnetic field and permanently linked by thermal sintering. The remarkable variety of polymer-analogue architectures that arises from this simple process ranges from statistical and block copolymer-like sequencing to branched chains and networks. This library of architectures can be realized by controlling the sequencing of the particles and the junction points via a size-dependent self-assembly of the single building blocks.

  18. Plant genome and transcriptome annotations: from misconceptions to simple solutions

    PubMed Central

    Bolger, Marie E; Arsova, Borjana; Usadel, Björn

    2018-01-01

    Abstract Next-generation sequencing has triggered an explosion of available genomic and transcriptomic resources in the plant sciences. Although genome and transcriptome sequencing has become orders of magnitude cheaper and more efficient, often the functional annotation process is lagging behind. This might be hampered by the lack of a comprehensive enumeration of simple-to-use tools available to the plant researcher. In this comprehensive review, we present (i) typical ontologies to be used in the plant sciences, (ii) useful databases and resources used for functional annotation, (iii) what to expect from an annotated plant genome, (iv) an automated annotation pipeline and (v) a recipe and reference chart outlining typical steps used to annotate plant genomes/transcriptomes using publicly available resources. PMID:28062412

  19. Modeling How, When, and What Is Learned in a Simple Fault-Finding Task

    ERIC Educational Resources Information Center

    Ritter, Frank E.; Bibby, Peter A.

    2008-01-01

    We have developed a process model that learns in multiple ways while finding faults in a simple control panel device. The model predicts human participants' learning through its own learning. The model's performance was systematically compared to human learning data, including the time course and specific sequence of learned behaviors. These…

  20. Compressing DNA sequence databases with coil

    PubMed Central

    White, W Timothy J; Hendy, Michael D

    2008-01-01

    Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work. PMID:18489794

  1. Yellow lupin (Lupinus luteus L.) transcriptome sequencing: molecular marker development and comparative studies

    PubMed Central

    2012-01-01

    Background Yellow lupin (Lupinus luteus L.) is a minor legume crop characterized by its high seed protein content. Although grown in several temperate countries, its orphan condition has limited the generation of genomic tools to aid breeding efforts to improve yield and nutritional quality. In this study, we report the construction of 454-expressed sequence tag (EST) libraries, carry out comparative studies between L. luteus and model legume species, develop a comprehensive set of EST-simple sequence repeat (SSR) markers, and validate their utility on diversity studies and transferability to related species. Results Two runs of 454 pyrosequencing yielded 205 Mb and 530 Mb of sequence data for L1 (young leaves, buds and flowers) and L2 (immature seeds) EST libraries. A combined assembly (L1L2) yielded 71,655 contigs with an average contig length of 632 nucleotides. L1L2 contigs were clustered into 55,309 isotigs. 38,200 isotigs translated into proteins and 8,741 of them were full length. Around 57% of L. luteus sequences had significant similarity with at least one sequence of Medicago, Lotus, Arabidopsis, or Glycine, and 40.17% showed positive matches with all of these species. L. luteus isotigs were also screened for the presence of SSR sequences. A total of 2,572 isotigs contained at least one EST-SSR, with a frequency of one SSR per 17.75 kbp. Empirical evaluation of the EST-SSR candidate markers resulted in 222 polymorphic EST-SSRs. Two hundred and fifty four (65.7%) and 113 (30%) SSR primer pairs were able to amplify fragments from L. hispanicus and L. mutabilis DNA, respectively. Fifty polymorphic EST-SSRs were used to genotype a sample of 64 L. luteus accessions. Neighbor-joining distance analysis detected the existence of several clusters among L. luteus accessions, strongly suggesting the existence of population subdivisions. However, no clear clustering patterns followed the accession’s origin. Conclusion L. luteus deep transcriptome

  2. The number of reduced alignments between two DNA sequences

    PubMed Central

    2014-01-01

    Background In this study we consider DNA sequences as mathematical strings. Total and reduced alignments between two DNA sequences have been considered in the literature to measure their similarity. Results for explicit representations of some alignments have been already obtained. Results We present exact, explicit and computable formulas for the number of different possible alignments between two DNA sequences and a new formula for a class of reduced alignments. Conclusions A unified approach for a wide class of alignments between two DNA sequences has been provided. The formula is computable and, if complemented by software development, will provide a deeper insight into the theory of sequence alignment and give rise to new comparison methods. AMS Subject Classification Primary 92B05, 33C20, secondary 39A14, 65Q30 PMID:24684679
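
    To give a sense of the quantities involved, the sketch below counts alignments under one common textbook definition: an alignment is a sequence of columns, each of which pairs a base from each sequence or places a base from one sequence against a gap. Under that definition the count depends only on the two lengths and satisfies a three-term recurrence (the Delannoy numbers). The paper's own definitions of total and reduced alignments may differ, so this is an illustration of the counting problem rather than a reproduction of its formulas; the function name is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_alignments(m, n):
    """Number of gapped alignments of sequences of lengths m and n.

    The last column is either (base, gap), (gap, base) or (base, base),
    which gives f(m, n) = f(m-1, n) + f(m, n-1) + f(m-1, n-1).
    """
    if m == 0 or n == 0:
        return 1  # only one way: the remaining bases all face gaps
    return (
        count_alignments(m - 1, n)
        + count_alignments(m, n - 1)
        + count_alignments(m - 1, n - 1)
    )


for length in (1, 2, 3, 10):
    print(length, count_alignments(length, length))
```

    Even for two sequences of length 10 the count already runs into the millions, which is why closed-form and computable expressions are of interest.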

  3. Molecular characterizations of somatic hybrids developed between Pleurotus florida and Lentinus squarrosulus through inter-simple sequence repeat markers and sequencing of ribosomal RNA-ITS gene.

    PubMed

    Mallick, Pijush; Chattaraj, Shruti; Sikdar, Samir Ranjan

    2017-10-01

    The 12 pfls somatic hybrids and 2 parents of Pleurotus florida and Lentinus squarrosulus were characterized by ISSR and sequencing of rRNA-ITS genes. Five ISSR primers were used and amplified a total of 54 reproducible fragments with 98.14% polymorphism among all the pfls hybrid populations and parental strains. UPGMA-based cluster exhibited a dendrogram with three major groups between the parents and pfls hybrids. Parent P. florida and L. squarrosulus showed different degrees of genetic distance with all the hybrid lines and they showed closeness to hybrid pfls 1m and pfls 1h, respectively. ITS1(F) and ITS4(R) amplified the rRNA-ITS gene with 611-867 bp sequence length. The nucleotide polymorphisms were found in the ITS1, ITS2 and 5.8S rRNA region with different number of bases. Based on rRNA-ITS sequence, UPGMA cluster exhibited three distinct groups between L. squarrosulus and pfls 1p, pfls 1m and pfls 1s, and pfls 1e and P. florida.

  4. LookSeq: a browser-based viewer for deep sequencing data.

    PubMed

    Manske, Heinrich Magnus; Kwiatkowski, Dominic P

    2009-11-01

    Sequencing a genome to great depth can be highly informative about heterogeneity within an individual or a population. Here we address the problem of how to visualize the multiple layers of information contained in deep sequencing data. We propose an interactive AJAX-based web viewer for browsing large data sets of aligned sequence reads. By enabling seamless browsing and fast zooming, the LookSeq program assists the user to assimilate information at different levels of resolution, from an overview of a genomic region to fine details such as heterogeneity within the sample. A specific problem, particularly if the sample is heterogeneous, is how to depict information about structural variation. LookSeq provides a simple graphical representation of paired sequence reads that is more revealing about potential insertions and deletions than are conventional methods.

  5. Specialized microbial databases for inductive exploration of microbial genome sequences

    PubMed Central

    Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

    2005-01-01

    Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organize data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tencongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474

  6. A 28,000 Years Old Cro-Magnon mtDNA Sequence Differs from All Potentially Contaminating Modern Sequences

    PubMed Central

    Caramelli, David; Milani, Lucio; Vai, Stefania; Modi, Alessandra; Pecchioli, Elena; Girardi, Matteo; Pilli, Elena; Lari, Martina; Lippi, Barbara; Ronchitelli, Annamaria; Mallegni, Francesco; Casoli, Antonella; Bertorelle, Giorgio; Barbujani, Guido

    2008-01-01

    Background DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. Methodology/Principal Findings We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. Conclusions/Significance: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans. PMID:18628960

  7. Massive graviton on arbitrary background: derivation, syzygies, applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bernard, Laura; Deffayet, Cédric; IHES, Institut des Hautes Études Scientifiques,Le Bois-Marie, 35 route de Chartres, F-91440 Bures-sur-Yvette

    2015-06-23

    We give the detailed derivation of the fully covariant form of the quadratic action and the derived linear equations of motion for a massive graviton in an arbitrary background metric (which were presented in arXiv:1410.8302 [hep-th]). Our starting point is the de Rham-Gabadadze-Tolley (dRGT) family of ghost free massive gravities and using a simple model of this family, we are able to express this action and these equations of motion in terms of a single metric in which the graviton propagates, hence removing in particular the need for a “reference metric” which is present in the non perturbative formulation. We show further how 5 covariant constraints can be obtained including one which leads to the tracelessness of the graviton on flat space-time and removes the Boulware-Deser ghost. This last constraint involves powers and combinations of the curvature of the background metric. The 5 constraints are obtained for a background metric which is unconstrained, i.e. which does not have to obey the background field equations. We then apply these results to the case of Einstein space-times, where we show that the 5 constraints become trivial, and Friedmann-Lemaître-Robertson-Walker space-times, for which we correct in particular some results that appeared elsewhere. To reach our results, we derive several non trivial identities, syzygies, involving the graviton fields, its derivatives and the background metric curvature. These identities have their own interest. We also discover that there exist backgrounds for which the dRGT equations cannot be unambiguously linearized.

  8. Massive graviton on arbitrary background: derivation, syzygies, applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bernard, Laura; Deffayet, Cédric; Strauss, Mikael von, E-mail: bernard@iap.fr, E-mail: deffayet@iap.fr, E-mail: strauss@iap.fr

    2015-06-01

    We give the detailed derivation of the fully covariant form of the quadratic action and the derived linear equations of motion for a massive graviton in an arbitrary background metric (which were presented in arXiv:1410.8302 [hep-th]). Our starting point is the de Rham-Gabadadze-Tolley (dRGT) family of ghost free massive gravities and using a simple model of this family, we are able to express this action and these equations of motion in terms of a single metric in which the graviton propagates, hence removing in particular the need for a "reference metric" which is present in the non perturbative formulation. We show further how 5 covariant constraints can be obtained including one which leads to the tracelessness of the graviton on flat space-time and removes the Boulware-Deser ghost. This last constraint involves powers and combinations of the curvature of the background metric. The 5 constraints are obtained for a background metric which is unconstrained, i.e. which does not have to obey the background field equations. We then apply these results to the case of Einstein space-times, where we show that the 5 constraints become trivial, and Friedmann-Lemaître-Robertson-Walker space-times, for which we correct in particular some results that appeared elsewhere. To reach our results, we derive several non trivial identities, syzygies, involving the graviton fields, its derivatives and the background metric curvature. These identities have their own interest. We also discover that there exist backgrounds for which the dRGT equations cannot be unambiguously linearized.

  9. A Simple and Efficient Method for Assembling TALE Protein Based on Plasmid Library

    PubMed Central

    Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

    2013-01-01

    The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate. PMID:23840477
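
    The figure of 256 tetramers in the abstract above is simply 4^4, every ordered combination of four bases. The following is a minimal sketch of that enumeration, not the authors' assembly protocol; the function names are hypothetical, and the handling of targets whose length is not a multiple of four is an assumption, since the abstract does not say how such targets are treated.

```python
from itertools import product

BASES = "ACGT"  # the four DNA bases


def tetramer_library():
    """Enumerate every ordered 4-base combination (4**4 = 256 entries)."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def split_into_tetramers(target):
    """Split a target sequence into consecutive 4-base blocks.

    Assumption for this sketch: the target length must be a multiple of 4.
    The abstract does not describe how other lengths are handled.
    """
    if len(target) % 4 != 0:
        raise ValueError("target length must be a multiple of 4 in this sketch")
    library = set(tetramer_library())
    blocks = [target[i:i + 4] for i in range(0, len(target), 4)]
    # Every block made only of A, C, G, T is in the library by construction.
    for block in blocks:
        if block not in library:
            raise ValueError(f"unexpected characters in block {block!r}")
    return blocks
```

    Under these assumptions, a 16-base target maps to four library entries, which is the sense in which one library can cover any target sequence.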

  10. A simple and efficient method for assembling TALE protein based on plasmid library.

    PubMed

    Zhang, Zhiqiang; Li, Duo; Xu, Huarong; Xin, Ying; Zhang, Tingting; Ma, Lixia; Wang, Xin; Chen, Zhilong; Zhang, Zhiying

    2013-01-01

    The DNA-binding domain of the transcription activator-like effectors (TALEs) from Xanthomonas sp. consists of tandem repeats that can be rearranged according to a simple cipher to target new DNA sequences with high DNA-binding specificity. This technology has been successfully applied in a variety of species for genome engineering. However, assembling long TALE tandem repeats remains a big challenge precluding wide use of this technology. Although several new methodologies for efficiently assembling TALE repeats have been recently reported, all of them require either sophisticated facilities or skilled technicians to carry them out. Here, we describe a simple and efficient method for generating customized TALE nucleases (TALENs) and TALE transcription factors (TALE-TFs) based on a TALE repeat tetramer library. A tetramer library consisting of 256 tetramers covers all possible combinations of 4 base pairs. A set of unique primers was designed for amplification of these tetramers. PCR products were assembled by one step of digestion/ligation reaction. 12 TALE constructs including 4 TALEN pairs targeted to mouse Gt(ROSA)26Sor gene and mouse Mstn gene sequences as well as 4 TALE-TF constructs targeted to mouse Oct4, c-Myc, Klf4 and Sox2 gene promoter sequences were generated by using our method. The construction routines took 3 days and parallel constructions were available. The rate of positive clones during colony PCR verification was 64% on average. Sequencing results suggested that all TALE constructs were assembled with a high success rate. This is a rapid and cost-efficient method using the most common enzymes and facilities with a high success rate.

  11. Probing the Intergalactic Magnetic Field with the Anisotropy of the Extragalactic Gamma-ray Background

    NASA Technical Reports Server (NTRS)

    Venters, T. M.; Pavlidou, V.

    2013-01-01

    The intergalactic magnetic field (IGMF) may leave an imprint on the angular anisotropy of the extragalactic gamma-ray background through its effect on electromagnetic cascades triggered by interactions between very high energy photons and the extragalactic background light. A strong IGMF will deflect secondary particles produced in these cascades and will thus tend to isotropize lower energy cascade photons, thereby inducing a modulation in the anisotropy energy spectrum of the gamma-ray background. Here we present a simple, proof-of-concept calculation of the magnitude of this effect and demonstrate that current Fermi data already seem to prefer nonnegligible IGMF values. The anisotropy energy spectrum of the Fermi gamma-ray background could thus be used as a probe of the IGMF strength.

  12. Next generation sequencing provides rapid access to the genome of wheat stripe rust

    USDA-ARS's Scientific Manuscript database

    Background: The wheat stripe rust fungus (Puccinia striiformis f. sp. tritici, PST) is responsible for significant yield losses in wheat production worldwide. In spite of its economic importance, the PST genomic sequence is not currently available. Fortunately Next Generation Sequencing (NGS) has ra...

  13. Children's Criteria for Representational Adequacy in the Perception of Simple Sonic Stimuli

    ERIC Educational Resources Information Center

    Verschaffel, Lieven; Reybrouck, Mark; Jans, Christine; Van Dooren, Wim

    2010-01-01

    This study investigates children's metarepresentational competence with regard to listening to and making sense of simple sonic stimuli. Using diSessa's (2003) work on metarepresentational competence in mathematics and sciences as theoretical and empirical background, it aims to assess children's criteria for representational adequacy of graphical…

  14. What's in your next-generation sequence data? An exploration of unmapped DNA and RNA sequence reads from the bovine reference individual

    USDA-ARS's Scientific Manuscript database

    BACKGROUND: Next-generation sequencing projects commonly commence by aligning reads to a reference genome assembly. While improvements in alignment algorithms and computational hardware have greatly enhanced the efficiency and accuracy of alignments, a significant percentage of reads often remain u...

  15. Chroma key without color restrictions based on asynchronous amplitude modulation of background illumination on retroreflective screens

    NASA Astrophysics Data System (ADS)

    Vidal, Borja; Lafuente, Juan A.

    2016-03-01

    A simple technique to avoid color limitations in image capture systems based on chroma key video composition using retroreflective screens and light-emitting diode (LED) rings is proposed and demonstrated. The combination of an asynchronous temporal modulation onto the background illumination and simple image processing removes the usual restrictions on foreground colors in the scene. The technique removes technical constraints in stage composition, allowing its design to be purely based on artistic grounds. Since it only requires adding a very simple electronic circuit to widely used chroma keying hardware based on retroreflective screens, the technique is easily applicable to TV and filming studios.

  16. A Simple Method for Amplifying RNA Targets (SMART)

    PubMed Central

    McCalla, Stephanie E.; Ong, Carmichael; Sarma, Aartik; Opal, Steven M.; Artenstein, Andrew W.; Tripathi, Anubhav

    2012-01-01

    We present a novel and simple method for amplifying RNA targets (named by its acronym, SMART), and for detection, using engineered amplification probes that overcome existing limitations of current RNA-based technologies. This system amplifies and detects optimal engineered ssDNA probes that hybridize to target RNA. The amplifiable probe-target RNA complex is captured on magnetic beads using a sequence-specific capture probe and is separated from unbound probe using a novel microfluidic technique. Hybridization sequences are not constrained as they are in conventional target-amplification reactions such as nucleic acid sequence amplification (NASBA). Our engineered ssDNA probe was amplified both off-chip and in a microchip reservoir at the end of the separation microchannel using isothermal NASBA. Optimal solution conditions for ssDNA amplification were investigated. Although KCl and MgCl2 are typically found in NASBA reactions, replacing 70 mmol/L of the 82 mmol/L total chloride ions with acetate resulted in optimal reaction conditions, particularly for low but clinically relevant probe concentrations (≤100 fmol/L). With the optimal probe design and solution conditions, we also successfully removed the initial heating step of NASBA, thus achieving a true isothermal reaction. The SMART assay using a synthetic model influenza DNA target sequence served as a fundamental demonstration of the efficacy of the capture and microfluidic separation system, thus bridging our system to a clinically relevant detection problem. PMID:22691910

  17. Detection of possible restriction sites for type II restriction enzymes in DNA sequences.

    PubMed

    Gagniuc, P; Cimponeriu, D; Ionescu-Tîrgovişte, C; Mihai, Andrada; Stavarachi, Monica; Mihai, T; Gavrilă, L

    2011-01-01

    In order to make a step forward in the knowledge of the mechanism operating in complex polygenic disorders such as diabetes and obesity, this paper proposes a new algorithm (PRSD, possible restriction site detection) and its implementation in Applied Genetics software. This software can be used for in silico detection of potential (hidden) recognition sites for endonucleases and for nucleotide repeat identification. The recognition sites for endonucleases may result from hidden sequences through deletion or insertion of a specific number of nucleotides. Tests were conducted on DNA sequences downloaded from NCBI servers using specific recognition sites for common type II restriction enzymes introduced in the software database (n = 126). Each possible recognition site indicated by the PRSD algorithm implemented in Applied Genetics was checked and confirmed by NEBcutter V2.0 and Webcutter 2.0 software. In the sequence NG_008724.1 (which includes 63632 nucleotides) we found a high number of potential restriction sites for EcoRI that may be produced by deletion (n = 43 sites) or insertion (n = 591 sites) of one nucleotide. The second module of Applied Genetics has been designed to find simple repeat sizes with a real future in understanding the role of SNPs (Single Nucleotide Polymorphisms) in the pathogenesis of complex metabolic disorders. We have tested the presence of simple repetitive sequences in five DNA sequences. The software indicated the exact position of each repeat detected in the tested sequences. Future development of Applied Genetics can provide an alternative for powerful tools used to search for restriction sites or repetitive sequences or to improve genotyping methods.
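
    The idea of a hidden recognition site, one that appears only after a single nucleotide is deleted or inserted, can be illustrated with a short sketch. This is not the PRSD algorithm as implemented in Applied Genetics, whose details the abstract does not give. It assumes the EcoRI recognition sequence GAATTC, considers only edits strictly inside the candidate window, and uses hypothetical function names; its counts should not be expected to match the 43 and 591 sites reported above.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def hidden_sites_by_deletion(seq, site=ECORI_SITE):
    """Return 0-based starts of windows (len(site) + 1 long) that do not
    contain `site` but become `site` when one interior base is deleted."""
    n = len(site)
    hits = []
    for start in range(len(seq) - n):
        window = seq[start:start + n + 1]
        if site in window:
            continue  # the site is already present, so it is not hidden
        # Try deleting each interior position of the window.
        if any(window[:k] + window[k + 1:] == site for k in range(1, n)):
            hits.append(start)
    return hits


def hidden_sites_by_insertion(seq, site=ECORI_SITE):
    """Return 0-based starts of windows (len(site) - 1 long) that become
    `site` when one base is inserted strictly inside the window."""
    n = len(site)
    # A window qualifies if it equals the site with one interior base removed.
    variants = {site[:k] + site[k + 1:] for k in range(1, n - 1)}
    hits = []
    for start in range(len(seq) - n + 2):
        if seq[start:start + n - 1] in variants:
            hits.append(start)
    return hits
```

    Restricting edits to interior positions is a design choice of this sketch: an edit at the window boundary would either reproduce an existing site or depend on flanking bases outside the window.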

  18. A Simple Label Switching Algorithm for Semisupervised Structural SVMs.

    PubMed

    Balamurugan, P; Shevade, Shirish; Sundararajan, S

    2015-10-01

    In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semisupervised structured classification deals with a small number of labeled examples and a large number of unlabeled structured data. In this work, we consider semisupervised structural support vector machines with domain constraints. The optimization problem, which in general is not convex, contains the loss terms associated with the labeled and unlabeled examples, along with the domain constraints. We propose a simple optimization approach that alternates between solving a supervised learning problem and a constraint matching problem. Solving the constraint matching problem is difficult for structured prediction, and we propose an efficient and effective label switching method to solve it. The alternating optimization is carried out within a deterministic annealing framework, which helps in effective constraint matching and avoiding poor local minima, which are not very useful. The algorithm is simple and easy to implement. Further, it is suitable for any structured output learning problem where exact inference is available. Experiments on benchmark sequence labeling data sets and a natural language parsing data set show that the proposed approach, though simple, achieves comparable generalization performance.

  19. Draft Sequences of the Radish (Raphanus sativus L.) Genome

    PubMed Central

    Kitashiba, Hiroyasu; Li, Feng; Hirakawa, Hideki; Kawanabe, Takahiro; Zou, Zhongwei; Hasegawa, Yoichi; Tonosaki, Kaoru; Shirasawa, Sachiko; Fukushima, Aki; Yokoi, Shuji; Takahata, Yoshihito; Kakizaki, Tomohiro; Ishida, Masahiko; Okamoto, Shunsuke; Sakamoto, Koji; Shirasawa, Kenta; Tabata, Satoshi; Nishio, Takeshi

    2014-01-01

    Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified. PMID:24848699

  20. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
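
    The two comparison statistics named in the abstract above can be sketched in a few lines. This is a generic textbook formulation, not the authors' analysis code. The function names are hypothetical, the 1.96 multiplier assumes approximately normal differences, and `delta` (the ratio of the error variance in y to that in x) is assumed to be 1 by default.

```python
import math
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean difference a - b; the limits of agreement are
    bias +/- 1.96 sample standard deviations of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def deming(x, y, delta=1.0):
    """Return (slope, intercept) of a Deming regression of y on x.

    delta is the assumed ratio of error variances (y errors over x errors).
    A slope away from 1 suggests proportional error; an intercept away
    from 0 suggests constant error. Fails if x and y have zero covariance.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = (syy - delta * sxx
             + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept
```

    Unlike ordinary least squares, the Deming fit allows measurement error in both assays, which is the usual reason for choosing it when two methods are compared against each other.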

  1. Synthetic oligonucleotide probes deduced from amino acid sequence data. Theoretical and practical considerations.

    PubMed

    Lathe, R

    1985-05-05

    Synthetic probes deduced from amino acid sequence data are widely used to detect cognate coding sequences in libraries of cloned DNA segments. The redundancy of the genetic code dictates that a choice must be made between (1) a mixture of probes reflecting all codon combinations, and (2) a single longer "optimal" probe. The second strategy is examined in detail. The frequency of sequences matching a given probe by chance alone can be determined and also the frequency of sequences closely resembling the probe and contributing to the hybridization background. Gene banks cannot be treated as random associations of the four nucleotides, and probe sequences deduced from amino acid sequence data occur more often than predicted by chance alone. Probe lengths must be increased to confer the necessary specificity. Examination of hybrids formed between unique homologous probes and their cognate targets reveals that short stretches of perfect homology occurring by chance make a significant contribution to the hybridization background. Statistical methods for improving homology are examined, taking human coding sequences as an example, and considerations of codon utilization and dinucleotide frequencies yield an overall homology of greater than 82%. Recommendations for probe design and hybridization are presented, and the choice between using multiple probes reflecting all codon possibilities and a unique optimal probe is discussed.

  2. BMPR1B mutation causes Pierre Robin sequence

    PubMed Central

    Yao, Xu; Zhang, Rong; Yang, Hui; Zhao, Rui; Guo, Jihong; Jin, Ke; Mei, Haibo; Luo, Yongqi; Zhao, Liu; Tu, Ming; Zhu, Yimin

    2017-01-01

    Background We investigated a large family with Pierre Robin sequence (PRS). Aim of the study This study aims to determine the genetic cause of PRS. Results The reciprocal translocation t(4;6)(q22;p21) was identified to be segregated with PRS in a three-generation family. Whole-genome sequencing and Sanger sequencing successfully detected breakpoints in the intragenic regions of BMPR1B and GRM4. We hypothesized that PRS in this family was caused by (i) haploinsufficiency for BMPR1B or (ii) a gain of function mechanism mediated by the BMPR1B-GRM4 fusion gene. In an unrelated family, we identified another BMPR1B-splicing mutation that co-segregated with PRS. Conclusion We detected two BMPR1B mutations in two unrelated PRS families, suggesting that BMPR1B disruption is probably a cause of human PRS. Methods GTG banding, comparative genomic hybridization, whole-genome sequencing, and Sanger sequencing were performed to identify the gene causing PRS. PMID:28418932

  3. Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa

    PubMed Central

    2012-01-01

    Background Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, there are hardly any genetic studies performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results Successfully, we developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice more abundant in UTRs (UnTranslated Regions) compared to coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSR. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. Conclusions Two transcriptome sets

  4. Dynamics of domain coverage of the protein sequence universe

    PubMed Central

    2012-01-01

    Background The currently known protein sequence space consists of millions of sequences in public databases and is rapidly expanding. Assigning sequences to families leads to a better understanding of protein function and the nature of the protein universe. However, a large portion of the current protein space remains unassigned and is referred to as its “dark matter”. Results Here we suggest that true size of “dark matter” is much larger than stated by current definitions. We propose an approach to reducing the size of “dark matter” by identifying and subtracting regions in protein sequences that are not likely to contain any domain. Conclusions Recent improvements in computational domain modeling result in a decrease, albeit slowly, in the relative size of “dark matter”; however, its absolute size increases substantially with the growth of sequence data. PMID:23157439

  5. Genetic diversity analysis of cyanogenic potential (CNp) of root among improved genotypes of cassava using simple sequence repeat markers.

    PubMed

    Moyib, O K; Mkumbira, J; Odunola, O A; Dixon, A G

    2012-12-01

    Cyanogenic potential (CNp) of cassava constitutes a serious problem for over 500 million people who rely on the crop as their main source of calories. Genetic diversity is a key to successful crop improvement for breeding new improved variability for target traits. Forty-three improved genotypes of cassava developed by the International Institute of Tropical Agriculture (IITA), Ibadan, were characterized for CNp trait using 35 Simple Sequence Repeat (SSR) markers. Essential colorimetry picric test was used for evaluation of CNp on a color scale of 1 to 14. The CNp scores obtained ranged from 3 to 9, with a mean score of 5.48 (+/- 0.09) based on Statistical Analysis System (SAS) package. TMS M98/0068 (4.0 +/- 0.25) was identified as the best genotype with low CNp while TMS M98/0028 (7.75 +/- 0.25) was the worst. The 43 genotypes were assigned into 7 phenotypic groups based on rank-sum analysis in SAS. Dissimilarity analysis representatives for windows generated a phylogenetic tree with 5 clusters which represented hybridizing groups. Each of the clusters (except 4) contained low CNp genotypes that could be used for improving the high CNp genotypes in the same or near cluster. The scatter plot of the genotypes showed that there was little or no demarcation for phenotypic CNp groupings in the molecular groupings. The result of this study demonstrated that SSR markers are powerful tools for the assessment of genetic variability, and proper identification and selection of parents for genetic improvement of low CNp trait among the IITA cassava collection.

  6. LISTA, a comprehensive compilation of nucleotide sequences encoding proteins from the yeast Saccharomyces.

    PubMed Central

    Linder, P; Dölz, R; Mossé, M O; Lazowska, J; Slonimski, P P

    1993-01-01

    The amount of nucleotide sequence data is increasing exponentially. We therefore made an effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. Each sequence has been attributed a single genetic name and in the case of allelic duplicated sequences, synonyms are given, if necessary. For the nomenclature we have introduced a standard principle for naming gene sequences based on priority rules. We have also applied a simple method to distinguish duplicated sequences of one and the same gene from non-allelic sequences of duplicated genes. By using these principles we have sorted out a lot of confusion in the literature and databanks. Along with the genetic name, the mnemonic from the EMBL databank, the codon bias, reference of the publication of the sequence and the EMBL accession numbers are included in each entry. PMID:8332521

  7. Characterization of background concentrations of contaminants using a mixture of normal distributions.

    PubMed

    Qian, Song S; Lyons, Regan E

    2006-10-01

    We present a Bayesian approach for characterizing background contaminant concentration distributions using data from sites that may have been contaminated. Our method, focused on estimation, resolves several technical problems of the existing methods sanctioned by the U.S. Environmental Protection Agency (USEPA) (a hypothesis testing based method), resulting in a simple and quick procedure for estimating background contaminant concentrations. The proposed Bayesian method is applied to two data sets from a federal facility regulated under the Resource Conservation and Restoration Act. The results are compared to background distributions identified using existing methods recommended by the USEPA. The two data sets represent low and moderate levels of censorship in the data. Although an unbiased estimator is elusive, we show that the proposed Bayesian estimation method will have a smaller bias than the EPA recommended method.

  8. Prediction of Transport Properties of Permeants through Polymer Films. A Simple Gravimetric Experiment.

    ERIC Educational Resources Information Center

    Britton, L. N.; And Others

    1988-01-01

    Considers the applicability of the simple immersion/weight-gain method for predicting diffusion coefficients, solubilities, and permeation rates of chemicals in polymers that do not undergo physical and chemical deterioration. Presents the theoretical background, procedures and typical results related to this activity. (CW)

  9. Terminator oligo blocking efficiently eliminates rRNA from Drosophila small RNA sequencing libraries.

    PubMed

    Wickersheim, Michelle L; Blumenstiel, Justin P

    2013-11-01

    A large number of methods are available to deplete ribosomal RNA reads from high-throughput RNA sequencing experiments. Such methods are critical for sequencing Drosophila small RNAs between 20 and 30 nucleotides because size selection is not typically sufficient to exclude the highly abundant class of 30 nucleotide 2S rRNA. Here we demonstrate that pre-annealing terminator oligos complementary to Drosophila 2S rRNA prior to 5' adapter ligation and reverse transcription efficiently depletes 2S rRNA sequences from the sequencing reaction in a simple and inexpensive way. This depletion is highly specific and is achieved with minimal perturbation of miRNA and piRNA profiles.

  10. Label-Free Sensitive Detection of DNA Methyltransferase by Target-Induced Hyperbranched Amplification with Zero Background Signal.

    PubMed

    Zhang, Yan; Wang, Xin-Yan; Zhang, Qianyi; Zhang, Chun-Yang

    2017-11-21

    DNA methyltransferases (MTases) may specifically recognize the short palindromic sequences and transfer a methyl group from S-adenosyl-l-methionine to target cytosine/adenine. The aberrant DNA methylation is linked to the abnormal DNA MTase activity, and some DNA MTases have become promising targets of anticancer/antimicrobial drugs. However, the reported DNA MTase assays often involve laborious operation, expensive instruments, and radio-labeled substrates. Here, we develop a simple and label-free fluorescent method to sensitively detect DNA adenine methyltransferase (Dam) on the basis of terminal deoxynucleotidyl transferase (TdT)-activated Endonuclease IV (Endo IV)-assisted hyperbranched amplification. We design a hairpin probe with a palindromic sequence in the stem as the substrate and a NH2-modified 3' end for the prevention of nonspecific amplification. The substrate may be methylated by Dam and subsequently cleaved by DpnI, producing three single-stranded DNAs, two of which with 3'-OH termini may be amplified by hyperbranched amplification to generate a distinct fluorescence signal. Because high exactitude of TdT enables the amplification only in the presence of free 3'-OH termini and Endo IV only hydrolyzes the intact apurinic/apyrimidinic sites in double-stranded DNAs, zero background signal can be achieved. This method exhibits excellent selectivity and high sensitivity with a limit of detection of 0.003 U/mL for pure Dam and 9.61 × 10⁻⁶ mg/mL for Dam in E. coli cells. Moreover, it can be used to screen the Dam inhibitors, holding great potentials in disease diagnosis and drug development.

  11. On the necessity of dissecting sequence similarity scores into segment-specific contributions for inferring protein homology, function prediction and annotation

    PubMed Central

    2014-01-01

    Background Protein sequence similarities to any types of non-globular segments (coiled coils, low complexity regions, transmembrane regions, long loops, etc. where either positional sequence conservation is the result of a very simple, physically induced pattern or rather integral sequence properties are critical) are pertinent sources for mistaken homologies. Regretfully, these considerations regularly escape attention in large-scale annotation studies since, often, there is no substitute to manual handling of these cases. Quantitative criteria are required to suppress events of function annotation transfer as a result of false homology assignments. Results The sequence homology concept is based on the similarity comparison between the structural elements, the basic building blocks for conferring the overall fold of a protein. We propose to dissect the total similarity score into fold-critical and other, remaining contributions and suggest that, for a valid homology statement, the fold-relevant score contribution should at least be significant on its own. As part of the article, we provide the DissectHMMER software program for dissecting HMMER2/3 scores into segment-specific contributions. We show that DissectHMMER reproduces HMMER2/3 scores with sufficient accuracy and that it is useful in automated decisions about homology for instructive sequence examples. To generalize the dissection concept for cases without 3D structural information, we find that a dissection based on alignment quality is an appropriate surrogate. The approach was applied to a large-scale study of SMART and PFAM domains in the space of seed sequences and in the space of UniProt/SwissProt. Conclusions Sequence similarity score dissection with regard to fold-critical and other contributions systematically suppresses false hits and, additionally, recovers previously obscured homology relationships such as the one between aquaporins and formate/nitrite transporters that, so far, was only

  12. Rapid Diagnostics of Onboard Sequences

    NASA Technical Reports Server (NTRS)

    Starbird, Thomas W.; Morris, John R.; Shams, Khawaja S.; Maimone, Mark W.

    2012-01-01

    Keeping track of sequences onboard a spacecraft is challenging. When reviewing Event Verification Records (EVRs) of sequence executions on the Mars Exploration Rover (MER), operators often found themselves wondering which version of a named sequence the EVR corresponded to. The lack of this information drastically impacts the operators' diagnostic capabilities as well as their situational awareness with respect to the commands the spacecraft has executed, since the EVRs do not provide argument values or explanatory comments. Having this information immediately available can be instrumental in diagnosing critical events and can significantly enhance the overall safety of the spacecraft. This software provides auditing capability that can eliminate that uncertainty while diagnosing critical conditions. Furthermore, the RESTful interface provides a simple way for sequencing tools to automatically retrieve binary compiled sequence SCMFs (Space Command Message Files) on demand. It also enables developers to change the underlying database, while maintaining the same interface to the existing applications. The logging capabilities are also beneficial to operators when they are trying to recall how they solved a similar problem many days ago: this software enables automatic recovery of SCMF and RML (Robot Markup Language) sequence files directly from the command EVRs, eliminating the need for people to find and validate the corresponding sequences. To address the lack of auditing capability for sequences onboard a spacecraft during earlier missions, extensive logging support was added on the Mars Science Laboratory (MSL) sequencing server. This server is responsible for generating all MSL binary SCMFs from RML input sequences. The sequencing server logs every SCMF it generates into a MySQL database, as well as the high-level RML file and dictionary name inputs used to create the SCMF. The SCMF is then indexed by a hash value that is automatically included in all command

  13. Masking as an effective quality control method for next-generation sequencing data analysis.

    PubMed

    Yun, Sajung; Yun, Sijung

    2014-12-13

    Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method affected the false-negative rate in SNP calling with statistical significance compared to the data analysis without preprocessing. False-positive rate and false-negative rate for small insertions and deletions did not show differences between masking and trimming. We recommend masking over trimming as a more effective preprocessing method for next generation sequencing data analysis since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The Perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
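
    The difference between the two preprocessing methods can be illustrated with a minimal sketch. This is not the authors' Perl script: the function names, the Phred threshold of 20, and the choice of end-clipping as the trimming variant are all assumptions made for illustration.

```python
def mask_read(seq, quals, min_q=20):
    """Replace every base whose Phred quality is below min_q with 'N'.

    min_q=20 is an assumed threshold, not a value taken from the abstract.
    """
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


def trim_read(seq, quals, min_q=20):
    """Clip low-quality bases from both ends, so the read gets shorter.

    End clipping is one common trimming variant (an assumption here).
    """
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end]


seq = "ACGTACGTAC"
quals = [30, 32, 8, 35, 36, 34, 33, 12, 9, 5]
masked = mask_read(seq, quals)    # read length is preserved
trimmed = trim_read(seq, quals)   # low-quality tail is removed
```

    In this sketch masking keeps every read at its original length and also hides the low-quality base in the middle of the read, whereas end trimming leaves that internal base in place.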

  14. Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

    PubMed

    Ranwez, Vincent

    2016-01-01

    Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum of pair score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions reaching O(nL) time complexity for affine gap cost, instead of O(n^2 L), which are easy to implement.
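
    The gap between O(n^2 L) and O(nL) can be seen in a simplified setting. The sketch below is not the algorithm from the paper: it ignores affine gap costs and uses a toy per-position score (match, mismatch, and gap values are assumptions), and it only shows the general idea of replacing the loop over sequence pairs with per-column symbol counts.

```python
from collections import Counter
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Toy pairwise column score; gap-gap pairs contribute nothing."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sp_score_naive(alignment):
    """Sum over all sequence pairs and all columns: O(n^2 L)."""
    return sum(
        pair_score(a, b)
        for s, t in combinations(alignment, 2)
        for a, b in zip(s, t)
    )


def sp_score_counts(alignment):
    """Same value from per-column symbol counts: O(nL) for a fixed alphabet."""
    total = 0
    for column in zip(*alignment):
        counts = Counter(column)
        symbols = list(counts)
        for i, a in enumerate(symbols):
            # pairs formed by two copies of the same symbol
            total += counts[a] * (counts[a] - 1) // 2 * pair_score(a, a)
            # pairs formed by two different symbols
            for b in symbols[i + 1:]:
                total += counts[a] * counts[b] * pair_score(a, b)
    return total


msa = ["AC-GT", "ACAGT", "A--GA", "TCAGT"]
naive_total = sp_score_naive(msa)
fast_total = sp_score_counts(msa)
```

    Both functions return the same total for the small alignment above; the counting version touches each cell once and then does work that depends only on the alphabet size.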

  15. Suppression of background noise in a transonic wind-tunnel test section

    NASA Technical Reports Server (NTRS)

    Schutzenhofer, L. A.; Howard, P. W.

    1975-01-01

    Some exploratory tests were recently performed in the transonic test section of the NASA Marshall Space Flight Center 14-in. wind tunnel to suppress the background noise. In these tests, the perforated walls of the test section were covered with fine wire screens. The screens eliminated the edge tones generated by the holes in the perforated walls and significantly reduced the tunnel background noise. The tunnel noise levels were reduced to such a degree by this simple modification at Mach numbers 0.75, 0.9, 1.1, 1.2, and 1.46 that the fluctuating pressure levels of a turbulent boundary layer could be measured on a 5-deg half-angle cone.

  16. Improvements in Technique of NMR Imaging and NMR Diffusion Measurements in the Presence of Background Gradients.

    NASA Astrophysics Data System (ADS)

    Lian, Jianyu

    In this work, modification of the cosine current distribution rf coil, PCOS, has been introduced and tested. The coil produces a very homogeneous rf magnetic field, and it is inexpensive to build and easy to tune for multiple resonance frequencies. The geometrical parameters of the coil are optimized to produce the most homogeneous rf field over a large volume. To avoid rf field distortion when the coil length is comparable to a quarter wavelength, a parallel PCOS coil is proposed and discussed. For testing rf coils and correcting B_1 in NMR experiments, a simple, rugged and accurate NMR rf field mapping technique has been developed. The method has been tested and used in 1D, 2D, 3D and in vivo rf mapping experiments. The method has been proven to be very useful in the design of rf coils. To preserve the linear relation between rf output applied on an rf coil and modulating input for an rf modulating-amplifying system of an NMR imaging spectrometer, a quadrature feedback loop is employed in an rf modulator with two orthogonal rf channels to correct the amplitude and phase non-linearities caused by the rf components in the rf system. The modulator is very linear over a large range and it can generate an arbitrary rf shape. A diffusion imaging sequence has been developed for measuring and imaging diffusion in the presence of background gradients. Cross terms between the diffusion sensitizing gradients and background gradients or imaging gradients can complicate diffusion measurement and make the interpretation of NMR diffusion data ambiguous, but these have been eliminated in this method. Further, the background gradients have been measured and imaged. A dipole random distribution model has been established to study background magnetic fields Delta B and background magnetic gradients G_0 produced by small particles in a sample when it is in a B_0 field. From this model, the minimum distance that a spin can approach a particle can be determined by measuring

  17. First complete genome sequence of infectious laryngotracheitis virus

    PubMed Central

    2011-01-01

    Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine the complete genomic sequence of a live attenuated vaccine strain of ILTV. Results The complete genomic sequence of the Serva vaccine strain of ILTV was determined, annotated and compared to the concatenated ILTV reference sequence. The genome size of the Serva strain was 152,628 bp, with a G + C content of 48%. A total of 80 predicted open reading frames were identified. The Serva strain had 96.5% DNA sequence identity with the concatenated ILTV sequence. Notably, the concatenated ILTV sequence was found to lack four large regions of sequence, including 528 bp and 594 bp of sequence in the UL29 and UL36 genes, respectively, and two copies of a 1,563 bp sequence in the repeat regions. Considerable differences in the size of the predicted translation products of 4 other genes (UL54, UL30, UL37 and UL38) were also identified. More than 530 single-nucleotide polymorphisms (SNPs) were identified. Most SNPs were located within three genomic regions, corresponding to sequence from the SA-2 ILTV vaccine strain in the concatenated ILTV sequence. Conclusions This is the first complete genomic sequence of an individual ILTV strain. This sequence will facilitate future comparative genomic studies of ILTV by providing an appropriate reference sequence for the sequence analysis of other ILTV strains. PMID:21501528

  18. Processing sequence annotation data using the Lua programming language.

    PubMed

    Ueno, Yutaka; Arita, Masanori; Kumagai, Toshitaka; Asai, Kiyoshi

    2003-01-01

    The data processing language in a graphical software tool that manages sequence annotation data from genome databases should provide flexible functions for the tasks in molecular biology research. Among currently available languages we adopted the Lua programming language. It fulfills our requirements to perform computational tasks for sequence map layouts, i.e. the handling of data containers, symbolic reference to data, and a simple programming syntax. Upon importing a foreign file, the original data are first decomposed in the Lua language while maintaining the original data schema. The converted data are parsed by the Lua interpreter and the contents are stored in our data warehouse. Then, portions of annotations are selected and arranged into our catalog format to be depicted on the sequence map. Our sequence visualization program was successfully implemented, embedding the Lua language for processing of annotation data and layout script. The program is available at http://staff.aist.go.jp/yutaka.ueno/guppy/.

  19. Generating constrained randomized sequences: item frequency matters.

    PubMed

    French, Robert M; Perruchet, Pierre

    2009-11-01

    All experimental psychologists understand the importance of randomizing lists of items. However, randomization is generally constrained, and these constraints (in particular, not allowing immediately repeated items), which are designed to eliminate particular biases, frequently engender others. We describe a simple Monte Carlo randomization technique that solves a number of these problems. However, in many experimental settings, we are concerned not only with the number and distribution of items but also with the number and distribution of transitions between items. The algorithm mentioned above provides no control over this. We therefore introduce a simple technique that uses transition tables for generating correctly randomized sequences. We present an analytic method of producing item-pair frequency tables and item-pair transitional probability tables when immediate repetitions are not allowed. We illustrate these difficulties and how to overcome them, with reference to a classic article on word segmentation in infants. Finally, we provide free access to an Excel file that allows users to generate transition tables with up to 10 different item types, as well as to generate appropriately distributed randomized sequences of any length without immediately repeated elements. This file is freely available from http://leadserv.u-bourgogne.fr/IMG/xls/TransitionMatrix.xls.
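
    The idea of driving sequence generation from a transition table can be sketched as follows. This is an illustrative simplification, not the authors' Excel implementation: it assumes equally frequent items and a uniform table with a zero diagonal, and the function names are hypothetical. The unequal-frequency case analysed in the article needs a table derived as described there.

```python
import random


def transition_table(items):
    """Uniform transition probabilities with a zero diagonal (no repeats)."""
    p = 1.0 / (len(items) - 1)
    return {a: {b: (0.0 if a == b else p) for b in items} for a in items}


def generate_sequence(items, length, rng):
    """Walk the transition table to draw a sequence of the given length."""
    table = transition_table(items)
    seq = [rng.choice(items)]
    while len(seq) < length:
        row = table[seq[-1]]
        # draw the next item according to the current item's row
        seq.append(rng.choices(list(row), weights=list(row.values()))[0])
    return seq


rng = random.Random(0)  # fixed seed so the run is repeatable
sequence = generate_sequence(["A", "B", "C", "D"], 20, rng)
```

    Because the diagonal of the table is zero, no item can follow itself, and every allowed transition is equally likely by construction.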

  20. Identification of Human Lineage-Specific Transcriptional Coregulators Enabled by a Glossary of Binding Modules and Tunable Genomic Backgrounds.

    PubMed

    Mariani, Luca; Weinand, Kathryn; Vedenko, Anastasia; Barrera, Luis A; Bulyk, Martha L

    2017-09-27

    Transcription factors (TFs) control cellular processes by binding specific DNA motifs to modulate gene expression. Motif enrichment analysis of regulatory regions can identify direct and indirect TF binding sites. Here, we created a glossary of 108 non-redundant TF-8mer "modules" of shared specificity for 671 metazoan TFs from publicly available and new universal protein binding microarray data. Analysis of 239 ENCODE TF chromatin immunoprecipitation sequencing datasets and associated RNA sequencing profiles suggests the 8mer modules are more precise than position weight matrices in identifying indirect binding motifs and their associated tethering TFs. We also developed GENRE (genomically equivalent negative regions), a tunable tool for construction of matched genomic background sequences for analysis of regulatory regions. GENRE outperformed four state-of-the-art approaches to background sequence construction. We used our TF-8mer glossary and GENRE in the analysis of the indirect binding motifs for the co-occurrence of tethering factors, suggesting novel TF-TF interactions. We anticipate that these tools will aid in elucidating tissue-specific gene-regulatory programs. Copyright © 2017 Elsevier Inc. All rights reserved.

  1. The first genetic map of a synthesized allohexaploid Brassica with A, B and C genomes based on simple sequence repeat markers.

    PubMed

    Yang, S; Chen, S; Geng, X X; Yan, G; Li, Z Y; Meng, J L; Cowling, W A; Zhou, W J

    2016-04-01

    We present the first genetic map of an allohexaploid Brassica species, based on segregating microsatellite markers in a doubled haploid mapping population generated from a hybrid between two hexaploid parents. This study reports the first genetic map of trigenomic Brassica. A doubled haploid mapping population consisting of 189 lines was obtained via microspore culture from a hybrid H16-1 derived from a cross between two allohexaploid Brassica lines (7H170-1 and Y54-2). Simple sequence repeat primer pairs specific to the A genome (107), B genome (44) and C genome (109) were used to construct a genetic linkage map of the population. Twenty-seven linkage groups were resolved from 274 polymorphic loci on the A genome (109), B genome (49) and C genome (116) covering a total genetic distance of 3178.8 cM with an average distance between markers of 11.60 cM. This is the first genetic framework map for the artificially synthesized Brassica allohexaploids. The linkage groups represent the expected complement of chromosomes in the A, B and C genomes from the original diploid and tetraploid parents. This framework linkage map will be valuable for QTL analysis and future genetic improvement of a new allohexaploid Brassica species, and in improving our understanding of the genetic control of meiosis in new polyploids.

  2. Typing Clostridium difficile strains based on tandem repeat sequences

    PubMed Central

    2009-01-01

    Background Genotyping of epidemic Clostridium difficile strains is necessary to track their emergence and spread. Portability of genotyping data is desirable to facilitate inter-laboratory comparisons and epidemiological studies. Results This report presents results from a systematic screen for variation in repetitive DNA in the genome of C. difficile. We describe two tandem repeat loci, designated 'TR6' and 'TR10', which display extensive sequence variation that may be useful for sequence-based strain typing. Based on an investigation of 154 C. difficile isolates comprising 75 ribotypes, tandem repeat sequencing demonstrated excellent concordance with widely used PCR ribotyping and equal discriminatory power. Moreover, tandem repeat sequences enabled the reconstruction of the isolates' largely clonal population structure and evolutionary history. Conclusion We conclude that sequence analysis of the two repetitive loci introduced here may be highly useful for routine typing of C. difficile. Tandem repeat sequence typing resolves phylogenetic diversity to a level equivalent to PCR ribotypes. DNA sequences may be stored in databases accessible over the internet, obviating the need for the exchange of reference strains. PMID:19133124

  3. Use of inter-simple sequence repeats and amplified fragment length polymorphisms to analyze genetic relationships among small grain-infecting species of Ustilago.

    PubMed

    Menzies, J G; Bakkeren, G; Matheson, F; Procunier, J D; Woods, S

    2003-02-01

    In the smut fungi, few features are available for use as taxonomic criteria (spore size, shape, morphology, germination type, and host range). DNA-based molecular techniques are useful in expanding the traits considered in determining relationships among these fungi. We examined the phylogenetic relationships among seven species of Ustilago (U. avenae, U. bullata, U. hordei, U. kolleri, U. nigra, U. nuda, and U. tritici) using inter-simple sequence repeats (ISSRs) and amplified fragment length polymorphisms (AFLPs) to compare their DNA profiles. Fifty-four isolates of different Ustilago spp. were analyzed using ISSR primers, and 16 isolates of Ustilago were studied using AFLP primers. The variability among isolates within species was low for all species except U. bullata. The isolates of U. bullata, U. nuda, and U. tritici were well separated and our data support their speciation. U. avenae and U. kolleri isolates did not separate from each other and there was little variability between these species. U. hordei and U. nigra isolates also showed little variability between species, but the isolates from each species grouped together. Our data suggest that U. avenae and U. kolleri are monophyletic and should be considered one species, as should U. hordei and U. nigra.

  4. Probing the Intergalactic Magnetic Field with the Anisotropy of the Extragalactic Gamma-Ray Background

    NASA Technical Reports Server (NTRS)

    Venters, T. M.; Pavlidou, V.

    2012-01-01

    The intergalactic magnetic field (IGMF) may leave an imprint on the anisotropy properties of the extragalactic gamma-ray background, through its effect on electromagnetic cascades triggered by interactions between very high energy photons and the extragalactic background light. A strong IGMF will deflect secondary particles produced in these cascades and will thus tend to isotropize lower energy cascade photons, thus inducing a modulation in the anisotropy energy spectrum of the gamma-ray background. Here we present a simple, proof-of-concept calculation of the magnitude of this effect and demonstrate that the two extreme cases (zero IGMF and IGMF strong enough to completely isotropize cascade photons) would be separable by ten years of Fermi observations and reasonable model parameters for the gamma-ray background. The anisotropy energy spectrum of the Fermi gamma-ray background could thus be used as a probe of the IGMF strength.

  5. Motion detection and compensation in infrared retinal image sequences.

    PubMed

    Scharcanski, J; Schardosim, L R; Santos, D; Stuchi, A

    2013-01-01

    Infrared image data captured by non-mydriatic digital retinography systems are often used in the diagnosis and treatment of diabetic macular edema (DME). Infrared illumination is less aggressive to the patient retina, and retinal studies can be carried out without pupil dilation. However, sequences of infrared eye fundus images of static scenes tend to present pixel intensity fluctuations in time, and noise and background illumination changes pose a challenge to most motion detection methods proposed in the literature. In this paper, we present a retinal motion detection method that is adaptive to background noise and illumination changes. Our experimental results indicate that this method is suitable for detecting retinal motion in infrared image sequences, and for compensating the detected motion, which is relevant in retinal laser treatment systems for DME. Copyright © 2013 Elsevier Ltd. All rights reserved.

  6. A Writing Intervention to Teach Simple Sentences and Descriptive Paragraphs to Adolescents with Writing Difficulties

    ERIC Educational Resources Information Center

    Datchuk, Shawn M.; Kubina, Richard M., Jr.

    2017-01-01

    The present study used a multiple-baseline, single-case experimental design to investigate the effects of a multicomponent intervention on construction of simple sentences and word sequences. The intervention entailed sequential delivery of sentence instruction and frequency building to a performance criterion and paragraph instruction.…

  7. 40 CFR 86.230-94 - Test sequence: general requirements.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... testing. (2) The ambient temperature reported shall be a simple average of the test cell temperatures... cell temperature shall be 20 °F±3 °F (−7 °C±1.7 °C) when measured in accordance with paragraph (e)(2... approximately level during all phases of the test sequence to prevent abnormal fuel distribution. (e) Engine...

  8. Inter-Simple Sequence Repeat Data Reveals High Genetic Diversity in Wild Populations of the Narrowly Distributed Endemic Lilium regale in the Minjiang River Valley of China

    PubMed Central

    Wu, Zhu-hua; Shi, Jisen; Xi, Meng-li; Jiang, Fu-xing; Deng, Ming-wen; Dayanandan, Selvadurai

    2015-01-01

    Lilium regale E.H. Wilson is endemic to a narrow geographic area in the Minjiang River valley in southwestern China, and is considered an important germplasm for breeding commercially valuable lily varieties, due to its vigorous growth, resistance to diseases and tolerance for low moisture. We analyzed the genetic diversity of eight populations of L. regale sampled across the entire natural distribution range of the species using Inter-Simple Sequence Repeat markers. The genetic diversity (expected heterozygosity = 0.3356) was higher than those reported for other narrowly distributed endemic plants. The levels of inbreeding (F_st = 0.1897) were low, and most of the genetic variability was found to be within (80.91%) rather than among populations (19.09%). An indirect estimate of historical levels of gene flow (N_m = 1.0678) indicated high levels of gene flow among populations. The eight analyzed populations clustered into three genetically distinct groups. Based on these results, we recommend conservation of large populations representing these three genetically distinct groups. PMID:25799495
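
    The abstract does not say how the indirect gene-flow estimate was obtained. As a worked check, the sketch below assumes Wright's island-model relation, N_m = (1 - F_st) / (4 F_st), which is a common choice for such estimates; the function name is hypothetical.

```python
def gene_flow_from_fst(fst):
    """Island-model estimate Nm = (1 - Fst) / (4 * Fst).

    Assumed formula: the abstract does not state which estimator was used.
    """
    return (1.0 - fst) / (4.0 * fst)


nm = gene_flow_from_fst(0.1897)  # F_st value quoted in the abstract
```

    With the quoted F_st, this relation gives a value very close to the reported N_m of 1.0678, which is consistent with the assumed formula but does not confirm it.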

  9. A simple model for strong ground motions and response spectra

    USGS Publications Warehouse

    Safak, Erdal; Mueller, Charles; Boatwright, John

    1988-01-01

    A simple model for the description of strong ground motions is introduced. The model shows that response spectra can be estimated by using only four parameters of the ground motion, the RMS acceleration, effective duration and two corner frequencies that characterize the effective frequency band of the motion. The model is windowed band-limited white noise, and is developed by studying the properties of two functions, cumulative squared acceleration in the time domain, and cumulative squared amplitude spectrum in the frequency domain. Applying the methods of random vibration theory, the model leads to a simple analytical expression for the response spectra. The accuracy of the model is checked by using the ground motion recordings from the aftershock sequences of two different earthquakes and simulated accelerograms. The results show that the model gives a satisfactory estimate of the response spectra.

  10. Forensic Loci Allele Database (FLAD): Automatically generated, permanent identifiers for sequenced forensic alleles.

    PubMed

    Van Neste, Christophe; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    It is difficult to predict if and when massively parallel sequencing of forensic STR loci will replace capillary electrophoresis as the new standard technology in forensic genetics. The main benefits of sequencing are increased multiplexing scales and SNP detection. There is not yet a consensus on how sequenced profiles should be reported. We present the Forensic Loci Allele Database (FLAD) service, made freely available on http://forensic.ugent.be/FLAD/. It offers permanent identifiers for sequenced forensic alleles (STR or SNP) and their microvariants for use in forensic allele nomenclature. Analogous to GenBank, its aim is to provide permanent identifiers for forensically relevant allele sequences. Researchers who are developing forensic sequencing kits or are performing population studies can register on http://forensic.ugent.be/FLAD/ and add loci and allele sequences with a short and simple application interface (API). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  11. Cassini Mission Sequence Subsystem (MSS)

    NASA Technical Reports Server (NTRS)

    Alland, Robert

    2011-01-01

    This paper describes my work with the Cassini Mission Sequence Subsystem (MSS) team during the summer of 2011. It gives some background on the motivation for this project and describes the expected benefit to the Cassini program. It then introduces the two tasks that I worked on - an automatic system auditing tool and a series of corrections to the Cassini Sequence Generator (SEQ_GEN) - and the specific objectives these tasks were to accomplish. Next, it details the approach I took to meet these objectives and the results of this approach, followed by a discussion of how the outcome of the project compares with my initial expectations. The paper concludes with a summary of my experience working on this project, lists what the next steps are, and acknowledges the help of my Cassini colleagues.

  12. Reduced 3,4'-bipyrazoles from a simple pyrazole precursor: synthetic sequence, molecular structures and supramolecular assembly.

    PubMed

    Cuartas, Viviana; Insuasty, Braulio; Cobo, Justo; Glidewell, Christopher

    2017-10-01

    The reaction of 5-chloro-3-methyl-1-phenyl-1H-pyrazole-4-carbaldehyde and N-benzylmethylamine under microwave irradiation gives 5-[benzyl(methyl)amino]-3-methyl-1-phenyl-1H-pyrazole-4-carbaldehyde, C19H19N3O, (I). Subsequent reactions under basic conditions, between (I) and a range of acetophenones, yield the corresponding chalcones. These undergo cyclocondensation reactions with hydrazine to produce reduced bipyrazoles which can be N-formylated with formic acid or N-acetylated with acetic anhydride. The structures of (I) and of representative examples from this reaction sequence are reported, namely the chalcone (E)-3-{5-[benzyl(methyl)amino]-3-methyl-1-phenyl-1H-pyrazol-4-yl}-1-(4-bromophenyl)prop-2-en-1-one, C27H24BrN3O, (II), the N-formyl derivative (3RS)-5'-[benzyl(methyl)amino]-3'-methyl-1',5-diphenyl-3,4-dihydro-1'H,2H-[3,4'-bipyrazole]-2-carbaldehyde, C28H27N5O, (III), and the N-acetyl derivative (3RS)-2-acetyl-5'-[benzyl(methyl)amino]-5-(4-methoxyphenyl)-3'-methyl-1'-phenyl-3,4-dihydro-1'H,2H-[3,4'-bipyrazole], which crystallizes as the ethanol 0.945-solvate, C30H31N5O2·0.945C2H6O, (IV). There is significant delocalization of charge from the benzyl(methyl)amino substituent onto the carbonyl group in (I), but not in (II). In each of (III) and (IV), the reduced pyrazole ring is modestly puckered into an envelope conformation. The molecules of (I) are linked by a combination of C-H...N and C-H...π(arene) hydrogen bonds to form a simple chain of rings; those of (III) are linked by a combination of C-H...O and C-H...N hydrogen bonds to form sheets of R 2 2 (8) and R 6 6 (42) rings, and those of (IV) are linked by a combination of O-H...N and C-H...O hydrogen bonds to form a ribbon of edge-fused R 2 4 (16) and R 4 4 (24) rings.

  13. Quantiprot - a Python package for quantitative analysis of protein sequences.

    PubMed

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties define a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where a large number of sequences generated by the model can be compared to actually observed sequences.
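
    Two of the measures named above, the n-gram distribution and a Zipf's law coefficient, can be sketched in plain Python. This does not use or reproduce the Quantiprot API: the function names, the toy sequence, and the choice of a least-squares slope on the log-log rank-frequency plot as the coefficient are assumptions for illustration only.

```python
import math
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count overlapping n-grams in a sequence string."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


protein = "MKVLAAGIVGLLLAQKVLAAGKVLMKVL"  # made-up toy sequence
bigrams = ngram_counts(protein, n=2)
slope = zipf_slope(bigrams)
```

    The slope is negative whenever the n-gram frequencies are not all equal; a value near -1 would correspond to the classic Zipf distribution.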

  14. Why barcode? High-throughput multiplex sequencing of mitochondrial genomes for molecular systematics.

    PubMed

    Timmermans, M J T N; Dodsworth, S; Culverwell, C L; Bocak, L; Ahrens, D; Littlewood, D T J; Pons, J; Vogler, A P

    2010-11-01

    Mitochondrial genome sequences are important markers for phylogenetics but taxon sampling remains sporadic because of the great effort and cost required to acquire full-length sequences. Here, we demonstrate a simple, cost-effective way to sequence the full complement of protein coding mitochondrial genes from pooled samples using the 454/Roche platform. Multiplexing was achieved without the need for expensive indexing tags ('barcodes'). The method was trialled with a set of long-range polymerase chain reaction (PCR) fragments from 30 species of Coleoptera (beetles) sequenced in a 1/16th sector of a sequencing plate. Long contigs were produced from the pooled sequences with sequencing depths ranging from ∼10 to 100× per contig. Species identity of individual contigs was established via three 'bait' sequences matching disparate parts of the mitochondrial genome obtained by conventional PCR and Sanger sequencing. This proved that assembly of contigs from the sequencing pool was correct. Our study produced sequences for 21 nearly complete and seven partial sets of protein coding mitochondrial genes. Combined with existing sequences for 25 taxa, an improved estimate of basal relationships in Coleoptera was obtained. The procedure could be employed routinely for mitochondrial genome sequencing at the species level, to provide improved species 'barcodes' that currently use the cox1 gene only.

  15. Sequence Similarity Presenter: a tool for the graphic display of similarities of long sequences for use in presentations.

    PubMed

    Fröhlich, K U

    1994-04-01

    A new method for the presentation of alignments of long sequences is described. The degree of identity for the aligned sequences is averaged for sections of a fixed number of residues. The resulting values are converted to shades of gray, with white corresponding to lack of identity and black corresponding to perfect identity. A sequence alignment is represented as a bar filled with varying shades of gray. The display is compact and allows for a fast and intuitive recognition of the distribution of regions with a high similarity. It is well suited for the presentation of alignments of long sequences, e.g. of protein superfamilies, in plenary lectures. The method is implemented as a HyperCard stack for Apple Macintosh computers. Several options for the modification of the output are available (e.g. background reduction, size of the summation window, consideration of amino acid similarity, inclusion of graphic markers to indicate specific domains). The output is a PostScript file which can be printed, imported as EPS or processed further with Adobe Illustrator.
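
    The windowed-identity display described above can be sketched in a few lines. This is not the original HyperCard implementation: the window size, the handling of gap characters, the 8-bit gray scale, and the function names are assumptions chosen for illustration.

```python
def window_identity(seq_a, seq_b, window=5):
    """Fraction of identical positions in each non-overlapping window."""
    fractions = []
    for start in range(0, len(seq_a), window):
        pairs = list(zip(seq_a[start:start + window],
                         seq_b[start:start + window]))
        # a gap aligned to anything is not counted as an identity
        same = sum(1 for a, b in pairs if a == b and a != "-")
        fractions.append(same / len(pairs))
    return fractions


def to_gray(fraction):
    """Map identity to an 8-bit gray level: 0.0 -> 255 (white), 1.0 -> 0 (black)."""
    return round(255 * (1.0 - fraction))


a = "MKVLAAGIVGLLLAQ"  # made-up aligned sequences of equal length
b = "MKVLSAGIVG-LLTE"
shades = [to_gray(f) for f in window_identity(a, b)]
```

    Each entry of `shades` would color one segment of the bar, so a fully conserved window appears black and a window with no identities appears white.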

  16. Tagmentation on Microbeads: Restore Long-Range DNA Sequence Information Using Next Generation Sequencing with Library Prepared by Surface-Immobilized Transposomes.

    PubMed

    Chen, He; Yao, Jiacheng; Fu, Yusi; Pang, Yuhong; Wang, Jianbin; Huang, Yanyi

    2018-04-11

    Next-generation sequencing (NGS) technologies have evolved rapidly and been applied to various research fields, but they often lose long-range information because of short library size and read length. Here, we develop a simple, cost-efficient, and versatile NGS library preparation method, called tagmentation on microbeads (TOM). This method is capable of recovering long-range information through tagmentation mediated by microbead-immobilized transposomes. Using transposomes with DNA barcodes to identically label adjacent sequences during tagmentation, we can restore the inter-read connections of each fragment from the original DNA molecule through fragment-barcode linkage after sequencing. In our proof-of-principle experiment, more than 4.5% of the reads are linked with their adjacent reads, and the longest linkage is over 1112 bp. We demonstrate TOM with eight barcodes, but the number of barcodes can be scaled up by an ultrahigh complexity construction. We also show that this method has low amplification bias and is well suited to identifying copy number variations.

  17. Climatic influence of background and volcanic stratosphere aerosol models

    NASA Technical Reports Server (NTRS)

    Deschamps, P. Y.; Herman, M.; Lenoble, J.; Tanre, D.

    1982-01-01

    A simple model of the earth-atmosphere system including tropospheric and stratospheric aerosols has been derived and tested. Analytical expressions are obtained for the albedo variation due to a thin stratospheric aerosol layer. Also outlined are the physical procedures and the respective influence of the main parameters: aerosol optical thickness, single scattering albedo and asymmetry factor, and sublayer albedo. The method is applied to compute the variation of the zonal and planetary albedos due to a stratospheric layer of background H2SO4 particles and of volcanic ash.

  18. SSR_pipeline--computer software for the identification of microsatellite sequences from paired-end Illumina high-throughput DNA sequence data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (SSRs; for example, microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains three analysis modules along with a fourth control module that can be used to automate analyses of large volumes of data. The modules are used to (1) identify the subset of paired-end sequences that pass quality standards, (2) align paired-end reads into a single composite DNA sequence, and (3) identify sequences that possess microsatellites conforming to user specified parameters. Each of the three separate analysis modules also can be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc). All modules are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, Windows). The program suite relies on a compiled Python extension module to perform paired-end alignments. Instructions for compiling the extension from source code are provided in the documentation. Users who do not have Python installed on their computers or who do not have the ability to compile software also may choose to download packaged executable files. These files include all Python scripts, a copy of the compiled extension module, and a minimal installation of Python in a single binary executable. See program documentation for more information.
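
    To illustrate what identifying microsatellites "conforming to user specified parameters" can look like, here is a minimal regular-expression sketch. It is not SSR_pipeline's code or interface; the function `find_ssrs`, its parameter names, and its defaults are hypothetical, and it only handles perfect (uninterrupted) repeats.

```python
import re

def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    # Lazy motif group followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(sequence.upper()):
        motif = m.group(1)
        # Assumption: runs of a single base (e.g. motif "AA") are not reported.
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits
```

    For example, `find_ssrs("GGACACACACACTT")` reports one dinucleotide repeat of motif `AC` starting at offset 2.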

  19. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  20. High-throughput sequence alignment using Graphics Processing Units

    PubMed Central

    Schatz, Michael C; Trapnell, Cole; Delcher, Arthur L; Varshney, Amitabh

    2007-01-01

    Background: The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results: This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion: MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU. PMID:18070356

  1. Riboswitch-based sensor in low optical background

    NASA Astrophysics Data System (ADS)

    Harbaugh, Svetlana V.; Davidson, Molly E.; Chushak, Yaroslav G.; Kelley-Loughnane, Nancy; Stone, Morley O.

    2008-08-01

    Riboswitches are a type of natural genetic control element that use untranslated sequence in the RNA to recognize and bind to small molecules that regulate expression of that gene. Creation of synthetic riboswitches to novel ligands depends on the ability to screen for analyte binding sensitivity and specificity. In our work, we have coupled a synthetic riboswitch to an optical reporter assay based on fluorescence resonance energy transfer (FRET) between two genetically-coded fluorescent proteins. Specifically, a theophylline-sensitive riboswitch was placed upstream of the Tobacco Etch Virus (TEV) protease coding sequence, and a FRET-based construct, BFP-eGFP or eGFP-REACh, was linked by a peptide encoding the recognition sequence for TEV protease. Cells expressing the riboswitch showed a marked optical difference in fluorescence emission in the presence of theophylline. However, the BFP-eGFP FRET pair possesses significant optical background that reduces the sensitivity of a FRET-based assay. To improve the optical assay, we designed a nonfluorescent yellow fluorescent protein (YFP) mutant called REACh (for Resonance Energy-Accepting Chromoprotein) as the FRET acceptor for eGFP. The advantage of using an eGFP-REACh pair is the elimination of acceptor fluorescence, which leads to improved detection of FRET via a better signal-to-noise ratio. The eGFP-REACh fusion protein was constructed with the TEV protease cleavage site; thus, upon TEV translation, cleavage occurs, diminishing REACh quenching and increasing eGFP emission, resulting in a 4.5-fold improvement in assay sensitivity.

  2. Transcriptome analysis by strand-specific sequencing of complementary DNA

    PubMed Central

    Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

    2009-01-01

    High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212

  3. Transcriptome analysis by strand-specific sequencing of complementary DNA.

    PubMed

    Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

    2009-10-01

    High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.

  4. Parallel sequencing lives, or what makes large sequencing projects successful

    PubMed Central

    Cuartero, Yasmina; Stadhouders, Ralph; Graf, Thomas; Marti-Renom, Marc A; Beato, Miguel

    2017-01-01

    T47D_rep2 and b1913e6c1_51720e9cf were 2 Hi-C samples. They were born and processed at the same time, yet their fates were very different. The life of b1913e6c1_51720e9cf was simple and fruitful, while that of T47D_rep2 was full of accidents and sorrow. At the heart of these differences lies the fact that b1913e6c1_51720e9cf was born under a lab culture of Documentation, Automation, Traceability, and Autonomy and compliance with the FAIR Principles. Their lives are a lesson for those who wish to embark on the journey of managing high-throughput sequencing data. PMID:29048533

  5. Parallel sequencing lives, or what makes large sequencing projects successful.

    PubMed

    Quilez, Javier; Vidal, Enrique; Dily, François Le; Serra, François; Cuartero, Yasmina; Stadhouders, Ralph; Graf, Thomas; Marti-Renom, Marc A; Beato, Miguel; Filion, Guillaume

    2017-11-01

    T47D_rep2 and b1913e6c1_51720e9cf were 2 Hi-C samples. They were born and processed at the same time, yet their fates were very different. The life of b1913e6c1_51720e9cf was simple and fruitful, while that of T47D_rep2 was full of accidents and sorrow. At the heart of these differences lies the fact that b1913e6c1_51720e9cf was born under a lab culture of Documentation, Automation, Traceability, and Autonomy and compliance with the FAIR Principles. Their lives are a lesson for those who wish to embark on the journey of managing high-throughput sequencing data. © The Author 2017. Published by Oxford University Press.

  6. Simple methods for the 3' biotinylation of RNA.

    PubMed

    Moritz, Bodo; Wahle, Elmar

    2014-03-01

    Biotinylation of RNA allows its tight coupling to streptavidin and is thus useful for many types of experiments, e.g., pull-downs. Here we describe three simple techniques for biotinylating the 3' ends of RNA molecules generated by chemical or enzymatic synthesis. First, extension with either the Schizosaccharomyces pombe noncanonical poly(A) polymerase Cid1 or Escherichia coli poly(A) polymerase and N6-biotin-ATP is simple, efficient, and generally applicable independently of the 3'-end sequences of the RNA molecule to be labeled. However, depending on the enzyme and the reaction conditions, several or many biotinylated nucleotides are incorporated. Second, conditions are reported under which splint-dependent ligation by T4 DNA ligase can be used to join biotinylated and, presumably, other chemically modified DNA oligonucleotides to RNA 3' ends even if these are heterogeneous as is typical for products of enzymatic synthesis. Third, we describe the use of φ29 DNA polymerase for a template-directed fill-in reaction that uses biotin-dUTP and, thanks to the enzyme's proofreading activity, can cope with more extended 3' heterogeneities.

  7. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing.

    PubMed

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-02-01

    A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
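
    As a rough illustration of the Bland-Altman part of this comparison, the sketch below computes the mean difference (an estimate of constant error) and the conventional 95% limits of agreement for paired measurements. The function name `bland_altman` and the use of 1.96 standard deviations are assumptions for illustration; the abstract does not specify the authors' exact computation, and detecting proportional error would additionally require relating the differences to the pair means.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)   # mean difference: estimate of constant error
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

    A bias far from zero suggests a constant offset between the two assays, while wide limits indicate poor agreement even when R² is high.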

  8. Genetic diversity and population structure analysis in Perilla frutescens from Northern areas of China based on simple sequence repeats.

    PubMed

    Ma, S J; Sa, K J; Hong, T K; Lee, J K

    2017-09-21

    In this study, 21 simple sequence repeat (SSR) markers were used to evaluate the genetic diversity and population structure among 77 Perilla accessions from high-latitude and middle-latitude areas of China. Ninety-five alleles were identified with an average of 4.52 alleles per locus. The average polymorphic information content (PIC) and genetic diversity values were 0.346 and 0.372, respectively. The level of genetic diversity and PIC value for cultivated accessions of Perilla frutescens var. frutescens from middle-latitude areas were higher than those for accessions from high-latitude areas. Based on the dendrogram of unweighted pair group method with arithmetic mean (UPGMA), all accessions were classified into four major groups with a genetic similarity of 46%. All accessions of the cultivated var. frutescens were discriminated from the cultivated P. frutescens var. crispa. Furthermore, most accessions of the cultivated var. frutescens collected in high-latitude and middle-latitude areas were distinguished depending on their geographical location. However, the geographical locations of several accessions of the cultivated var. frutescens have no relation with their positions in the UPGMA dendrogram and population structure. This result implies that the diffusion of accessions of the cultivated Perilla crop in the northern areas of China might be through multiple routes. In the population structure analysis, 77 Perilla accessions were divided into Group I, Group II, and an admixed group based on a membership probability threshold of 0.8. Finally, the findings in this study can provide useful theoretical knowledge for further study on the population structure and genetic diversity of Perilla and can benefit Perilla crop breeding and germplasm conservation.
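
    For readers unfamiliar with the two summary statistics reported here, the following sketch shows the commonly used per-locus definitions of polymorphic information content and gene diversity computed from allele frequencies. The abstract does not state which formulas or software the authors used, so the formulas and the function names `pic` and `gene_diversity` are assumptions for illustration only.

```python
def gene_diversity(allele_freqs):
    """Expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)

def pic(allele_freqs):
    """Polymorphic information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    cross = 0.0
    n = len(allele_freqs)
    for i in range(n - 1):
        for j in range(i + 1, n):
            cross += 2.0 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
    return gene_diversity(allele_freqs) - cross
```

    Under these definitions PIC never exceeds gene diversity at the same locus, which matches the ordering of the averages reported in the abstract (0.346 versus 0.372).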

  9. Preschool-aged children have difficulty constructing and interpreting simple utterances composed of graphic symbols.

    PubMed

    Sutton, Ann; Trudeau, Natacha; Morford, Jill; Rios, Monica; Poirier, Marie-Andrée

    2010-01-01

    Children who require augmentative and alternative communication (AAC) systems while they are in the process of acquiring language face unique challenges because they use graphic symbols for communication. In contrast to the situation of typically developing children, they use different modalities for comprehension (auditory) and expression (visual). This study explored the ability of three- and four-year-old children without disabilities to perform tasks involving sequences of graphic symbols. Thirty participants were asked to transpose spoken simple sentences into graphic symbols by selecting individual symbols corresponding to the spoken words, and to interpret graphic symbol utterances by selecting one of four photographs corresponding to a sequence of three graphic symbols. The results showed that these were not simple tasks for the participants, and few of them performed in the expected manner - only one in transposition, and only one-third of participants in interpretation. Individual response strategies in some cases led to contrasting response patterns. Children at this age level have not yet developed the skills required to deal with graphic symbols even though they have mastered the corresponding spoken language structures.

  10. Using Graphical Notations to Assess Children's Experiencing of Simple and Complex Musical Fragments

    ERIC Educational Resources Information Center

    Verschaffel, Lieven; Reybrouck, Mark; Janssens, Marjan; Van Dooren, Wim

    2010-01-01

    The aim of this study was to analyze children's graphical notations as external representations of their experiencing when listening to simple sonic stimuli and complex musical fragments. More specifically, we assessed the impact of four factors on children's notations: age, musical background, complexity of the fragment, and most salient…

  11. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    PubMed Central

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  12. Quantitative comparison between a multiecho sequence and a single-echo sequence for susceptibility-weighted phase imaging.

    PubMed

    Gilbert, Guillaume; Savard, Geneviève; Bard, Céline; Beaudoin, Gilles

    2012-06-01

    The aim of this study was to investigate the benefits arising from the use of a multiecho sequence for susceptibility-weighted phase imaging using a quantitative comparison with a standard single-echo acquisition. Four healthy adult volunteers were imaged on a clinical 3-T system using a protocol comprising two different three-dimensional susceptibility-weighted gradient-echo sequences: a standard single-echo sequence and a multiecho sequence. Both sequences were repeated twice in order to evaluate the local noise contribution by a subtraction of the two acquisitions. For the multiecho sequence, the phase information from each echo was independently unwrapped, and the background field contribution was removed using either homodyne filtering or the projection onto dipole fields method. The phase information from all echoes was then combined using a weighted linear regression. R2 maps were also calculated from the multiecho acquisitions. The noise standard deviation in the reconstructed phase images was evaluated for six manually segmented regions of interest (frontal white matter, posterior white matter, globus pallidus, putamen, caudate nucleus and lateral ventricle). The use of the multiecho sequence for susceptibility-weighted phase imaging led to a reduction of the noise standard deviation for all subjects and all regions of interest investigated in comparison to the reference single-echo acquisition. On average, the noise reduction ranged from 18.4% for the globus pallidus to 47.9% for the lateral ventricle. In addition, the amount of noise reduction was found to be strongly inversely correlated to the estimated R2 value (R=-0.92). In conclusion, the use of a multiecho sequence is an effective way to decrease the noise contribution in susceptibility-weighted phase images, while preserving both contrast and acquisition time. The proposed approach additionally permits the calculation of R2 maps. Copyright © 2012 Elsevier Inc. All rights reserved.

  13. Escherichia coli promoter sequences predict in vitro RNA polymerase selectivity.

    PubMed

    Mulligan, M E; Hawley, D K; Entriken, R; McClure, W R

    1984-01-11

    We describe a simple algorithm for computing a homology score for Escherichia coli promoters based on DNA sequence alone. The homology score was related to 31 values, measured in vitro, of RNA polymerase selectivity, which we define as the product K_B k_2, the apparent second order rate constant for open complex formation. We found that promoter strength could be predicted to within a factor of ±4.1 in K_B k_2 over a range of 10^4 in the same parameter. The quantitative evaluation was linked to an automated (Apple II) procedure for searching and evaluating possible promoters in DNA sequence files.

  14. On the normalization of the minimum free energy of RNAs by sequence length.

    PubMed

    Trotta, Edoardo

    2014-01-01

    The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.

  15. Flow cytometry for enrichment and titration in massively parallel DNA sequencing

    PubMed Central

    Sandberg, Julia; Ståhl, Patrik L.; Ahmadian, Afshin; Bjursell, Magnus K.; Lundeberg, Joakim

    2009-01-01

    Massively parallel DNA sequencing is revolutionizing genomics research throughout the life sciences. However, the reagent costs and labor requirements in current sequencing protocols are still substantial, although improvements are continuously being made. Here, we demonstrate an effective alternative to existing sample titration protocols for the Roche/454 system using Fluorescence Activated Cell Sorting (FACS) technology to determine the optimal DNA-to-bead ratio prior to large-scale sequencing. Our method, which eliminates the need for the costly pilot sequencing of samples during titration, is capable of rapidly providing accurate DNA-to-bead ratios that are not biased by the quantification and sedimentation steps included in current protocols. Moreover, we demonstrate that FACS sorting can be readily used to highly enrich fractions of beads carrying template DNA, with near total elimination of empty beads and no downstream sacrifice of DNA sequencing quality. Automated enrichment by FACS is a simple approach to obtain pure samples for bead-based sequencing systems, and offers an efficient, low-cost alternative to current enrichment protocols. PMID:19304748

  16. Long-Term Predictive and Feedback Encoding of Motor Signals in the Simple Spike Discharge of Purkinje Cells

    PubMed Central

    Popa, Laurentiu S.; Streng, Martha L.

    2017-01-01

    Most hypotheses of cerebellar function emphasize a role in real-time control of movements. However, the cerebellum’s use of current information to adjust future movements and its involvement in sequencing, working memory, and attention argue for predicting and maintaining information over extended time windows. The present study examines the time course of Purkinje cell discharge modulation in the monkey (Macaca mulatta) during manual, pseudo-random tracking. Analysis of the simple spike firing from 183 Purkinje cells during tracking reveals modulation up to 2 s before and after kinematics and position error. Modulation significance was assessed against trial shuffled firing, which decoupled simple spike activity from behavior and abolished long-range encoding while preserving data statistics. Position, velocity, and position errors have the most frequent and strongest long-range feedforward and feedback modulations, with less common, weaker long-term correlations for speed and radial error. Position, velocity, and position errors can be decoded from the population simple spike firing with considerable accuracy for even the longest predictive (-2000 to -1500 ms) and feedback (1500 to 2000 ms) epochs. Separate analysis of the simple spike firing in the initial hold period preceding tracking shows similar long-range feedforward encoding of the upcoming movement and in the final hold period feedback encoding of the just completed movement, respectively. Complex spike analysis reveals little long-term modulation with behavior. We conclude that Purkinje cell simple spike discharge includes short- and long-range representations of both upcoming and preceding behavior that could underlie cerebellar involvement in error correction, working memory, and sequencing. PMID:28413823

  17. Using SQL Databases for Sequence Similarity Searching and Analysis.

    PubMed

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
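
    The idea of loading similarity search results into a relational database and summarizing them by taxon can be sketched with Python's built-in sqlite3 module. The table `hit`, its columns, the function `queries_with_hits_by_taxon`, and the E-value cutoff below are hypothetical and are not the schema of seqdb_demo or search_demo, which the abstract does not describe.

```python
import sqlite3

def queries_with_hits_by_taxon(rows, max_evalue=1e-6):
    """Count distinct query proteins with a significant hit in each taxon.

    rows: iterable of (query, subject, taxon, evalue) tuples.
    """
    con = sqlite3.connect(":memory:")
    # Hypothetical one-table schema for search results.
    con.execute(
        "CREATE TABLE hit (query TEXT, subject TEXT, taxon TEXT, evalue REAL)"
    )
    con.executemany("INSERT INTO hit VALUES (?, ?, ?, ?)", rows)
    cur = con.execute(
        "SELECT taxon, COUNT(DISTINCT query) FROM hit "
        "WHERE evalue <= ? GROUP BY taxon ORDER BY taxon",
        (max_evalue,),
    )
    result = cur.fetchall()
    con.close()
    return result
```

    Once the results are in a table, a summary such as "how many query proteins have a significant homolog in each kingdom" becomes a single grouped query instead of a custom parsing script.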

  18. A simple second-order digital phase-locked loop.

    NASA Technical Reports Server (NTRS)

    Tegnelia, C. R.

    1972-01-01

    A simple second-order digital phase-locked loop has been designed for the Viking Orbiter 1975 command system. Excluding analog-to-digital conversion, implementation of the loop requires only an adder/subtractor, two registers, and a correctable counter with control logic. The loop considers only the polarity of phase error and corrects system clocks according to a filtered sequence of this polarity. The loop is insensitive to input gain variation, and therefore offers the advantage of stable performance over long life. Predictable performance is guaranteed by extreme reliability of acquisition, yet in the steady state the loop produces only a slight degradation with respect to analog loop performance.
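
    The idea of a loop that looks only at the polarity of the phase error can be sketched with a small simulation. This is a generic second-order sign-driven loop written for illustration, not the Viking Orbiter design; the gains k1 and k2, the frequency offset, and the starting phase are assumed values, and the function name is hypothetical.

```python
import math


def run_sign_loop(gain, steps=5000, k1=0.01, k2=0.0001,
                  freq_offset=0.002, phase0=0.3):
    """Track an input phase ramp using only the sign of the phase detector.

    Returns the final phase error and the loop's frequency estimate.
    """
    theta = phase0  # input phase (radians)
    phi = 0.0       # local clock phase
    w = 0.0         # frequency estimate (second-order state)
    for _ in range(steps):
        # The detector output scales with the input gain, but only its
        # polarity is used, so the gain has no effect on the corrections.
        detector = gain * math.sin(theta - phi)
        s = 1.0 if detector >= 0.0 else -1.0
        w += k2 * s           # integral path: accumulate polarity
        phi += w + k1 * s     # proportional path plus frequency estimate
        theta += freq_offset  # input advances by a constant frequency offset
    return theta - phi, w
```

    Because every correction depends on polarity alone, running the sketch with gain=1.0 and gain=100.0 yields identical trajectories, which mirrors the insensitivity to input gain variation described in the abstract.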

  19. Fluorogenic DNA Sequencing in PDMS Microreactors

    PubMed Central

    Sims, Peter A.; Greenleaf, William J.; Duan, Haifeng; Xie, X. Sunney

    2012-01-01

    We have developed a multiplex sequencing-by-synthesis method combining terminal-phosphate labeled fluorogenic nucleotides (TPLFNs) and resealable microreactors. In the presence of phosphatase, the incorporation of a non-fluorescent TPLFN into a DNA primer by DNA polymerase results in a fluorophore. We immobilize DNA templates within polydimethylsiloxane (PDMS) microreactors, sequentially introduce one of the four identically labeled TPLFNs, seal the microreactors, allow template-directed TPLFN incorporation, and measure the signal from the fluorophores trapped in the microreactors. This workflow allows sequencing in a manner akin to pyrosequencing but without constant monitoring of each microreactor. With cycle times of <10 minutes, we demonstrate 30 base reads with ∼99% raw accuracy. “Fluorogenic pyrosequencing” combines benefits of pyrosequencing, such as rapid turn-around, native DNA generation, and single-color detection, with benefits of fluorescence-based approaches, such as highly sensitive detection and simple parallelization. PMID:21666670

  20. An Isotopic Dilution Experiment Using Liquid Scintillation: A Simple Two-System, Two-Phase Analysis.

    ERIC Educational Resources Information Center

    Moehs, Peter J.; Levine, Samuel

    1982-01-01

    A simple isotopic dilution analysis whose principles apply to methods of more complex radioanalyses is described. The experiment is suitable for clinical and instrumental analysis chemistry students; experimental manipulations are kept to a minimum, involving only aqueous extraction before counting. Background information, procedures, and results are discussed.…

  1. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).

  2. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data.

    PubMed

    Miller, Mark P; Knaus, Brian J; Mullins, Thomas D; Haig, Susan M

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25 bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
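
    The third step, finding microsatellites that conform to user-specified parameters, can be illustrated with a short regular-expression search. This is a minimal sketch and not the SSR_pipeline algorithm; the function names and the default parameters (motif lengths of 2 to 6 bp, at least 3 repeats) are assumptions made for the example.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    return all(motif != motif[:d] * (n // d) for d in range(1, n) if n % d == 0)


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=3):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-bp motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as ATAT that would duplicate a shorter-motif hit.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

    For a read built as "TTGC" + "AT" * 5 + "GGCC" + "CAG" * 4 + "TTA", the sketch reports an AT repeat at position 4 and a CAG repeat at position 18. It handles perfect simple repeats only; compound or interrupted repeats would need extra logic.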

  3. Solving Assembly Sequence Planning using Angle Modulated Simulated Kalman Filter

    NASA Astrophysics Data System (ADS)

    Mustapa, Ainizar; Yusof, Zulkifli Md.; Adam, Asrul; Muhammad, Badaruddin; Ibrahim, Zuwairie

    2018-03-01

    This paper presents an implementation of the Simulated Kalman Filter (SKF) algorithm for optimizing an Assembly Sequence Planning (ASP) problem. The SKF search strategy contains three simple steps: predict, measure, and estimate. The main objective of ASP is to determine the sequence of component installation that shortens assembly time or saves assembly costs. Initially, a permutation sequence is generated to represent each agent. Each agent is then subjected to a precedence matrix constraint to produce a feasible assembly sequence. Next, the Angle Modulated SKF (AMSKF) is proposed for solving the ASP problem. The main idea of the angle modulated approach to solving combinatorial optimization problems is to use a function, g(x), to create a continuous signal. The performance of the proposed AMSKF is compared against previous works that solved ASP by applying BGSA, BPSO, and MSPSO. Using a case study of ASP, the results show that AMSKF outperformed all the algorithms in obtaining the best solution.
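
    The abstract does not give g(x). As a sketch, the code below assumes the form commonly used in angle-modulated optimizers, g(x) = sin(2π(x − a) · b · cos(2π(x − a) · c)) + d, where the optimizer searches over the four continuous coefficients a, b, c, d and the signal is sampled at integer x to produce bits. The function name is hypothetical, and how the bits would be decoded into an assembly sequence is not stated in the abstract and is not modeled here.

```python
import math


def angle_modulated_bits(a, b, c, d, n_bits):
    """Sample an assumed angle-modulation function and threshold it into bits."""
    bits = []
    for x in range(n_bits):
        # Assumed generating function; d shifts the signal vertically.
        g = math.sin(2 * math.pi * (x - a) * b
                     * math.cos(2 * math.pi * (x - a) * c)) + d
        bits.append(1 if g > 0 else 0)
    return bits
```

    The appeal of this encoding is that a continuous optimizer only has to search four real-valued coefficients, regardless of how many bits the combinatorial solution needs.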

  4. Development of chromosome-specific markers with high polymorphism for allotetraploid cotton based on genome-wide characterization of simple sequence repeats in diploid cottons (Gossypium arboreum L. and Gossypium raimondii Ulbrich).

    PubMed

    Lu, Cairui; Zou, Changsong; Zhang, Youping; Yu, Daoqian; Cheng, Hailiang; Jiang, Pengfei; Yang, Wencui; Wang, Qiaolian; Feng, Xiaoxu; Prosper, Mtawa Andrew; Guo, Xiaoping; Song, Guoli

    2015-02-06

    Tetraploid cotton contains two sets of homologous chromosomes, the At- and Dt-subgenomes. Consequently, many markers in cotton were mapped to multiple positions during linkage genetic map construction, posing a challenge to anchoring linkage groups and mapping economically-important genes to particular chromosomes. Chromosome-specific markers could solve this problem. Recently, the genomes of two diploid species were sequenced whose progenitors were putative contributors of the At- and Dt-subgenomes to tetraploid cotton. These sequences provide a powerful tool for developing chromosome-specific markers given the high level of synteny among tetraploid and diploid cotton genomes. In this study, simple sequence repeats (SSRs) on each chromosome in the two diploid genomes were characterized. Chromosome-specific SSRs were developed by comparative analysis and proved to distinguish chromosomes. A total of 200,744 and 142,409 SSRs were detected on the 13 chromosomes of Gossypium arboreum L. and Gossypium raimondii Ulbrich, respectively. Chromosome-specific SSRs were obtained by comparing SSR flanking sequences from each chromosome with those from the other 25 chromosomes. The average was 7,996 per chromosome. To confirm their chromosome specificity, these SSRs were used to distinguish two homologous chromosomes in tetraploid cotton through linkage group construction. The chromosome-specific SSRs and previously-reported chromosome markers were grouped together, and no marker mapped to another homologous chromosome, proving that the chromosome-specific SSRs were unique and could distinguish homologous chromosomes in tetraploid cotton. Because longer dinucleotide AT-rich repeats were the most polymorphic in previous reports, the SSRs on each chromosome were sorted by motif type and repeat length for convenient selection. The primer sequences of all chromosome-specific SSRs were also made publicly available. Chromosome-specific SSRs are efficient tools for chromosome

  5. PCR-based approach to SINE isolation: simple and complex SINEs.

    PubMed

    Borodulina, Olga R; Kramerov, Dmitri A

    2005-04-11

    Highly repeated copies of short interspersed elements (SINEs) occur in eukaryotic genomes. The distribution of each SINE family is usually restricted to some genera, families, or orders. SINEs have an RNA polymerase III internal promoter, which is composed of boxes A and B. Here we propose a method for isolation of novel SINE families based on genomic DNA PCR with oligonucleotide identical to box A as a primer. Cloning of the size-heterogeneous PCR-products and sequencing of their terminal regions allow determination of SINE structure. Using this approach, two novel SINE families, Rhin-1 and Das-1, from the genomes of great horseshoe bat (Rhinolophus ferrumequinum) and nine-banded armadillo (Dasypus novemcinctus), respectively, were isolated and studied. The distribution of Rhin-1 is restricted to two of six bat families tested. Copies of this SINE are characterized by frequent internal insertions and significant length (200-270 bp). Das-1 being only 90 bp in length is one of the shortest SINEs known. Most of Das-1 nucleotide sequences demonstrate significant similarity to alanine tRNA which appears to be an evolutionary progenitor of this SINE. Together with three other known SINEs (ID, Vic-1, and CYN), Das-1 constitutes a group of simple SINEs. Interestingly, three SINE families of this group are alanine tRNA-derived. Most probably, this tRNA gave rise to short and simple but successful SINEs several times during mammalian evolution.

  6. Population structure of rice varieties used in Turkish rice breeding programs determined using simple-sequence repeat and inter-primer binding site-retrotransposon data.

    PubMed

    Cömertpay, G; Baloch, F S; Derya, M; Andeden, E E; Alsaleh, A; Sürek, H; Özkan, H

    2016-02-19

    Effective breeding programs based on genetic diversity are needed to broaden the genetic basis of rice (Oryza sativa L.) in Turkey. In this study, 81 commercial varieties from seven countries were studied in order to estimate the genomic relationships among them using nine inter-primer binding site (iPBS)-retrotransposon and 17 simple-sequence repeat (SSR) markers. A total of 59 alleles for the SSR markers and 96 bands for the iPBS-retrotransposon markers were detected, with an average of 3.47 and 10.6 per locus, respectively. Each of the varieties could be unequivocally identified by the SSR and iPBS-retrotransposon profiles. The iPBS-retrotransposon- and SSR-based clustering were identical and closely mirrored each other, with a significantly high correlation (r = 0.73). A neighbor-joining cluster based on the combined SSR and iPBS-retrotransposon data divided the rice varieties into three clusters. The population structure was determined using the STRUCTURE software, and three populations (K = 3) were identified among the varieties studied, showing that the diversity harbored by Turkish rice varieties is low. The results indicate that iPBS-retrotransposon markers are a very powerful technique to determine the genetic diversity of rice varieties.

  7. Genetic analysis and association of simple sequence repeat markers with storage root yield, dry matter, starch and β-carotene content in sweetpotato.

    PubMed

    Yada, Benard; Brown-Guedira, Gina; Alajo, Agnes; Ssemakula, Gorrettie N; Owusu-Mensah, Eric; Carey, Edward E; Mwanga, Robert O M; Yencho, G Craig

    2017-03-01

    Molecular markers are needed for enhancing the development of elite sweetpotato (Ipomoea batatas (L.) Lam) cultivars with a wide range of commercially important traits in sub-Saharan Africa. This study was conducted to estimate the heritability and determine trait correlations of storage root yield, dry matter, starch and β-carotene content in a cross between 'New Kawogo' × 'Beauregard'. The study was also conducted to identify simple sequence repeat (SSR) markers associated with these traits. A total of 287 progeny and the parents were evaluated for two seasons at three sites in Uganda and genotyped with 250 SSR markers. Broad sense heritability (H²) for storage root yield, dry matter, starch and β-carotene content were 0.24, 0.68, 0.70 and 0.90, respectively. Storage root β-carotene content was negatively correlated with dry matter (r = -0.59, P < 0.001) and starch (r = -0.93, P < 0.001) content, while storage root yield was positively correlated with dry matter (r = 0.57, P = 0.029) and starch (r = 0.41, P = 0.008) content. Through logistic regression, a total of 12, 4, 6 and 8 SSR markers were associated with storage root yield, dry matter, starch and β-carotene content, respectively. The SSR markers used in this study may be useful for quantitative trait loci analysis and selection for these traits in future.

  8. Genetic background effects in quantitative genetics: gene-by-system interactions.

    PubMed

    Sardi, Maria; Gasch, Audrey P

    2018-04-11

    Proper cell function depends on networks of proteins that interact physically and functionally to carry out physiological processes. Thus, it seems logical that the impact of sequence variation in one protein could be significantly influenced by genetic variants at other loci in a genome. Nonetheless, the importance of such genetic interactions, known as epistasis, in explaining phenotypic variation remains a matter of debate in genetics. Recent work from our lab revealed that genes implicated from an association study of toxin tolerance in Saccharomyces cerevisiae show extensive interactions with the genetic background: most implicated genes, regardless of allele, are important for toxin tolerance in only one of two tested strains. The prevalence of background effects in our study adds to other reports of widespread genetic-background interactions in model organisms. We suggest that these effects represent many-way interactions with myriad features of the cellular system that vary across classes of individuals. Such gene-by-system interactions may influence diverse traits and require new modeling approaches to accurately represent genotype-phenotype relationships across individuals.

  9. Automatic seed selection for segmentation of liver cirrhosis in laparoscopic sequences

    NASA Astrophysics Data System (ADS)

    Sinha, Rahul; Marcinczak, Jan Marek; Grigat, Rolf-Rainer

    2014-03-01

    For computer aided diagnosis based on laparoscopic sequences, image segmentation is one of the basic steps which define the success of all further processing. However, many image segmentation algorithms require prior knowledge which is given by interaction with the clinician. We propose an automatic seed selection algorithm for segmentation of liver cirrhosis in laparoscopic sequences which assigns each pixel a probability of being cirrhotic liver tissue or background tissue. Our approach is based on a trained classifier using SIFT and RGB features with PCA. Due to the unique illumination conditions in laparoscopic sequences of the liver, a very low dimensional feature space can be used for classification via logistic regression. The methodology is evaluated on 718 cirrhotic liver and background patches that are taken from laparoscopic sequences of 7 patients. Using a linear classifier we achieve a precision of 91% in a leave-one-patient-out cross-validation. Furthermore, we demonstrate that with logistic probability estimates, seeds with high certainty of being cirrhotic liver tissue can be obtained. For example, our precision of liver seeds increases to 98.5% if only seeds with more than 95% probability of being liver are used. Finally, these automatically selected seeds can be used as priors in Graph Cuts which is demonstrated in this paper.

  10. A coarse-grained biophysical model of sequence evolution and the population size dependence of the speciation rate

    PubMed Central

    Khatri, Bhavin S.; Goldstein, Richard A.

    2015-01-01

    Speciation is fundamental to understanding the huge diversity of life on Earth. Although still controversial, empirical evidence suggests that the rate of speciation is larger for smaller populations. Here, we explore a biophysical model of speciation by developing a simple coarse-grained theory of transcription factor-DNA binding and how their co-evolution in two geographically isolated lineages leads to incompatibilities. To develop a tractable analytical theory, we derive a Smoluchowski equation for the dynamics of binding energy evolution that accounts for the fact that natural selection acts on phenotypes, but variation arises from mutations in sequences; the Smoluchowski equation includes selection due to both gradients in fitness and gradients in sequence entropy, which is the logarithm of the number of sequences that correspond to a particular binding energy. This simple consideration predicts that smaller populations develop incompatibilities more quickly in the weak mutation regime; this trend arises as sequence entropy poises smaller populations closer to incompatible regions of phenotype space. These results suggest a generic coarse-grained approach to evolutionary stochastic dynamics, allowing realistic modelling at the phenotypic level. PMID:25936759

  11. Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation.

    PubMed

    Ralph, Duncan K; Matsen, Frederick A

    2016-01-01

    VDJ rearrangement and somatic hypermutation work together to produce antibody-coding B cell receptor (BCR) sequences for a remarkable diversity of antigens. It is now possible to sequence these BCRs in high throughput; analysis of these sequences is bringing new insight into how antibodies develop, in particular for broadly-neutralizing antibodies against HIV and influenza. A fundamental step in such sequence analysis is to annotate each base as coming from a specific one of the V, D, or J genes, or from an N-addition (a.k.a. non-templated insertion). Previous work has used simple parametric distributions to model transitions from state to state in a hidden Markov model (HMM) of VDJ recombination, and assumed that mutations occur via the same process across sites. However, codon frame and other effects have been observed to violate these parametric assumptions for such coding sequences, suggesting that a non-parametric approach to modeling the recombination process could be useful. In our paper, we find that indeed large modern data sets suggest a model using parameter-rich per-allele categorical distributions for HMM transition probabilities and per-allele-per-position mutation probabilities, and that using such a model for inference leads to significantly improved results. We present an accurate and efficient BCR sequence annotation software package using a novel HMM "factorization" strategy. This package, called partis (https://github.com/psathyrella/partis/), is built on a new general-purpose HMM compiler that can perform efficient inference given a simple text description of an HMM.

  12. SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

    PubMed Central

    Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke

    2008-01-01

    Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of

  13. Comparative performance of the BGISEQ-500 vs Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing

    PubMed Central

    Mak, Sarah Siu Tze; Gopalakrishnan, Shyam; Carøe, Christian; Geng, Chunyu; Liu, Shanlin; Sinding, Mikkel-Holger S; Kuderna, Lukas F K; Zhang, Wenwei; Fu, Shujin; Vieira, Filipe G; Germonpré, Mietje; Bocherens, Hervé; Fedorov, Sergey; Petersen, Bent; Sicheritz-Pontén, Thomas; Marques-Bonet, Tomas; Zhang, Guojie; Jiang, Hui; Gilbert, M Thomas P

    2017-01-01

    Abstract Ancient DNA research has been revolutionized following development of next-generation sequencing platforms. Although a number of such platforms have been applied to ancient DNA samples, the Illumina series are the dominant choice today, mainly because of high production capacities and short read production. Recently a potentially attractive alternative platform for palaeogenomic data generation has been developed, the BGISEQ-500, whose sequence output is comparable with the Illumina series. In this study, we modified the standard BGISEQ-500 library preparation specifically for use on degraded DNA, then directly compared the sequencing performance and data quality of the BGISEQ-500 to the Illumina HiSeq2500 platform on DNA extracted from 8 historic and ancient dog and wolf samples. The data generated were largely comparable between sequencing platforms, with no statistically significant difference observed for parameters including level (P = 0.371) and average sequence length (P = 0.718) of endogenous nuclear DNA, sequence GC content (P = 0.311), double-stranded DNA damage rate (ν, P = 0.309), and sequence clonality (P = 0.093). Small significant differences were found in single-strand DNA damage rate (δS; slightly lower for the BGISEQ-500, P = 0.011) and the background rate of difference from the reference genome (θ; slightly higher for BGISEQ-500, P = 0.012). This may result from the differences in amplification cycles used to polymerase chain reaction–amplify the libraries. A significant difference was also observed in the mitochondrial DNA percentages recovered (P = 0.018), although we believe this is likely a stochastic effect relating to the extremely low levels of mitochondria that were sequenced from 3 of the samples with overall very low levels of endogenous DNA. Although we acknowledge that our analyses were limited to animal material, our observations suggest that the BGISEQ-500 holds the potential to represent a valid and potentially valuable

  14. Development of Scoring Functions for Antibody Sequence Assessment and Optimization

    PubMed Central

    Seeliger, Daniel

    2013-01-01

    Antibody development is still associated with substantial risks and difficulties as single mutations can radically change molecule properties like thermodynamic stability, solubility or viscosity. Since antibody generation methodologies cannot select and optimize for molecule properties which are important for biotechnological applications, careful sequence analysis and optimization is necessary to develop antibodies that fulfil the ambitious requirements of future drugs. While efforts to grasp the physical principles of undesired molecule properties from the very bottom are becoming increasingly powerful, the wealth of publicly available antibody sequences provides an alternative way to develop early assessment strategies for antibodies using a statistical approach, which is the objective of this paper. Here, publicly available sequences were used to develop heuristic potentials for the framework regions of heavy and light chains of antibodies of human and murine origin. The potentials take into account position dependent probabilities of individual amino acids but also conditional probabilities which are inevitable for sequence assessment and optimization. It is shown that the potentials derived from human sequences clearly distinguish between human sequences and sequences from mice and, hence, can be used as a measure of humanness which compares a given sequence with the phenotypic pool of human sequences instead of comparing sequence identities to germline genes. Following this line, it is demonstrated that, using the developed potentials, humanization of an antibody can be described as a simple mathematical optimization problem and that the in-silico generated framework variants closely resemble native sequences in terms of predicted immunogenicity. PMID:24204701
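
    A position-dependent statistical score of the kind described can be sketched as a log-probability sum over an alignment-derived profile. This minimal example is not the paper's potential: it covers only the independent per-position term and omits the conditional probabilities. It assumes ungapped, equal-length aligned sequences, and the pseudocount and function names are illustrative choices.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Estimate per-position amino acid probabilities from aligned sequences."""
    length = len(aligned_seqs[0])
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs)
        # The pseudocount keeps unseen residues from getting zero probability.
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile


def score(seq, profile):
    """Sum of log-probabilities; higher means closer to the training pool."""
    return sum(math.log(profile[i][aa]) for i, aa in enumerate(seq))
```

    With a profile built from a pool of reference sequences, a candidate that matches the common residue at each position scores higher than one carrying rare residues, which is the basic idea behind using such a score as a similarity measure against a sequence pool.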

  15. Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants.

    PubMed

    Taheri, Sima; Lee Abdullah, Thohirah; Yusop, Mohd Rafii; Hanafi, Mohamed Musa; Sahebi, Mahbod; Azizi, Parisa; Shamshiri, Redmond Ramin

    2018-02-13

    Microsatellites, or simple sequence repeats (SSRs), are one of the most informative and multi-purpose genetic markers exploited in plant functional genomics. However, the discovery of SSRs and development using traditional methods are laborious, time-consuming, and costly. Recently, the availability of high-throughput sequencing technologies has enabled researchers to identify a substantial number of microsatellites at less cost and effort than traditional approaches. Illumina is a noteworthy transcriptome sequencing technology that is currently used in SSR marker development. Although 454 pyrosequencing datasets can be used for SSR development, this type of sequencing is no longer supported. This review aims to present an overview of the next generation sequencing, with a focus on the efficient use of de novo transcriptome sequencing (RNA-Seq) and related tools for mining and development of microsatellites in plants.

  16. A Simple Exact Error Rate Analysis for DS-CDMA with Arbitrary Pulse Shape in Flat Nakagami Fading

    NASA Astrophysics Data System (ADS)

    Rahman, Mohammad Azizur; Sasaki, Shigenobu; Kikuchi, Hisakazu; Harada, Hiroshi; Kato, Shuzo

    A simple exact error rate analysis is presented for random binary direct sequence code division multiple access (DS-CDMA) considering a general pulse shape and flat Nakagami fading channel. First of all, a simple model is developed for the multiple access interference (MAI). Based on this, a simple exact expression of the characteristic function (CF) of MAI is developed in a straightforward manner. Finally, an exact expression of error rate is obtained following the CF method of error rate analysis. The exact error rate so obtained can be evaluated much more easily than the only reliable approximate error rate expression currently available, which is based on the Improved Gaussian Approximation (IGA).

  17. Sub-band/transform compression of video sequences

    NASA Technical Reports Server (NTRS)

    Sauer, Ken; Bauer, Peter

    1992-01-01

    The progress on compression of video sequences is discussed. The overall goal of the research was the development of data compression algorithms for high-definition television (HDTV) sequences, but most of our research is general enough to be applicable to much more general problems. We have concentrated on coding algorithms based on both sub-band and transform approaches. Two very fundamental issues arise in designing a sub-band coder. First, the form of the signal decomposition must be chosen to yield band-pass images with characteristics favorable to efficient coding. A second basic consideration, whether coding is to be done in two or three dimensions, is the form of the coders to be applied to each sub-band. Computational simplicity is of essence. We review the first portion of the year, during which we improved and extended some of the previous grant period's results. The pyramid nonrectangular sub-band coder limited to intra-frame application is discussed. Perhaps the most critical component of the sub-band structure is the design of bandsplitting filters. We apply very simple recursive filters, which operate at alternating levels on rectangularly sampled, and quincunx sampled images. We will also cover the techniques we have studied for the coding of the resulting bandpass signals. We discuss adaptive three-dimensional coding which takes advantage of the detection algorithm developed last year. To this point, all the work on this project has been done without the benefit of motion compensation (MC). Motion compensation is included in many proposed codecs, but adds significant computational burden and hardware expense. We have sought to find a lower-cost alternative featuring a simple adaptation to motion in the form of the codec. In sequences of high spatial detail and zooming or panning, it appears that MC will likely be necessary for the proposed quality and bit rates.

  18. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    PubMed Central

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  19. An Activation-Based Model of Routine Sequence Errors

    DTIC Science & Technology

    2015-04-01

    part of the ACT-R framework (e.g., Anderson, 1983), we adopt a newer, richer notion of priming as part of our approach (Harrison & Trafton, 2010...2014). Other models of routine sequence errors, such as the interactive activation network (IAN) model (Cooper & Shallice, 2006) and the simple...error patterns that results from an interface layout shift. The ideas behind our expanded priming approach, however, could apply to IAN, which uses

  20. Modeling of prepregs during automated draping sequences

    NASA Astrophysics Data System (ADS)

    Krogh, Christian; Glud, Jens A.; Jakobsen, Johnny

    2017-10-01

    The behavior of woven prepreg fabric during automated draping sequences is investigated. A drape tool under development with an arrangement of grippers facilitates the placement of a woven prepreg fabric in a mold. It is essential that the draped configuration is free from wrinkles and other defects. The present study aims at setting up a virtual draping framework capable of modeling the draping process from the initial flat fabric to the final double curved shape and aims at assisting the development of an automated drape tool. The virtual draping framework consists of a kinematic mapping algorithm used to generate target points on the mold which are used as input to a draping sequence planner. The draping sequence planner prescribes the displacement history for each gripper in the drape tool and these displacements are then applied to each gripper in a transient model of the draping sequence. The model is based on a transient finite element analysis with the material's constitutive behavior currently being approximated as linear elastic orthotropic. In-plane tensile and bias-extension tests as well as bending tests are conducted and used as input for the model. The virtual draping framework shows good potential for obtaining a better understanding of the drape process and for guiding the development of the drape tool. However, results obtained from using the framework on a simple test case indicate that the generation of draping sequences is non-trivial.

  1. In silico segmentations of lentivirus envelope sequences

    PubMed Central

    Boissin-Quillon, Aurélia; Piau, Didier; Leroux, Caroline

    2007-01-01

    Background The gene encoding the envelope of lentiviruses exhibits a considerable plasticity, particularly the region which encodes the surface (SU) glycoprotein. Interestingly, mutations do not appear uniformly along the sequence of SU, but they are clustered in restricted areas, called variable (V) regions, which are interspersed with relatively more stable regions, called constant (C) regions. We look for specific signatures of C/V regions, using hidden Markov models constructed with SU sequences of the equine, human, small ruminant and simian lentiviruses. Results Our models yield clear and accurate delimitations of the C/V regions, when the test set and the training set were made up of sequences of the same lentivirus, but also when they were made up of sequences of different lentiviruses. Interestingly, the models predicted the different regions of lentiviruses such as the bovine and feline lentiviruses, not used in the training set. Models based on composite training sets produce accurate segmentations of sequences of all these lentiviruses. Conclusion Our results suggest that each C/V region has a specific statistical oligonucleotide composition, and that the C (respectively V) regions of one of these lentiviruses are statistically more similar to the C (respectively V) regions of the other lentiviruses, than to the V (respectively C) regions of the same lentivirus. PMID:17376229

  2. The sequence of sequencers: The history of sequencing DNA

    PubMed Central

    Heather, James M.; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. PMID:26554401

  3. Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

    PubMed

    Mackey, Aaron J; Pearson, William R

    2004-10-01

    Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
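
The idea of carving a sequence library subset out of a relational database can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration only: the `protein` table, its columns, and the three toy records are hypothetical and are not the schema of seqdb_demo, which the abstract does not describe.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical protein table."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE protein (acc TEXT PRIMARY KEY, taxon TEXT, seq TEXT)"
    )
    # Toy records, invented for illustration.
    con.executemany(
        "INSERT INTO protein VALUES (?, ?, ?)",
        [
            ("P1", "E. coli", "MKT"),
            ("P2", "H. sapiens", "MAAG"),
            ("P3", "E. coli", "MSL"),
        ],
    )
    return con

def subset_fasta(con, taxon):
    """Return a FASTA-formatted library containing only one taxon."""
    rows = con.execute(
        "SELECT acc, seq FROM protein WHERE taxon = ? ORDER BY acc", (taxon,)
    )
    return "".join(f">{acc}\n{seq}\n" for acc, seq in rows)

con = build_demo_db()
print(subset_fasta(con, "E. coli"))
```

Searching against such a subset rather than the full library reduces the number of comparisons, which is how a smaller, better-targeted library can improve the statistical significance of a hit.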

  4. Exome Sequencing in Suspected Monogenic Dyslipidemias

    PubMed Central

    Stitziel, Nathan O.; Peloso, Gina M.; Abifadel, Marianne; Cefalu, Angelo B.; Fouchier, Sigrid; Motazacker, M. Mahdi; Tada, Hayato; Larach, Daniel B.; Awan, Zuhier; Haller, Jorge F.; Pullinger, Clive R.; Varret, Mathilde; Rabès, Jean-Pierre; Noto, Davide; Tarugi, Patrizia; Kawashiri, Masa-aki; Nohara, Atsushi; Yamagishi, Masakazu; Risman, Marjorie; Deo, Rahul; Ruel, Isabelle; Shendure, Jay; Nickerson, Deborah A.; Wilson, James G.; Rich, Stephen S.; Gupta, Namrata; Farlow, Deborah N.; Neale, Benjamin M.; Daly, Mark J.; Kane, John P.; Freeman, Mason W.; Genest, Jacques; Rader, Daniel J.; Mabuchi, Hiroshi; Kastelein, John J.P.; Hovingh, G. Kees; Averna, Maurizio R.; Gabriel, Stacey; Boileau, Catherine; Kathiresan, Sekar

    2015-01-01

    Background Exome sequencing is a promising tool for gene mapping in Mendelian disorders. We utilized this technique in an attempt to identify novel genes underlying monogenic dyslipidemias. Methods and Results We performed exome sequencing on 213 selected family members from 41 kindreds with suspected Mendelian inheritance of extreme levels of low-density lipoprotein (LDL) cholesterol (after candidate gene sequencing excluded known genetic causes for high LDL cholesterol families) or high-density lipoprotein (HDL) cholesterol. We used standard analytic approaches to identify candidate variants and also assigned a polygenic score to each individual in order to account for their burden of common genetic variants known to influence lipid levels. In nine families, we identified likely pathogenic variants in known lipid genes (ABCA1, APOB, APOE, LDLR, LIPA, and PCSK9); however, we were unable to identify obvious genetic etiologies in the remaining 32 families despite follow-up analyses. We identified three factors that limited novel gene discovery: (1) imperfect sequencing coverage across the exome hid potentially causal variants; (2) large numbers of shared rare alleles within families obfuscated causal variant identification; and (3) individuals from 15% of families carried a significant burden of common lipid-related alleles, suggesting complex inheritance can masquerade as monogenic disease. Conclusions We identified the genetic basis of disease in nine of 41 families; however, none of these represented novel gene discoveries. Our results highlight the promise and limitations of exome sequencing as a discovery technique in suspected monogenic dyslipidemias. Considering the confounders identified may inform the design of future exome sequencing studies. PMID:25632026

  5. High-Throughput Analysis of T-DNA Location and Structure Using Sequence Capture.

    PubMed

    Inagaki, Soichi; Henry, Isabelle M; Lieberman, Meric C; Comai, Luca

    2015-01-01

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious, and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA-genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the need for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. Our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  6. Simple Kidney Cysts

    MedlinePlus

    ... Solitary Kidney Your Kidneys & How They Work Simple Kidney Cysts What are simple kidney cysts? Simple kidney cysts are abnormal, fluid-filled ... that form in the kidneys. What are the kidneys and what do they do? The kidneys are ...

  7. in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Xiaofan; Peris, David; Kominek, Jacek

    The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.

  8. in silico Whole Genome Sequencer & Analyzer (iWGS): A Computational Pipeline to Guide the Design and Analysis of de novo Genome Sequencing Studies

    DOE PAGES

    Zhou, Xiaofan; Peris, David; Kominek, Jacek; ...

    2016-09-16

    The availability of genomes across the tree of life is highly biased toward vertebrates, pathogens, human disease models, and organisms with relatively small and simple genomes. Recent progress in genomics has enabled the de novo decoding of the genome of virtually any organism, greatly expanding its potential for understanding the biology and evolution of the full spectrum of biodiversity. The increasing diversity of sequencing technologies, assays, and de novo assembly algorithms has augmented the complexity of de novo genome sequencing projects in nonmodel organisms. To reduce the costs and challenges in de novo genome sequencing projects and streamline their experimental design and analysis, we developed iWGS (in silico Whole Genome Sequencer and Analyzer), an automated pipeline for guiding the choice of appropriate sequencing strategy and assembly protocols. iWGS seamlessly integrates the four key steps of a de novo genome sequencing project: data generation (through simulation), data quality control, de novo assembly, and assembly evaluation and validation. The last three steps can also be applied to the analysis of real data. iWGS is designed to enable the user to have great flexibility in testing the range of experimental designs available for genome sequencing projects, and supports all major sequencing technologies and popular assembly tools. Three case studies illustrate how iWGS can guide the design of de novo genome sequencing projects, and evaluate the performance of a wide variety of user-specified sequencing strategies and assembly protocols on genomes of differing architectures. iWGS, along with detailed documentation, is freely available at https://github.com/zhouxiaofan1983/iWGS.

  9. Socio-Economic Background and Access to Internet as Correlates of Students' Achievement in Agricultural Science

    ERIC Educational Resources Information Center

    Adegoke, Sunday Paul; Osokoya, Modupe M.

    2015-01-01

    This study investigated access to internet and socio-economic background as correlates of students' achievement in Agricultural Science among selected Senior Secondary Schools Two Students in Ogbomoso South and North Local Government Areas. The study adopted multi-stage sampling technique. Simple random sampling was used to select 30 students from…

  10. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

    PubMed Central

    2011-01-01

    Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. Conclusion We developed 550 validated genic

  11. Impact of the HIV-1 genetic background and HIV-1 population size on the evolution of raltegravir resistance.

    PubMed

    Fun, Axel; Leitner, Thomas; Vandekerckhove, Linos; Däumer, Martin; Thielen, Alexander; Buchholz, Bernd; Hoepelman, Andy I M; Gisolf, Elizabeth H; Schipper, Pauline J; Wensing, Annemarie M J; Nijhuis, Monique

    2018-01-05

    Emergence of resistance against integrase inhibitor raltegravir in human immunodeficiency virus type 1 (HIV-1) patients is generally associated with selection of one of three signature mutations: Y143C/R, Q148K/H/R or N155H, representing three distinct resistance pathways. The mechanisms that drive selection of a specific pathway are still poorly understood. We investigated the impact of the HIV-1 genetic background and population dynamics on the emergence of raltegravir resistance. Using deep sequencing we analyzed the integrase coding sequence (CDS) in longitudinal samples from five patients who initiated raltegravir plus optimized background therapy at viral loads > 5000 copies/ml. To investigate the role of the HIV-1 genetic background we created recombinant viruses containing the viral integrase coding region from pre-raltegravir samples from two patients in whom raltegravir resistance developed through different pathways. The in vitro selections performed with these recombinant viruses were designed to mimic natural population bottlenecks. Deep sequencing analysis of the viral integrase CDS revealed that the virological response to raltegravir containing therapy inversely correlated with the relative amount of unique sequence variants that emerged suggesting diversifying selection during drug pressure. In 4/5 patients multiple signature mutations representing different resistance pathways were observed. Interestingly, the resistant population can consist of a single resistant variant that completely dominates the population but also of multiple variants from different resistance pathways that coexist in the viral population. We also found evidence for increased diversification after stronger bottlenecks. In vitro selections with low viral titers, mimicking population bottlenecks, revealed that both recombinant viruses and HXB2 reference virus were able to select mutations from different resistance pathways, although typically only one resistance pathway

  12. Terrestrial Background Reduction in RPM Systems by Direct Internal Shielding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robinson, Sean M.; Ashbaker, Eric D.; Schweppe, John E.

    2008-11-19

    Gamma-ray detection systems that are close to the earth or other sources of background radiation often require shielding, especially when trying to detect a relatively weak source. One particular case of interest that we address in this paper is that encountered by the Radiation Portal Monitor (RPM) systems placed at border-crossing Ports of Entry (POE). These RPM systems are used to screen for illicit radiological materials, and they are often placed in situations where terrestrial background is large. In such environments, it is desirable to consider simple physical modifications that could be implemented to reduce the effects from background radiation without affecting the flow of traffic and the normal operation of the portal. Simple modifications include adding additional shielding to the environment, either inside or outside the apparatus. Previous work [2] has shown the utility of some of these shielding configurations for increasing the Signal to Noise Ratio (SNR) of gross-counting RPMs. Because the total cost for purchasing and installing RPM systems can be quite high, in the range of hundreds of thousands of dollars for each cargo-screening installation, these shielding variations may offer increases in detection capability for relatively small cost. Several modifications are considered here in regard to their real-world applicability, and are meant to give a general idea of the effectiveness of the schemes used to reduce background for both gross-counting and spectroscopic detectors. These scenarios are modeled via the Monte-Carlo N-Particle (MCNP) code package [1] for ease of altering shielding configurations, as well as enacting unusual scenarios prior to prototyping in the field. The objective of this paper is to provide results representative of real modifications that could enhance the sensitivity of this, as well as the next generation of radiation detectors. The models used in this work were designed to provide the most general

  13. Spatiotemporal models for the simulation of infrared backgrounds

    NASA Astrophysics Data System (ADS)

    Wilkes, Don M.; Cadzow, James A.; Peters, R. Alan, II; Li, Xingkang

    1992-09-01

    It is highly desirable for designers of automatic target recognizers (ATRs) to be able to test their algorithms on targets superimposed on a wide variety of background imagery. Background imagery in the infrared spectrum is expensive to gather from real sources, consequently, there is a need for accurate models for producing synthetic IR background imagery. We have developed a model for such imagery that will do the following: Given a real, infrared background image, generate another image, distinctly different from the one given, that has the same general visual characteristics as well as the first and second-order statistics of the original image. The proposed model consists of a finite impulse response (FIR) kernel convolved with an excitation function, and histogram modification applied to the final solution. A procedure for deriving the FIR kernel using a signal enhancement algorithm has been developed, and the histogram modification step is a simple memoryless nonlinear mapping that imposes the first order statistics of the original image onto the synthetic one, thus the overall model is a linear system cascaded with a memoryless nonlinearity. It has been found that the excitation function relates to the placement of features in the image, the FIR kernel controls the sharpness of the edges and the global spectrum of the image, and the histogram controls the basic coloration of the image. A drawback to this method of simulating IR backgrounds is that a database of actual background images must be collected in order to produce accurate FIR and histogram models. If this database must include images of all types of backgrounds obtained at all times of the day and all times of the year, the size of the database would be prohibitive. In this paper we propose improvements to the model described above that enable time-dependent modeling of the IR background. This approach can greatly reduce the number of actual IR backgrounds that are required to produce a
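
The two-stage structure described in this abstract (a linear FIR filter followed by a memoryless histogram mapping) can be sketched in plain Python. This is only an illustration under stated assumptions: the white Gaussian excitation, the 3x3 smoothing kernel, and the 4x4 reference image are placeholders chosen here, and the abstract's procedure for deriving the kernel from a real image is not reproduced.

```python
import random

def convolve2d(image, kernel):
    """Same-size 2-D convolution with zero padding (the FIR filtering stage)."""
    rows, cols = len(image), len(image[0])
    krows, kcols = len(kernel), len(kernel[0])
    cr, cc = krows // 2, kcols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(krows):
                for v in range(kcols):
                    y, x = i - (u - cr), j - (v - cc)
                    if 0 <= y < rows and 0 <= x < cols:
                        acc += kernel[u][v] * image[y][x]
            out[i][j] = acc
    return out

def match_histogram(image, reference):
    """Rank-based monotone remapping that gives `image` the histogram of `reference`."""
    flat = [v for row in image for v in row]
    ref_sorted = sorted(v for row in reference for v in row)
    order = sorted(range(len(flat)), key=flat.__getitem__)
    out = [0.0] * len(flat)
    for rank, idx in enumerate(order):
        out[idx] = ref_sorted[rank * len(ref_sorted) // len(flat)]
    width = len(image[0])
    return [out[r * width:(r + 1) * width] for r in range(len(image))]

def synthesize(reference, kernel, seed=0):
    """Filter a random excitation, then impose the reference histogram."""
    rng = random.Random(seed)
    rows, cols = len(reference), len(reference[0])
    # Assumption: white Gaussian noise stands in for the excitation function.
    excitation = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return match_histogram(convolve2d(excitation, kernel), reference)

# Hypothetical 4x4 "background" and a simple smoothing kernel.
reference = [[10, 12, 40, 42], [11, 13, 41, 43], [10, 14, 39, 44], [12, 15, 38, 45]]
kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
synthetic = synthesize(reference, kernel)
```

In this sketch the synthetic image has exactly the same grey-level histogram as the reference while the pixel arrangement differs. Matching the second-order statistics would additionally require a kernel fitted to the reference image, which is outside the scope of this example.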

  14. The sequence of sequencers: The history of sequencing DNA.

    PubMed

    Heather, James M; Chain, Benjamin

    2016-01-01

    Determining the order of nucleic acid residues in biological samples is an integral component of a wide variety of research applications. Over the last fifty years large numbers of researchers have applied themselves to the production of techniques and technologies to facilitate this feat, sequencing DNA and RNA molecules. This time-scale has witnessed tremendous changes, moving from sequencing short oligonucleotides to millions of bases, from struggling towards the deduction of the coding sequence of a single gene to rapid and widely available whole genome sequencing. This article traverses those years, iterating through the different generations of sequencing technology, highlighting some of the key discoveries, researchers, and sequences along the way. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  15. On the Normalization of the Minimum Free Energy of RNAs by Sequence Length

    PubMed Central

    Trotta, Edoardo

    2014-01-01

    The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied for the comparison of sequences with different sizes. We also propose a simple modification of the normalization formula that corrects the bias enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size. PMID:25405875
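
A toy calculation shows how dividing by length can introduce a hyperbolic bias. Suppose, purely as an illustrative assumption not taken from the abstract, that MFE is roughly zero up to some short length $L_0$ and then grows with slope $a < 0$; the symbols $a$ and $L_0$ are hypothetical.

```latex
\[
\mathrm{MFE}(L) \approx a\,(L - L_0)
\quad\Longrightarrow\quad
\frac{\mathrm{MFE}(L)}{L} \approx a - \frac{a\,L_0}{L},
\qquad a < 0,\; L_0 > 0 .
\]
```

Under this assumption the simple index $\mathrm{MFE}/L$ carries a term proportional to $1/L$ and becomes steadily more negative as $L$ grows, even though the per-nucleotide slope $a$ is constant. In the same toy model, dividing by $L - L_0$ instead of $L$ would remove the length dependence. This indicates the general kind of correction the abstract alludes to, not the formula the author actually proposes, which is not given here.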

  16. Validation of the Simple Shoulder Test in a Portuguese-Brazilian Population. Is the Latent Variable Structure and Validation of the Simple Shoulder Test Stable across Cultures?

    PubMed Central

    Neto, Jose Osni Bruggemann; Gesser, Rafael Lehmkuhl; Steglich, Valdir; Bonilauri Ferreira, Ana Paula; Gandhi, Mihir; Vissoci, João Ricardo Nickenig; Pietrobon, Ricardo

    2013-01-01

    Background The validation of widely used scales facilitates the comparison across international patient samples. The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Objective The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Methods The Simple Shoulder Test was translated from English into Brazilian Portuguese, translated back into English, and evaluated for accuracy by an expert committee. It was then administered to 100 patients with shoulder conditions. Psychometric properties were analyzed including factor analysis, internal reliability, test-retest reliability at seven days, and construct validity in relation to the Short Form 36 health survey (SF-36). Results Factor analysis demonstrated a three factor solution. Cronbach’s alpha was 0.82. Test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.84. Associations were observed in the hypothesized direction with all subscales of SF-36 questionnaire. Conclusion The Simple Shoulder Test translation and cultural adaptation to Brazilian-Portuguese demonstrated adequate factor structure, internal reliability, and validity, ultimately allowing for its use in the comparison with international patient samples. PMID:23675436
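
The internal-reliability statistic reported here, Cronbach's alpha, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) for k items. The sketch below applies it to invented yes/no answers; the data are hypothetical and are not from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents x k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical answers (1 = yes, 0 = no) from five respondents to four items.
answers = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
print(round(cronbach_alpha(answers), 3))
```

Alpha approaches 1 when the items rise and fall together across respondents, and falls toward 0 when they are unrelated.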

  17. Complete genome sequence of Pelosinus sp. strain UFO1 assembled using single-molecule real-time DNA sequencing technology

    DOE PAGES

    Brown, Steven D.; Utturkar, Sagar M.; Magnuson, Timothy S.; ...

    2014-09-04

    Pelosinus fermentans strain R7 was isolated from Russian kaolin clays as the type strain and it can reduce Fe(III) during fermentative growth (1). Draft genome sequences for P. fermentans R7 and four strains from Hanford, Washington, USA, have been published (2–4). The P. fermentans 16S rRNA sequence dominated the lactate-based enrichment cultures from three geochemically contrasting soils from the Melton Branch Watershed, Oak Ridge, Tennessee, USA (5) and also at another stimulated, uranium-contaminated field site near Oak Ridge (6). For the current work, strain UFO1 was isolated from pristine sediments at a background field site in Oak Ridge and characterized as facilitating U(VI) reduction and precipitation with phosphate (7).

  18. TOPPE: A framework for rapid prototyping of MR pulse sequences.

    PubMed

    Nielsen, Jon-Fredrik; Noll, Douglas C

    2018-06-01

    To introduce a framework for rapid prototyping of MR pulse sequences. We propose a simple file format, called "TOPPE", for specifying all details of an MR imaging experiment, such as gradient and radiofrequency waveforms and the complete scan loop. In addition, we provide a TOPPE file "interpreter" for GE scanners, which is a binary executable that loads TOPPE files and executes the sequence on the scanner. We also provide MATLAB scripts for reading and writing TOPPE files and previewing the sequence prior to hardware execution. With this setup, the task of the pulse sequence programmer is reduced to creating TOPPE files, eliminating the need for hardware-specific programming. No sequence-specific compilation is necessary; the interpreter only needs to be compiled once (for every scanner software upgrade). We demonstrate TOPPE in three different applications: k-space mapping, non-Cartesian PRESTO whole-brain dynamic imaging, and myelin mapping in the brain using inhomogeneous magnetization transfer. We successfully implemented and executed the three example sequences. By simply changing the various TOPPE sequence files, a single binary executable (interpreter) was used to execute several different sequences. The TOPPE file format is a complete specification of an MR imaging experiment, based on arbitrary sequences of a (typically small) number of unique modules. Along with the GE interpreter, TOPPE comprises a modular and flexible platform for rapid prototyping of new pulse sequences. Magn Reson Med 79:3128-3134, 2018. © 2017 International Society for Magnetic Resonance in Medicine.

  19. Comparative analyses of simple sequence repeats (SSRs) in 23 mosquito species genomes: Identification, characterization and distribution (Diptera: Culicidae).

    PubMed

    Wang, Xiao-Ting; Zhang, Yu-Juan; Qiao, Liang; Chen, Bin

    2018-02-27

    Simple sequence repeats (SSRs) exist in both eukaryotic and prokaryotic genomes and are the most popular genetic markers, but the SSRs of mosquito genomes are still not well understood. In this study, we identified and analyzed the SSRs in 23 mosquito species using Drosophila melanogaster as reference at the whole-genome level. The results show that SSR numbers (33 076-560 175/genome) and genome sizes (574.57-1342.21 Mb) are significantly positively correlated (R² = 0.8992, P < 0.01), but the correlation in individual species varies in these mosquito species. In six types of SSR, mono- to trinucleotide SSRs are dominant with cumulative percentages of 95.14%-99.00% and densities of 195.65/Mb-787.51/Mb, whereas tetra- to hexanucleotide SSRs are rare with 1.12%-4.22% and 3.76/Mb-40.23/Mb. The (A/T)n, (AC/GT)n and (AGC/GCT)n are the most frequent motifs in mononucleotide, dinucleotide and trinucleotide SSRs, respectively, and the motif frequencies of tetra- to hexanucleotide SSRs appear to be species-specific. SSRs of 10-20 bp in length are dominant, with a number of 110 561 ± 93 482 and a frequency of 87.25% ± 5.73% on average, and both the number and the frequency decline as length increases. Most SSRs (83.34% ± 7.72%) are located in intergenic regions, followed by intron regions (11.59% ± 5.59%), exon regions (3.74% ± 1.95%), and untranslated regions (1.32% ± 1.39%). The mono-, di- and trinucleotide SSRs are the main SSRs in both gene regions (98.55% ± 0.85%) and exon regions (99.27% ± 0.52%). An average of 42.52% of total genes contain SSRs, and the preference for SSR occurrence in different gene subcategories is species-specific. The study provides useful insights into the SSR diversity, characteristics and distribution in the genomes of 23 mosquito species. © 2018 Institute of Zoology, Chinese Academy of Sciences.
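
    As an illustration of what whole-genome SSR identification involves, the following is a minimal sketch of a perfect-repeat scan for motifs of 1 to 6 bp. It is not the tool used in the study above: the function name `find_ssrs`, the 10 bp minimum length, and the regular-expression approach are all assumptions made for this example.

```python
import re

def find_ssrs(seq, min_len=10, max_motif=6):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        # A k-mer followed by one or more exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % k)
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ACAC" is reported as "AC" when k == 2).
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            if len(m.group(0)) >= min_len:
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

print(find_ssrs("GG" + "AC" * 6 + "TT"))
```

    Dividing the number of hits by the sequence length in Mb would give a density comparable in spirit to the per-Mb figures quoted above, although real SSR tools apply motif-specific minimum repeat counts and handle imperfect repeats.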

  20. Effects of Background Music on Objective and Subjective Performance Measures in an Auditory BCI.

    PubMed

    Zhou, Sijie; Allison, Brendan Z; Kübler, Andrea; Cichocki, Andrzej; Wang, Xingyu; Jin, Jing

    2016-01-01

    Several studies have explored brain computer interface (BCI) systems based on auditory stimuli, which could help patients with visual impairments. Usability and user satisfaction are important considerations in any BCI. Although background music can influence emotion and performance in other task environments, and many users may wish to listen to music while using a BCI, auditory and other BCIs are typically studied without background music. Some work has explored the possibility of using polyphonic music in auditory BCI systems. However, this approach requires users with good musical skills, and has not been explored in online experiments. Our hypothesis was that an auditory BCI with background music would be preferred by subjects over a similar BCI without background music, without any difference in BCI performance. We introduce a simple paradigm (which does not require musical skill) using percussion instrument sound stimuli and background music, and evaluated it in both offline and online experiments. The results showed that subjects preferred the auditory BCI with background music. Different performance measures did not reveal any significant performance effect when comparing background music vs. no background. Since the addition of background music does not impair BCI performance but is preferred by users, auditory (and perhaps other) BCIs should consider including it. Our study also indicates that auditory BCIs can be effective even if the auditory channel is simultaneously otherwise engaged.

  1. Linear model for fast background subtraction in oligonucleotide microarrays.

    PubMed

    Kroll, K Myriam; Barkema, Gerard T; Carlon, Enrico

    2009-11-16

    One important preprocessing step in the analysis of microarray data is background subtraction. In high-density oligonucleotide arrays this is recognized as a crucial step for the global performance of the data analysis from raw intensities to expression values. We propose here an algorithm for background estimation based on a model in which the cost function is quadratic in a set of fitting parameters such that minimization can be performed through linear algebra. The model incorporates two effects: 1) Correlated intensities between neighboring features in the chip and 2) sequence-dependent affinities for non-specific hybridization fitted by an extended nearest-neighbor model. The algorithm has been tested on 360 GeneChips from publicly available data of recent expression experiments. The algorithm is fast and accurate. Strong correlations between the fitted values for different experiments as well as between the free-energy parameters and their counterparts in aqueous solution indicate that the model captures a significant part of the underlying physical chemistry.

  2. BIPAD: A web server for modeling bipartite sequence elements

    PubMed Central

    Bi, Chengpeng; Rogan, Peter K

    2006-01-01

    Background: Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. Results: We introduce the Bipad Server [1], a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWMs) with a gap distribution, or a single PWM for contiguous single block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half site and gap lengths. Conclusion: The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins. PMID:16503993

  3. Billions of basepairs of recently expanded, repetitive sequences are eliminated from the somatic genome during copepod development

    PubMed Central

    2014-01-01

    Background Chromatin diminution is the programmed deletion of DNA from presomatic cell or nuclear lineages during development, producing single organisms that contain two different nuclear genomes. Phylogenetically diverse taxa undergo chromatin diminution — some ciliates, nematodes, copepods, and vertebrates. In cyclopoid copepods, chromatin diminution occurs in taxa with massively expanded germline genomes; depending on species, germline genome sizes range from 15 – 75 Gb, 12–74 Gb of which are lost from pre-somatic cell lineages at germline – soma differentiation. This is more than an order of magnitude more sequence than is lost from other taxa. To date, the sequences excised from copepods have not been analyzed using large-scale genomic datasets, and the processes underlying germline genomic gigantism in this clade, as well as the functional significance of chromatin diminution, have remained unknown. Results Here, we used high-throughput genomic sequencing and qPCR to characterize the germline and somatic genomes of Mesocyclops edax, a freshwater cyclopoid copepod with a germline genome of ~15 Gb and a somatic genome of ~3 Gb. We show that most of the excised DNA consists of repetitive sequences that are either 1) verifiable transposable elements (TEs), or 2) non-simple repeats of likely TE origin. Repeat elements in both genomes are skewed towards younger (i.e. less divergent) elements. Excised DNA is a non-random sample of the germline repeat element landscape; younger elements, and high frequency DNA transposons and LINEs, are disproportionately eliminated from the somatic genome. Conclusions Our results suggest that germline genome expansion in M. edax reflects explosive repeat element proliferation, and that billions of base pairs of such repeats are deleted from the somatic genome every generation. Thus, we hypothesize that chromatin diminution is a mechanism that controls repeat element load, and that this load can evolve to be divergent

  4. Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

    PubMed Central

    Vernick, Kenneth D.

    2017-01-01

    Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932

  5. Chromosome arm-specific BAC end sequences permit comparative analysis of homoeologous chromosomes and genomes of polyploid wheat

    PubMed Central

    2012-01-01

    Background Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat. Results The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695

  6. Platinum(II)-Oligonucleotide Coordination Based Aptasensor for Simple and Selective Detection of Platinum Compounds.

    PubMed

    Cai, Sheng; Tian, Xueke; Sun, Lianli; Hu, Haihong; Zheng, Shirui; Jiang, Huidi; Yu, Lushan; Zeng, Su

    2015-10-20

    Wide use of platinum-based chemotherapeutic regimens for the treatment for carcinoma calls for a simple and selective detection of platinum compound in biological samples. On the basis of the platinum(II)-base pair coordination, a novel type of aptameric platform for platinum detection has been introduced. This chemiluminescence (CL) aptasensor consists of a designed streptavidin (SA) aptamer sequence in which several base pairs were replaced by G-G mismatches. Only in the presence of platinum, coordination occurs between the platinum and G-G base pairs as opposed to the hydrogen-bonded G-C base pairs, which leads to SA aptamer sequence activation, resulting in their binding to SA coated magnetic beads. These Pt-DNA coordination events were monitored by a simple and direct luminol-peroxide CL reaction through horseradish peroxidase (HRP) catalysis with a strong chemiluminescence emission. The validated ranges of quantification were 0.12-240 μM with a limit of detection of 60 nM and selectivity over other metal ions. This assay was also successfully used in urine sample determination. It will be a promising candidate for the detection of platinum in biomedical and environmental samples.

  7. Prof. Hayashi's work on the pre-main sequence evolution and brown dwarfs

    NASA Astrophysics Data System (ADS)

    Nakano, Takenori

    2012-09-01

    Prof. Hayashi's work on the evolution of stars in the pre-main sequence stage is reviewed. The historical background and the process of finding the Hayashi phase are mentioned. The work on the evolution of low-mass stars is also reviewed including the determination of the bottom of the main sequence and evolution of brown dwarfs, and comparison is made with the other works in the same period.

  8. De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

    PubMed Central

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight of this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. Few genes of the alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community, interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
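
    The RPKM measure mentioned above normalizes a transcript's read count by both transcript length and sequencing depth. A minimal sketch of the standard formula follows; the function name `rpkm` and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (length_in_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

# Illustrative numbers only: 100 reads on a 1,000 bp transcript
# in a library of 1,000,000 mapped reads.
print(rpkm(100, 1000, 1_000_000))
```

    Because the length term is in the denominator, a transcript twice as long with the same read count receives half the RPKM value.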

  9. De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

    PubMed

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight of this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and kyoto encyclopedia of genes and genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. Few genes of the alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community, interested in the molecular genetics and functional genomics of C. borivilianum.

  10. Simple projects guidebook : federal-aid procedure for simple projects

    DOT National Transportation Integrated Search

    2002-06-01

    Experience has shown that a simple project generally 1) does not have any right-of-way involvement and 2) has a Programmatic Categorical Exclusion or Categorical Exclusion environmental determination. Page 7 outlines the definition of simple projects...

  11. A novel diagnostic method for malaria using loop-mediated isothermal amplification (LAMP) and MinION™ nanopore sequencer.

    PubMed

    Imai, Kazuo; Tarumoto, Norihito; Misawa, Kazuhisa; Runtuwene, Lucky Ronald; Sakai, Jun; Hayashida, Kyoko; Eshita, Yuki; Maeda, Ryuichiro; Tuda, Josef; Murakami, Takashi; Maesaki, Shigefumi; Suzuki, Yutaka; Yamagishi, Junya; Maeda, Takuya

    2017-09-13

    A simple and accurate molecular diagnostic method for malaria is urgently needed due to the limitations of conventional microscopic examination. In this study, we demonstrate a new diagnostic procedure for human malaria using loop mediated isothermal amplification (LAMP) and the MinION™ nanopore sequencer. We generated specific LAMP primers targeting the 18S-rRNA gene of all five human Plasmodium species including two P. ovale subspecies (P. falciparum, P. vivax, P. ovale wallikeri, P. ovale curtisi, P. knowlesi and P. malariae) and examined human blood samples collected from 63 malaria patients in Indonesia. Additionally, we performed amplicon sequencing of our LAMP products using MinION™ nanopore sequencer to identify each Plasmodium species. Our LAMP method allowed amplification of all targeted 18S-rRNA genes of the reference plasmids with detection limits of 10-100 copies per reaction. Among the 63 clinical samples, 54 and 55 samples were positive by nested PCR and our LAMP method, respectively. Identification of the Plasmodium species by LAMP amplicon sequencing analysis using the MinION™ was consistent with the reference plasmid sequences and the results of nested PCR. Our diagnostic method combined with LAMP and MinION™ could become a simple and accurate tool for the identification of human Plasmodium species, even in resource-limited situations.

  12. Research Techniques Made Simple: High-Throughput Sequencing of the T-Cell Receptor.

    PubMed

    Matos, Tiago R; de Rie, Menno A; Teunissen, Marcel B M

    2017-06-01

    High-throughput sequencing (HTS) of the T-cell receptor (TCR) is a rapidly advancing technique that allows sensitive and accurate identification and quantification of every distinct T-cell clone present within any biological sample. The relative frequency of each individual clone within the full T-cell repertoire can also be studied. HTS is essential to expand our knowledge on the diversity of the TCR repertoire in homeostasis or under pathologic conditions, as well as to understand the kinetics of antigen-specific T-cell responses that lead to protective immunity (i.e., vaccination) or immune-related disorders (i.e., autoimmunity and cancer). HTS can be tailored for personalized medicine, having the potential to monitor individual responses to therapeutic interventions and show prognostic and diagnostic biomarkers. In this article, we briefly review the methodology, advances, and limitations of HTS of the TCR and describe emerging applications of this technique in the field of investigative dermatology. We highlight studying the pathogenesis of T cells in allergic dermatitis and the application of HTS of the TCR in diagnosing, detecting recurrence early, and monitoring responses to therapy in cutaneous T-cell lymphoma. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  13. A simple physical model for deep moonquake occurrence times

    USGS Publications Warehouse

    Weber, R.C.; Bills, B.G.; Johnson, C.L.

    2010-01-01

    The physical process that results in moonquakes is not yet fully understood. The periodic occurrence times of events from individual clusters are clearly related to tidal stress, but also exhibit departures from the temporal regularity this relationship would seem to imply. Even simplified models that capture some of the relevant physics require a large number of variables. However, a single, easily accessible variable - the time interval I(n) between events - can be used to reveal behavior not readily observed using typical periodicity analyses (e.g., Fourier analyses). The delay-coordinate (DC) map, a particularly revealing way to display data from a time series, is a map of successive intervals: I(n+1) plotted vs. I(n). We use a DC approach to characterize the dynamics of moonquake occurrence. Moonquake-like DC maps can be reproduced by combining sequences of synthetic events that occur with variable probability at tidal periods. Though this model gives a good description of what happens, it has little physical content, thus providing only little insight into why moonquakes occur. We investigate a more mechanistic model. In this study, we present a series of simple models of deep moonquake occurrence, with consideration of both tidal stress and stress drop during events. We first examine the behavior of inter-event times in a delay-coordinate context, and then examine the output, in that context, of a sequence of simple models of tidal forcing and stress relief. We find, as might be expected, that the stress relieved by moonquakes influences their occurrence times. Our models may also provide an explanation for the opposite-polarity events observed at some clusters. © 2010.
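
    The delay-coordinate map described above can be built directly from a list of event times. The sketch below computes the inter-event intervals I(n) and pairs each with its successor I(n+1); the function name and the toy event times are assumptions for illustration, not data from the study.

```python
def delay_coordinate_pairs(event_times):
    """Return (I(n), I(n+1)) pairs from a sorted list of event times."""
    # I(n): time between consecutive events.
    intervals = [t1 - t0 for t0, t1 in zip(event_times, event_times[1:])]
    # Each point of the DC map pairs an interval with the one that follows it.
    return list(zip(intervals, intervals[1:]))

# Toy event times in arbitrary units (hypothetical).
print(delay_coordinate_pairs([0, 2, 5, 9]))
```

    Plotting the second element of each pair against the first gives the DC map; strictly periodic events would collapse to a single point on the diagonal.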

  14. HIV-1 low copy viral sequencing-A prototype assay.

    PubMed

    Mellberg, Tomas; Krabbe, Jon; Gisslén, Magnus; Svennerholm, Bo

    2016-01-01

    In HIV-1 patients with low viral burden, sequencing is often problematic, yet important. This study presents a sensitive, sub-type independent system for sequencing of low level viremia. Sequencing data from 32 HIV-1 infected patients with low level viremia were collected longitudinally. A combination of ViroSeq® HIV-1 Genotyping System and an in-house nesting protocol was used. Eight sub-types were represented. The success-rate of amplification of both PR and RT in the same sample was 100% in samples with viral loads above 100 copies/ml. Below 100 copies/ml, this study managed to amplify both regions in 7/13 (54%) samples. The assays were able to amplify either PR or RT in all sub-types included but one sub-type A specimen. In conclusion, this study presents a promising, simple assay to increase the ability to perform HIV-1 resistance testing at low level viremia. This is a prototype assay and the method needs further testing to evaluate clinical performance.

  15. Use of PCR with Sequence-specific Primers for High-Resolution Human Leukocyte Antigen Typing of Patients with Narcolepsy

    PubMed Central

    Woo, Hye In; Joo, Eun Yeon; Lee, Kyung Wha

    2012-01-01

    Background: Narcolepsy is a neurologic disorder characterized by excessive daytime sleepiness, symptoms of abnormal rapid eye movement (REM) sleep, and a strong association with HLA-DRB1*1501, -DQA1*0102, and -DQB1*0602. Here, we investigated the clinico-physical characteristics of Korean patients with narcolepsy, their HLA types, and the clinical utility of high-resolution PCR with sequence-specific primers (PCR-SSP) as a simple typing method for identifying DRB1*15/16, DQA1, and DQB1 alleles. Methods: The study population consisted of 67 consecutively enrolled patients having unexplained daytime sleepiness and diagnosed narcolepsy based on clinical and neurological findings. Clinical data and the results of the multiple sleep latency test and polysomnography were reviewed, and HLA typing was performed using both high-resolution PCR-SSP and sequence-based typing (SBT). Results: The 44 narcolepsy patients with cataplexy displayed significantly higher frequencies of DRB1*1501 (Pc=0.003), DQA1*0102 (Pc=0.001), and DQB1*0602 (Pc=0.014) than the patients without cataplexy. Among patients carrying DRB1*1501-DQB1*0602 or DQA1*0102, the frequencies of a mean REM sleep latency of less than 20 min in nocturnal polysomnography and clinical findings, including sleep paralysis and hypnagogic hallucination, were significantly higher. SBT and PCR-SSP showed 100% concordance for high-resolution typing of DRB1*15/16 alleles and DQA1 and DQB1 loci. Conclusions: The clinical characteristics and somnographic findings of narcolepsy patients were associated with specific HLA alleles, including DRB1*1501, DQA1*0102, and DQB1*0602. Application of high-resolution PCR-SSP, a reliable and simple method, for both allele- and locus-specific HLA typing of DRB1*15/16, DQA1, and DQB1 would be useful for characterizing clinical status among subjects with narcolepsy. PMID:22259780

  16. Translation, Cultural Adaptation and Validation of the Simple Shoulder Test to Spanish

    PubMed Central

    Arcuri, Francisco; Barclay, Fernando; Nacul, Ivan

    2015-01-01

    Background: The validation of widely used scales facilitates the comparison across international patient samples. Objective: The objective was to translate, culturally adapt and validate the Simple Shoulder Test into Argentinian Spanish. Methods: The Simple Shoulder Test was translated from English into Argentinian Spanish by two independent translators, translated back into English and evaluated for accuracy by an expert committee to correct the possible discrepancies. It was then administered to 50 patients with different shoulder conditions. Psychometric properties were analyzed, including internal consistency, measured with Cronbach's alpha, and test-retest reliability at 15 days, measured with the intraclass correlation coefficient. Results: The internal consistency was a Cronbach's alpha of 0.808, evaluated as good. The test-retest reliability index as measured by the intraclass correlation coefficient (ICC) was 0.835, evaluated as excellent. Conclusion: The Simple Shoulder Test translation and its cultural adaptation to Argentinian Spanish demonstrated adequate internal reliability and validity, ultimately allowing for its use in the comparison with international patient samples.
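
    For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the usual Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the toy response matrix are hypothetical; the study's own data and software are not shown in the record.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items in the questionnaire
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Toy data: three respondents, two items (illustrative only).
print(cronbach_alpha([[1, 1], [2, 3], [3, 2]]))
```

    Values closer to 1 indicate that the items vary together, which is why the abstract reads a value of about 0.8 as good internal consistency.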

  17. Simple Shared Motifs (SSM) in conserved region of promoters: a new approach to identify co-regulation patterns.

    PubMed

    Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie

    2011-09-12

    Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.

  18. Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata.

    PubMed

    Fracassetti, Marco; Griffin, Philippa C; Willi, Yvonne

    2015-01-01

    Sequencing pooled DNA of multiple individuals from a population instead of sequencing individuals separately has become popular due to its cost-effectiveness and simple wet-lab protocol, although some criticism of this approach remains. Here we validated a protocol for pooled whole-genome re-sequencing (Pool-seq) of Arabidopsis lyrata libraries prepared with low amounts of DNA (1.6 ng per individual). The validation was based on comparing single nucleotide polymorphism (SNP) frequencies obtained by pooling with those obtained by individual-based Genotyping By Sequencing (GBS). Furthermore, we investigated the effect of sample number, sequencing depth per individual and variant caller on population SNP frequency estimates. For Pool-seq data, we compared frequency estimates from two SNP callers, VarScan and Snape; the former employs a frequentist SNP calling approach while the latter uses a Bayesian approach. Results revealed concordance correlation coefficients well above 0.8, confirming that Pool-seq is a valid method for acquiring population-level SNP frequency data. Higher accuracy was achieved by pooling more samples (25 compared to 14) and working with higher sequencing depth (4.1× per individual compared to 1.4× per individual), which increased the concordance correlation coefficient to 0.955. The Bayesian-based SNP caller produced somewhat higher concordance correlation coefficients, particularly at low sequencing depth. We recommend pooling at least 25 individuals combined with sequencing at a depth of 100× to produce satisfactory frequency estimates for common SNPs (minor allele frequency above 0.05).
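
    The agreement statistic used above measures how closely two sets of estimates fall on the identity line, penalizing both scatter and systematic offset. The sketch below assumes the common definition due to Lin, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) moments; the record does not state which variant the authors used, and the function name and example frequencies are hypothetical.

```python
def concordance_correlation(x, y):
    """Concordance correlation coefficient between paired estimates x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

# Hypothetical SNP frequencies from pooled vs. individual-based sequencing.
pooled = [0.10, 0.25, 0.40, 0.80]
individual = [0.12, 0.22, 0.45, 0.78]
print(concordance_correlation(pooled, individual))
```

    Unlike the Pearson correlation, this coefficient drops below 1 when one method is consistently shifted relative to the other, even if the two are perfectly linearly related.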

  19. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons.

    PubMed

    Alikhan, Nabil-Fareed; Petty, Nicola K; Ben Zakour, Nouri L; Beatson, Scott A

    2011-08-08

    Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons automatically. There is a clear need for a user

  20. Brain activation during anticipation of sound sequences.

    PubMed

    Leaver, Amber M; Van Lare, Jennifer; Zielinski, Brandon; Halpern, Andrea R; Rauschecker, Josef P

    2009-02-25

    Music consists of sound sequences that require integration over time. As we become familiar with music, associations between notes, melodies, and entire symphonic movements become stronger and more complex. These associations can become so tight that, for example, hearing the end of one album track can elicit a robust image of the upcoming track while anticipating it in total silence. Here, we study this predictive "anticipatory imagery" at various stages throughout learning and investigate activity changes in corresponding neural structures using functional magnetic resonance imaging. Anticipatory imagery (in silence) for highly familiar naturalistic music was accompanied by pronounced activity in rostral prefrontal cortex (PFC) and premotor areas. Examining changes in the neural bases of anticipatory imagery during two stages of learning conditional associations between simple melodies, however, demonstrates the importance of fronto-striatal connections, consistent with a role of the basal ganglia in "training" frontal cortex (Pasupathy and Miller, 2005). Another striking change in neural resources during learning was a shift between caudal PFC earlier to rostral PFC later in learning. Our findings regarding musical anticipation and sound sequence learning are highly compatible with studies of motor sequence learning, suggesting common predictive mechanisms in both domains.

  1. Is simple nephrectomy truly simple? Comparison with the radical alternative.

    PubMed

    Connolly, S S; O'Brien, M Frank; Kunni, I M; Phelan, E; Conroy, R; Thornhill, J A; Grainger, R

    2011-03-01

    The Oxford English Dictionary defines the term "simple" as "easily done" and "uncomplicated". We tested the validity of this terminology in relation to open nephrectomy surgery. We retrospectively reviewed 215 patients undergoing open, simple (n = 89) or radical (n = 126) nephrectomy in a single university-affiliated institution between 1998 and 2002. Operative time (OT), estimated blood loss (EBL), operative complications (OC) and length of stay in hospital (LOS) were analysed. Statistical analysis employed Fisher's exact test and Stata Release 8.2. Simple nephrectomy was associated with shorter OT (mean 126 vs. 144 min; p = 0.002), reduced EBL (mean 729 vs. 859 cc; p = 0.472), lower OC (9 vs. 17%; p = 0.087), and shorter LOS (mean 6 vs. 8 days; p < 0.001). All parameters suggest favourable outcome for the simple nephrectomy group, supporting the use of this terminology. This implies "simple" nephrectomies are truly easier to perform with less complication than their radical counterpart.

  2. SIMPLE: An Introduction.

    ERIC Educational Resources Information Center

    Endres, Frank L.

    Symbolic Interactive Matrix Processing Language (SIMPLE) is a conversational matrix-oriented source language suited to a batch or a time-sharing environment. The two modes of operation of SIMPLE are conversational mode and programing mode. This program uses a TAURUS time-sharing system and cathode ray terminals or teletypes. SIMPLE performs all…

  3. Separating Putative Pathogens from Background Contamination with Principal Orthogonal Decomposition: Evidence for Leptospira in the Ugandan Neonatal Septisome

    PubMed Central

    Schiff, Steven J.; Kiwanuka, Julius; Riggio, Gina; Nguyen, Lan; Mu, Kevin; Sproul, Emily; Bazira, Joel; Mwanga-Amumpaire, Juliet; Tumusiime, Dickson; Nyesigire, Eunice; Lwanga, Nkangi; Bogale, Kaleb T.; Kapur, Vivek; Broach, James R.; Morton, Sarah U.; Warf, Benjamin C.; Poss, Mary

    2016-01-01

    Neonatal sepsis (NS) is responsible for over 1 million yearly deaths worldwide. In the developing world, NS is often treated without an identified microbial pathogen. Amplicon sequencing of the bacterial 16S rRNA gene can be used to identify organisms that are difficult to detect by routine microbiological methods. However, contaminating bacteria are ubiquitous in both hospital settings and research reagents and must be accounted for to make effective use of these data. In this study, we sequenced the bacterial 16S rRNA gene obtained from blood and cerebrospinal fluid (CSF) of 80 neonates presenting with NS to the Mbarara Regional Hospital in Uganda. Assuming that patterns of background contamination would be independent of pathogenic microorganism DNA, we applied a novel quantitative approach using principal orthogonal decomposition to separate background contamination from potential pathogens in sequencing data. We designed our quantitative approach contrasting blood, CSF, and control specimens and employed a variety of statistical random matrix bootstrap hypotheses to estimate statistical significance. These analyses demonstrate that Leptospira appears present in some infants presenting within 48 h of birth, indicative of infection in utero, and up to 28 days of age, suggesting environmental exposure. This organism cannot be cultured in routine bacteriological settings and is enzootic in the cattle that often live in close proximity to the rural peoples of western Uganda. Our findings demonstrate that statistical approaches to remove background organisms common in 16S sequence data can reveal putative pathogens in small volume biological samples from newborns. This computational analysis thus reveals an important medical finding that has the potential to alter therapy and prevention efforts in a critically ill population. PMID:27379237

  4. The Influence of Parental Background on Students' Academic Performance in Physics in WASSCS 2000-2005

    ERIC Educational Resources Information Center

    Ebong, Samuel T.

    2015-01-01

    The study investigated the influence of parental background on students' academic performance in secondary schools in Abak local government, Akwa Ibom State, Nigeria. A survey design was adopted for the study. One thousand four hundred and forty (1440) senior secondary three (SS3) Physics students were drawn by simple random sampling from 12 Schools, six (6) each…

  5. The Complexity of Background Clutter Affects Nectar Bat Use of Flower Odor and Shape Cues.

    PubMed

    Muchhala, Nathan; Serrano, Diana

    2015-01-01

    Given their small size and high metabolism, nectar bats need to be able to quickly locate flowers during foraging bouts. Chiropterophilous plants depend on these bats for their reproduction, thus they also benefit if their flowers can be easily located, and we would expect that floral traits such as odor and shape have evolved to maximize detection by bats. However, relatively little is known about the importance of different floral cues during foraging bouts. In the present study, we undertook a set of flight cage experiments with two species of nectar bats (Anoura caudifer and A. geoffroyi) and artificial flowers to compare the importance of shape and scent cues in locating flowers. In a training phase, a bat was presented an artificial flower with a given shape and scent, whose position was constantly shifted to prevent reliance on spatial memory. In the experimental phase, two flowers were presented, one with the training-flower scent and one with the training-flower shape. For each experimental repetition, we recorded which flower was located first, and then shifted flower positions. Additionally, experiments were repeated in a simple environment, without background clutter, or a complex environment, with a background of leaves and branches. Results demonstrate that bats visit either flower indiscriminately with simple backgrounds, with no significant difference in terms of whether they visit the training-flower odor or training-flower shape first. However, in a complex background olfaction was the most important cue; scented flowers were consistently located first. This suggests that for well-exposed flowers, without obstruction from clutter, vision and/or echolocation are sufficient in locating them. In more complex backgrounds, nectar bats depend more heavily on olfaction during foraging bouts.

  6. The Complexity of Background Clutter Affects Nectar Bat Use of Flower Odor and Shape Cues

    PubMed Central

    Muchhala, Nathan; Serrano, Diana

    2015-01-01

    Given their small size and high metabolism, nectar bats need to be able to quickly locate flowers during foraging bouts. Chiropterophilous plants depend on these bats for their reproduction, thus they also benefit if their flowers can be easily located, and we would expect that floral traits such as odor and shape have evolved to maximize detection by bats. However, relatively little is known about the importance of different floral cues during foraging bouts. In the present study, we undertook a set of flight cage experiments with two species of nectar bats (Anoura caudifer and A. geoffroyi) and artificial flowers to compare the importance of shape and scent cues in locating flowers. In a training phase, a bat was presented an artificial flower with a given shape and scent, whose position was constantly shifted to prevent reliance on spatial memory. In the experimental phase, two flowers were presented, one with the training-flower scent and one with the training-flower shape. For each experimental repetition, we recorded which flower was located first, and then shifted flower positions. Additionally, experiments were repeated in a simple environment, without background clutter, or a complex environment, with a background of leaves and branches. Results demonstrate that bats visit either flower indiscriminately with simple backgrounds, with no significant difference in terms of whether they visit the training-flower odor or training-flower shape first. However, in a complex background olfaction was the most important cue; scented flowers were consistently located first. This suggests that for well-exposed flowers, without obstruction from clutter, vision and/or echolocation are sufficient in locating them. In more complex backgrounds, nectar bats depend more heavily on olfaction during foraging bouts. PMID:26445216

  7. High-throughput sequencing of forensic genetic samples using punches of FTA cards with buccal swabs.

    PubMed

    Kampmann, Marie-Louise; Buchard, Anders; Børsting, Claus; Morling, Niels

    2016-01-01

    Here, we demonstrate that punches from buccal swab samples preserved on FTA cards can be used for high-throughput DNA sequencing, also known as massively parallel sequencing (MPS). We typed 44 reference samples with the HID-Ion AmpliSeq Identity Panel using washed 1.2 mm punches from FTA cards with buccal swabs and compared the results with those obtained with DNA extracted using the EZ1 DNA Investigator Kit. Concordant profiles were obtained for all samples. Our protocol includes simple punch, wash, and PCR steps, reducing cost and hands-on time in the laboratory. Furthermore, it facilitates automation of DNA sequencing.

  8. The span of correlations in dolphin whistle sequences

    NASA Astrophysics Data System (ADS)

    Ferrer-i-Cancho, Ramon; McCowan, Brenda

    2012-06-01

    Long-range correlations are found in symbolic sequences from human language, music and DNA. Determining the span of correlations in dolphin whistle sequences is crucial for shedding light on their communicative complexity. Dolphin whistles share various statistical properties with human words, i.e. Zipf's law for word frequencies (namely that the probability of the ith most frequent word of a text is about i^(-α)) and a parallel of the tendency of more frequent words to have more meanings. The finding of Zipf's law for word frequencies in dolphin whistles has been the topic of an intense debate on its implications. One of the major arguments against the relevance of Zipf's law in dolphin whistles is that it is not possible to distinguish the outcome of a die-rolling experiment from that of a linguistic or communicative source producing Zipf's law for word frequencies. Here we show that statistically significant whistle-whistle correlations extend back to the second previous whistle in the sequence, using a global randomization test, and to the fourth previous whistle, using a local randomization test. None of these correlations are expected by a die-rolling experiment and other simple explanations of Zipf's law for word frequencies, such as Simon's model, that produce sequences of unpredictable elements.
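    The form of Zipf's law quoted in the abstract above, in which the probability of the i-th most frequent type is proportional to i raised to the power -α, can be illustrated with a short sketch. The function name, the vocabulary size and the default exponent are assumptions chosen for illustration and are not values from the study.

```python
def zipf_probabilities(n_types, alpha=1.0):
    """Return normalized probabilities p(i) proportional to i**(-alpha), i = 1..n_types."""
    if n_types < 1:
        raise ValueError("n_types must be at least 1")
    weights = [rank ** -alpha for rank in range(1, n_types + 1)]
    total = sum(weights)
    # Dividing by the total turns the power-law weights into a distribution.
    return [w / total for w in weights]


# Hypothetical repertoire of 5 whistle types with exponent alpha = 1.
probs = zipf_probabilities(5, alpha=1.0)
```

    With α = 1, the most frequent type is twice as likely as the second and three times as likely as the third.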

  9. Genetic analysis and association of simple sequence repeat markers with storage root yield, dry matter, starch and β-carotene content in sweetpotato

    PubMed Central

    Yada, Benard; Brown-Guedira, Gina; Alajo, Agnes; Ssemakula, Gorrettie N.; Owusu-Mensah, Eric; Carey, Edward E.; Mwanga, Robert O.M.; Yencho, G. Craig

    2017-01-01

    Molecular markers are needed for enhancing the development of elite sweetpotato (Ipomoea batatas (L.) Lam) cultivars with a wide range of commercially important traits in sub-Saharan Africa. This study was conducted to estimate the heritability and determine trait correlations of storage root yield, dry matter, starch and β-carotene content in a cross between ‘New Kawogo’ × ‘Beauregard’. The study was also conducted to identify simple sequence repeat (SSR) markers associated with these traits. A total of 287 progeny and the parents were evaluated for two seasons at three sites in Uganda and genotyped with 250 SSR markers. Broad sense heritability (H²) for storage root yield, dry matter, starch and β-carotene content were 0.24, 0.68, 0.70 and 0.90, respectively. Storage root β-carotene content was negatively correlated with dry matter (r = −0.59, P < 0.001) and starch (r = −0.93, P < 0.001) content, while storage root yield was positively correlated with dry matter (r = 0.57, P = 0.029) and starch (r = 0.41, P = 0.008) content. Through logistic regression, a total of 12, 4, 6 and 8 SSR markers were associated with storage root yield, dry matter, starch and β-carotene content, respectively. The SSR markers used in this study may be useful for quantitative trait loci analysis and selection for these traits in future. PMID:28588391

  10. Construction and sequence sampling of deep-coverage, large-insert BAC libraries for three model lepidopteran species

    PubMed Central

    Wu, Chengcang; Proestou, Dina; Carter, Dorothy; Nicholson, Erica; Santos, Filippe; Zhao, Shaying; Zhang, Hong-Bin; Goldsmith, Marian R

    2009-01-01

    Background Manduca sexta, Heliothis virescens, and Heliconius erato represent three widely-used insect model species for genomic and fundamental studies in Lepidoptera. Large-insert BAC libraries of these insects are critical resources for many molecular studies, including physical mapping and genome sequencing, but not available to date. Results We report the construction and characterization of six large-insert BAC libraries for the three species and sampling sequence analysis of the genomes. The six BAC libraries were constructed with two restriction enzymes, two libraries for each species, and each has an average clone insert size ranging from 152–175 kb. We estimated that the genome coverage of each library ranged from 6–9 ×, with the two combined libraries of each species being equivalent to 13.0–16.3 × haploid genomes. The genome coverage, quality and utility of the libraries were further confirmed by library screening using 6~8 putative single-copy probes. To provide a first glimpse into these genomes, we sequenced and analyzed the BAC ends of ~200 clones randomly selected from the libraries of each species. The data revealed that the genomes are AT-rich, contain relatively small fractions of repeat elements with a majority belonging to the category of low complexity repeats, and are more abundant in retro-elements than DNA transposons. Among the species, the H. erato genome is somewhat more abundant in repeat elements and simple repeats than those of M. sexta and H. virescens. The BLAST analysis of the BAC end sequences suggested that the evolution of the three genomes is widely varied, with the genome of H. virescens being the most conserved as a typical lepidopteran, whereas both genomes of H. erato and M. sexta appear to have evolved significantly, resulting in a higher level of species- or evolutionary lineage-specific sequences. Conclusion The high-quality and large-insert BAC libraries of the insects, together with the identified BACs

  11. Prediction of Human Activity by Discovering Temporal Sequence Patterns.

    PubMed

    Li, Kang; Fu, Yun

    2014-08-01

    Early prediction of ongoing human activity has become more valuable in a large variety of time-critical applications. To build an effective representation for prediction, human activities can be characterized by a complex temporal composition of constituent simple actions and interacting objects. Different from early detection on short-duration simple actions, we propose a novel framework for long-duration complex activity prediction by discovering three key aspects of activity: Causality, Context-cue, and Predictability. The major contributions of our work include: (1) a general framework is proposed to systematically address the problem of complex activity prediction by mining temporal sequence patterns; (2) probabilistic suffix tree (PST) is introduced to model causal relationships between constituent actions, where both large and small order Markov dependencies between action units are captured; (3) the context-cue, especially interactive objects information, is modeled through sequential pattern mining (SPM), where a series of action and object co-occurrence are encoded as a complex symbolic sequence; (4) we also present a predictive accumulative function (PAF) to depict the predictability of each kind of activity. The effectiveness of our approach is evaluated on two experimental scenarios with two data sets for each: action-only prediction and context-aware prediction. Our method achieves superior performance for predicting global activity classes and local action units.

  12. Single-cell whole exome and targeted sequencing in NPM1/FLT3 positive pediatric acute myeloid leukemia.

    PubMed

    Walter, Christiane; Pozzorini, Christian; Reinhardt, Katarina; Geffers, Robert; Xu, Zhenyu; Reinhardt, Dirk; von Neuhoff, Nils; Hanenberg, Helmut

    2018-02-01

    The small portion of leukemic stem cells (LSCs) in acute myeloid leukemia (AML) present in children and adolescents is often masked by the high background of AML blasts and normal hematopoietic cells. The aim of the current study was to establish a simple workflow for reliable genetic analysis of single LSC-enriched blasts from pediatric patients. For three AMLs with mutations in nucleophosmin 1 and/or fms-like tyrosine kinase 3, we performed whole genome amplification on sorted single-cell DNA followed by whole exome sequencing (WES). The corresponding bulk bone marrow DNAs were also analyzed by WES and by targeted sequencing (TS) that included 54 genes associated with myeloid malignancies. Analysis revealed that read coverage statistics were comparable between single-cell and bulk WES data, indicating high-quality whole genome amplification. From 102 single-cell variants, 72 single nucleotide variants and insertions or deletions (70%) were consistently found in the two bulk DNA analyses. Variants reliably detected in single cells were also present in TS. However, initial screening by WES with read counts between 50-72× failed to detect rare AML subclones in the bulk DNAs. In summary, our study demonstrated that single-cell WES combined with bulk DNA TS is a promising tool set for detecting AML subclones and possibly LSCs. © 2017 Wiley Periodicals, Inc.

  13. Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

    PubMed Central

    2012-01-01

    Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR, DArT, have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on a F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA fingerprinting method for

  14. Meteor localization via statistical analysis of spatially temporal fluctuations in image sequences

    NASA Astrophysics Data System (ADS)

    Kukal, Jaromír.; Klimt, Martin; Šihlík, Jan; Fliegel, Karel

    2015-09-01

    Meteor detection is one of the most important procedures in astronomical imaging. The meteor path in Earth's atmosphere is traditionally reconstructed from a double-station video observation system generating 2D image sequences. However, atmospheric turbulence and other factors cause spatio-temporal fluctuations of the image background, which make localization of the meteor path more difficult. Our approach is based on nonlinear preprocessing of image intensity using the Box-Cox transform, with the logarithmic transform as a particular case. The transformed image sequences are then differentiated along discrete coordinates to obtain a statistical description of sky background fluctuations, which can be modeled by a multivariate normal distribution. After verification and hypothesis testing, we use the statistical model for outlier detection. While isolated outlier points are ignored, a compact cluster of outliers indicates the presence of meteoroids after ignition.
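    The Box-Cox transform mentioned in the abstract above, with the logarithm as its limiting case, can be sketched as follows. This uses the textbook one-parameter definition. The function name and the sample intensity values are hypothetical, and the abstract does not specify which parameter values the authors used.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox is defined for positive values only")
    if lam == 0:
        # The logarithm is the limit of (x**lam - 1) / lam as lam -> 0.
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical pixel intensities, transformed with lam = 0 (plain logarithm).
intensities = [12.0, 40.0, 255.0]
log_intensities = [box_cox(v, 0) for v in intensities]
```

    Setting lam = 1 only shifts the data by one unit, so the choice of lam controls how strongly large intensities are compressed.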

  15. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA—genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  16. High-throughput analysis of T-DNA location and structure using sequence capture

    DOE PAGES

    Inagaki, Soichi; Henry, Isabelle M.; Lieberman, Meric C.; ...

    2015-10-07

    Agrobacterium-mediated transformation of plants with T-DNA is used both to introduce transgenes and for mutagenesis. Conventional approaches used to identify the genomic location and the structure of the inserted T-DNA are laborious and high-throughput methods using next-generation sequencing are being developed to address these problems. Here, we present a cost-effective approach that uses sequence capture targeted to the T-DNA borders to select genomic DNA fragments containing T-DNA—genome junctions, followed by Illumina sequencing to determine the location and junction structure of T-DNA insertions. Multiple probes can be mixed so that transgenic lines transformed with different T-DNA types can be processed simultaneously, using a simple, index-based pooling approach. We also developed a simple bioinformatic tool to find sequence read pairs that span the junction between the genome and T-DNA or any foreign DNA. We analyzed 29 transgenic lines of Arabidopsis thaliana, each containing inserts from 4 different T-DNA vectors. We determined the location of T-DNA insertions in 22 lines, 4 of which carried multiple insertion sites. Additionally, our analysis uncovered a high frequency of unconventional and complex T-DNA insertions, highlighting the needs for high-throughput methods for T-DNA localization and structural characterization. Transgene insertion events have to be fully characterized prior to use as commercial products. As a result, our method greatly facilitates the first step of this characterization of transgenic plants by providing an efficient screen for the selection of promising lines.

  17. A Gibbs sampler for motif detection in phylogenetically close sequences

    NASA Astrophysics Data System (ADS)

    Siddharthan, Rahul; van Nimwegen, Erik; Siggia, Eric

    2004-03-01

    Genes are regulated by transcription factors that bind to DNA upstream of genes and recognize short conserved "motifs" in a random intergenic "background". Motif-finders such as the Gibbs sampler compare the probability of these short sequences being represented by "weight matrices" to the probability of their arising from the background "null model", and explore this space (analogous to a free-energy landscape). But closely related species may show conservation not because of functional sites but simply because they have not had sufficient time to diverge, so conventional methods will fail. We introduce a new Gibbs sampler algorithm that accounts for common ancestry when searching for motifs, while requiring minimal "prior" assumptions on the number and types of motifs, assessing the significance of detected motifs by "tracking" clusters that stay together. We apply this scheme to motif detection in sporulation-cycle genes in the yeast S. cerevisiae, using recent sequences of other closely-related Saccharomyces species.

  18. A Simple Artificial Life Model Explains Irrational Behavior in Human Decision-Making

    PubMed Central

    Feher da Silva, Carolina; Baldo, Marcus Vinícius Chrysóstomo

    2012-01-01

    Although praised for their rationality, humans often make poor decisions, even in simple situations. In the repeated binary choice experiment, an individual has to choose repeatedly between the same two alternatives, where a reward is assigned to one of them with fixed probability. The optimal strategy is to perseverate with choosing the alternative with the best expected return. Whereas many species perseverate, humans tend to match the frequencies of their choices to the frequencies of the alternatives, a sub-optimal strategy known as probability matching. Our goal was to find the primary cognitive constraints under which a set of simple evolutionary rules can lead to such contrasting behaviors. We simulated the evolution of artificial populations, wherein the fitness of each animat (artificial animal) depended on its ability to predict the next element of a sequence made up of a repeating binary string of varying size. When the string was short relative to the animats’ neural capacity, they could learn it and correctly predict the next element of the sequence. When it was long, they could not learn it, turning to the next best option: to perseverate. Animats from the last generation then performed the task of predicting the next element of a non-periodical binary sequence. We found that, whereas animats with smaller neural capacity kept perseverating with the best alternative as before, animats with larger neural capacity, which had previously been able to learn the pattern of repeating strings, adopted probability matching, being outperformed by the perseverating animats. Our results demonstrate how the ability to make predictions in an environment endowed with regular patterns may lead to probability matching under less structured conditions. They point to probability matching as a likely by-product of adaptive cognitive strategies that were crucial in human evolution, but may lead to sub-optimal performances in other environments. PMID:22563454
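    The abstract above calls probability matching sub-optimal compared with perseverating on the better alternative. A minimal worked sketch of that claim follows, assuming independent trials in which one alternative is rewarded with fixed probability p. The function names and the value p = 0.7 are illustrative assumptions and do not come from the paper.

```python
def accuracy_perseverating(p):
    """Expected fraction of correct predictions when always choosing the likelier alternative."""
    return max(p, 1.0 - p)


def accuracy_matching(p):
    """Expected fraction correct when choosing each alternative as often as it is rewarded."""
    # Correct when both choice and reward fall on alternative A (p * p)
    # or both fall on alternative B ((1 - p) * (1 - p)).
    return p * p + (1.0 - p) * (1.0 - p)


p = 0.7  # hypothetical reward probability for the better alternative
persevere = accuracy_perseverating(p)
match = accuracy_matching(p)
```

    For p = 0.7, perseverating is correct on 70% of trials and matching on about 58%; the two strategies tie only when p is 0, 0.5 or 1.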

  19. A simple artificial life model explains irrational behavior in human decision-making.

    PubMed

    Feher da Silva, Carolina; Baldo, Marcus Vinícius Chrysóstomo

    2012-01-01

    Although praised for their rationality, humans often make poor decisions, even in simple situations. In the repeated binary choice experiment, an individual has to choose repeatedly between the same two alternatives, where a reward is assigned to one of them with fixed probability. The optimal strategy is to perseverate with choosing the alternative with the best expected return. Whereas many species perseverate, humans tend to match the frequencies of their choices to the frequencies of the alternatives, a sub-optimal strategy known as probability matching. Our goal was to find the primary cognitive constraints under which a set of simple evolutionary rules can lead to such contrasting behaviors. We simulated the evolution of artificial populations, wherein the fitness of each animat (artificial animal) depended on its ability to predict the next element of a sequence made up of a repeating binary string of varying size. When the string was short relative to the animats' neural capacity, they could learn it and correctly predict the next element of the sequence. When it was long, they could not learn it, turning to the next best option: to perseverate. Animats from the last generation then performed the task of predicting the next element of a non-periodical binary sequence. We found that, whereas animats with smaller neural capacity kept perseverating with the best alternative as before, animats with larger neural capacity, which had previously been able to learn the pattern of repeating strings, adopted probability matching, being outperformed by the perseverating animats. Our results demonstrate how the ability to make predictions in an environment endowed with regular patterns may lead to probability matching under less structured conditions. They point to probability matching as a likely by-product of adaptive cognitive strategies that were crucial in human evolution, but may lead to sub-optimal performances in other environments.
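    As a rough illustration of why probability matching is sub-optimal in the repeated binary choice task, the sketch below compares it with perseveration (always choosing the more likely alternative). It is not the authors' animat simulation; the function names and the reward probability of 0.7 are assumptions chosen for the example. If one alternative is rewarded with probability p, perseverating is correct with probability max(p, 1 - p), whereas matching is correct with probability p^2 + (1 - p)^2.

```python
import random


def expected_accuracy(p, strategy):
    """Expected fraction of correct predictions when option 1 is rewarded with probability p."""
    if strategy == "maximize":   # perseverate on the more likely option
        return max(p, 1 - p)
    if strategy == "match":      # choose option 1 with probability p
        return p * p + (1 - p) * (1 - p)
    raise ValueError(f"unknown strategy: {strategy}")


def simulate(p, strategy, trials=100_000, seed=0):
    """Monte Carlo check of expected_accuracy (hypothetical helper)."""
    rng = random.Random(seed)
    best = 1 if p >= 0.5 else 0
    correct = 0
    for _ in range(trials):
        outcome = 1 if rng.random() < p else 0
        if strategy == "maximize":
            choice = best
        else:
            choice = 1 if rng.random() < p else 0
        correct += int(choice == outcome)
    return correct / trials


if __name__ == "__main__":
    for strategy in ("maximize", "match"):
        print(strategy, expected_accuracy(0.7, strategy), simulate(0.7, strategy))
```

    With p = 0.7 the closed-form accuracies are 0.70 for perseveration and 0.58 for matching, which is the performance gap the abstract refers to.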

  20. An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

    PubMed Central

    Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

    2004-01-01

    Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051

  1. Multi-location wheat stripe rust QTL analysis: genetic background and epistatic interactions.

    PubMed

    Vazquez, M Dolores; Zemetra, Robert; Peterson, C James; Chen, Xianming M; Heesacker, Adam; Mundt, Christopher C

    2015-07-01

    Epistasis and genetic background were important influences on expression of stripe rust resistance in two wheat RIL populations, one with resistance conditioned by two major genes and the other conditioned by several minor QTL. Stripe rust is a foliar disease of wheat (Triticum aestivum L.) caused by the air-borne fungus Puccinia striiformis f. sp. tritici and is present in most regions around the world where commercial wheat is grown. Breeding for durable resistance to stripe rust continues to be a priority, but also is a challenge due to the complexity of interactions among resistance genes and to the wide diversity and continuous evolution of the pathogen races. The goal of this study was to detect chromosomal regions for resistance to stripe rust in two winter wheat populations, 'Tubbs'/'NSA-98-0995' (T/N) and 'Einstein'/'Tubbs' (E/T), evaluated across seven environments and mapped with diversity array technology and simple sequence repeat markers covering polymorphic regions of ≈1480 and 1117 cM, respectively. Analysis of variance for phenotypic data revealed significant (P < 0.01) genotypic differentiation for stripe rust among the recombinant inbred lines. Results for quantitative trait loci/locus (QTL) analysis in the E/T population indicated that two major QTL located in chromosomes 2AS and 6AL, with epistatic interaction between them, were responsible for the main phenotypic response. For the T/N population, eight QTL were identified, with those in chromosomes 2AL and 2BL accounting for the largest percentage of the phenotypic variance.

  2. Modeling read counts for CNV detection in exome sequencing data.

    PubMed

    Love, Michael I; Myšičková, Alena; Sun, Ruping; Kalscheuer, Vera; Vingron, Martin; Haas, Stefan A

    2011-11-08

    Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.

  3. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences

    PubMed Central

    2012-01-01

    Background The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from the Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch, respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS for repeated update and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920

  4. Generative technique for dynamic infrared image sequences

    NASA Astrophysics Data System (ADS)

    Zhang, Qian; Cao, Zhiguo; Zhang, Tianxu

    2001-09-01

    This paper discusses a technique for generating dynamic infrared images. Because an infrared sensor differs from a CCD camera in its imaging mechanism, it forms the infrared image by intercepting the infrared radiation of the scene (including target and background). The infrared imaging sensor is strongly affected by atmospheric radiation, environmental radiation, and the attenuation of radiation during atmospheric transfer. The paper therefore first analyzes the influence of each kind of radiation on imaging and provides the formula for calculating the radiation; the passive scene and the active scene are analyzed separately. The calculation methods for the passive scene are then provided, and the functions of the scene model, the atmospheric transmission model, and the material physical attribute databases are explained. Second, based on the infrared imaging model, the design idea, the implementation approach, and the software framework for simulating infrared image sequences on an SGI workstation are introduced. Third, following this design, an example of simulated infrared image sequences is presented, with the sea and sky as background, a warship as target, and an aircraft as the eye point. Finally, the simulation is evaluated as a whole and an improvement scheme is presented.

  5. Can you sequence ecology? Metagenomics of adaptive diversification.

    PubMed

    Marx, Christopher J

    2013-01-01

    Few areas of science have benefited more from the expansion in sequencing capability than the study of microbial communities. Can sequence data, besides providing hypotheses of the functions the members possess, detect the evolutionary and ecological processes that are occurring? For example, can we determine if a species is adapting to one niche, or if it is diversifying into multiple specialists that inhabit distinct niches? Fortunately, adaptation of populations in the laboratory can serve as a model to test our ability to make such inferences about evolution and ecology from sequencing. Even adaptation to a single niche can give rise to complex temporal dynamics due to the transient presence of multiple competing lineages. If there are multiple niches, this complexity is augmented by segmentation of the population into multiple specialists that can each continue to evolve within their own niche. For a known example of parallel diversification that occurred in the laboratory, sequencing data gave surprisingly few obvious, unambiguous signs of the ecological complexity present. Whereas experimental systems are open to direct experimentation to test hypotheses of selection or ecological interaction, the difficulty in "seeing ecology" from sequencing for even such a simple system suggests translation to communities like the human microbiome will be quite challenging. This will require both improved empirical methods to enhance the depth and time resolution for the relevant polymorphisms and novel statistical approaches to rigorously examine time-series data for signs of various evolutionary and ecological phenomena within and between species.

  6. Rapid and Easy Protocol for Quantification of Next-Generation Sequencing Libraries.

    PubMed

    Hawkins, Steve F C; Guest, Paul C

    2018-01-01

    The emergence of next-generation sequencing (NGS) over the last 10 years has increased the efficiency of DNA sequencing in terms of speed, ease, and price. However, the exact quantification of an NGS library is crucial in order to obtain good data on sequencing platforms developed by the current market leader Illumina. Different approaches for DNA quantification are currently available, and the most commonly used are based on analysis of the physical properties of the DNA through spectrophotometric or fluorometric methods. Although these methods are technically simple, they do not allow exact quantification as can be achieved using a real-time quantitative PCR (qPCR) approach. A qPCR protocol for DNA quantification with applications in NGS library preparation studies is presented here. This can be applied in various fields of study such as medical disorders resulting from nutritional programming disturbances.

  7. Crystal structure of simple metals at high pressures

    NASA Astrophysics Data System (ADS)

    Degtyareva, Olga

    2010-09-01

    The effects of pressure on the crystal structure of simple (or sp-) elements are analysed in terms of changes in coordination number, packing density, and interatomic distances, and general rules are established. In the polyvalent elements from groups 14-17, the covalently bonded structures tend to transform to metallic phases with a gradual increase in coordination number and packing density, a behaviour normally expected under pressure. Group 1 and 2 metallic elements, however, show a reverse trend towards structures with low packing density due to intricate changes in their electronic structure. Complex crystal structures such as host-guest and incommensurately modulated structures found in these elements are given special attention in this review in an attempt to determine their role in the observed phase-transition sequences.

  8. [Analysis of MAT1A gene mutations in a child affected with simple hypermethioninemia].

    PubMed

    Sun, Yun; Ma, Dingyuan; Wang, Yanyun; Yang, Bin; Jiang, Tao

    2017-02-10

    To detect potential mutations of the MAT1A gene in a child suspected of having simple hypermethioninemia on MS/MS neonatal screening. Clinical data of the child were collected. Genomic DNA was extracted by a standard method and subjected to targeted sequencing using an Ion AmpliSeq Inherited Disease Panel. Detected mutations were verified by Sanger sequencing. The child showed no clinical features except elevated methionine. A novel compound mutation of the MAT1A gene, i.e., c.345delA and c.529C>T, was identified in the child. His father and mother were found to be heterozygous for the c.345delA mutation and the c.529C>T mutation, respectively. The compound mutation c.345delA and c.529C>T of the MAT1A gene probably underlies the disease in the child. Semiconductor sequencing provides an important means for the diagnosis of hereditary diseases.

  9. Chemical purification of lanthanides for low-background experiments

    NASA Astrophysics Data System (ADS)

    Boiko, R. S.

    2017-10-01

    There are many potentially active isotopes among the lanthanide elements that can be used in low-background experiments to search for double β decay and dark matter and to investigate rare α and β decays. These kinds of experiments require a very low level of radioactive contamination, but commercially available lanthanide compounds are always contaminated by uranium, thorium, radium, potassium, etc. A simple chemical method based on liquid-liquid extraction has been applied to purify CeO2, Nd2O3 and Gd2O3 of radioactive traces. Detailed schemes of the purification procedure are described. Measurements using HPGe spectrometry demonstrate high efficiency, with K, Ra, Th and U contamination reduced by at least one order of magnitude.

  10. Enumerating viruses by using fluorescence and the nature of the nonviral background fraction.

    PubMed

    Pollard, Peter C

    2012-09-01

    Bulk fluorescence measurements could be a faster and cheaper way of enumerating viruses than epifluorescence microscopy, flow cytometry, or transmission electron microscopy (TEM). However, since viruses are not imaged, the background fluorescence compromises the signal, and we know little about its nature. In this paper the size ranges of nucleotides that fluoresce in the presence of SYBR gold were determined for wastewater and a range of freshwater samples using a differential filtration method. Fluorescence excitation-emission matrices (FEEMs) showed that >70% of the SYBR fluorescence was in the <10-nm size fraction (background) and was not associated with intact viruses. This was confirmed using TEM. The use of FEEMs to develop a fluorescence-based method for counting viruses is an approach that is fundamentally different from the epifluorescence microscopy technique used for enumerating viruses. This high fluorescence background is currently overlooked, yet it has had a most pervasive influence on the development of a simple fluorescence-based method for quantifying viral abundance in water.

  11. A Simple Method to Determine the "R" or "S" Configuration of Molecules with an Axis of Chirality

    ERIC Educational Resources Information Center

    Wang, Cunde; Wu, Weiming

    2011-01-01

    A simple method for the "R" or "S" designation of molecules with an axis of chirality is described. The method involves projection of the substituents along the chiral axis, utilizes the Cahn-Ingold-Prelog sequence rules in assigning priority to the substituents, is easy to use, and has broad applicability. (Contains 5 figures.)

  12. SEQUENCING of TSUNAMI WAVES: Why the first wave is not always the largest?

    NASA Astrophysics Data System (ADS)

    Synolakis, C.; Okal, E.

    2016-12-01

    We discuss what contributes to the 'sequencing' of tsunami waves in the far field, that is, to the distribution of the maximum sea surface amplitude inside the dominant wave packet constituting the primary arrival at a distant harbour. Based on simple models of sources for which analytical solutions are available, we show that, as range is increased, the wave pattern evolves from a regime of maximum amplitude in the first oscillation to one of delayed maximum, where the largest amplitude takes place during a subsequent oscillation. In the case of the simple, instantaneous uplift of a circular disk at the surface of an ocean of constant depth, the critical distance for transition between those patterns scales as r0^3/h^2, where r0 is the radius of the disk and h the depth of the ocean. This behaviour is explained from simple arguments based on a model where sequencing results from frequency dispersion in the primary wave packet, as the width of its spectrum around its dominant period T0 becomes dispersed in time in an amount comparable to T0, the latter being controlled by a combination of source size and ocean depth. The general concepts in this model are confirmed in the case of more realistic sources for tsunami excitation by a finite-time deformation of the ocean floor, as well as in real-life simulations of tsunamis excited by large subduction events, for which we find that the influence of fault width on the distribution of sequencing is more important than that of fault length. Finally, simulation of the major events of Chile (2010) and Japan (2011) at large arrays of virtual gauges in the Pacific Basin correctly predicts the majority of the sequencing patterns observed on DART buoys during these events. By providing insight into the evolution with time of wave amplitudes inside primary wave packets for far field tsunamis generated by large earthquakes, our results stress the importance, for civil defense authorities, of issuing warning and evacuation orders

  13. Nullomers and High Order Nullomers in Genomic Sequences

    PubMed Central

    Vergni, Davide; Santoni, Daniele

    2016-01-01

    A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, asking whether their absence in genomes has a purely statistical explanation or is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results, we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless, antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content, but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally, the study of the mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, even when CpG islands are taken into account, seems insufficient to explain the nullomer phenomenon.
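    The definition above can be made concrete with a small sketch that enumerates the absent k-mers of a sequence. It is only an illustration, not the authors' pipeline: the function names are hypothetical, and "mutated sequences" is read here as all single-base substitutions of a word, which is an assumption about the abstract's definition of high order nullomers. Exhaustive enumeration of all 4^k words is practical only for small k.

```python
from itertools import product

ALPHABET = "ACGT"


def kmers_present(seq, k):
    """Set of k-mers occurring in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    return {w for w in windows if set(w) <= set(ALPHABET)}


def nullomers(seq, k):
    """Sorted list of k-mers over ACGT that never occur in seq."""
    present = kmers_present(seq, k)
    words = ("".join(p) for p in product(ALPHABET, repeat=k))
    return sorted(w for w in words if w not in present)


def high_order_nullomers(seq, k):
    """Nullomers whose single-base substitutions are all nullomers too.

    Assumption: 'mutated' means exactly one substituted base.
    """
    absent = set(nullomers(seq, k))
    result = []
    for word in sorted(absent):
        neighbours = (
            word[:i] + base + word[i + 1:]
            for i in range(k)
            for base in ALPHABET
            if base != word[i]
        )
        if all(n in absent for n in neighbours):
            result.append(word)
    return result


if __name__ == "__main__":
    print(len(nullomers("ACGTACGT", 2)))        # absent dinucleotides
    print(high_order_nullomers("AAAA", 2))
```

    For the toy sequence ACGTACGT only four dinucleotides (AC, CG, GT, TA) occur, so the remaining twelve are nullomers at k = 2.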

  14. Improved efficiency in amplification of Escherichia coli o-antigen gene clusters using genome-wide sequence comparison

    USDA-ARS?s Scientific Manuscript database

    Background: In many bacteria including E. coli, genes encoding O-antigens are clustered in the chromosome, with a 39-bp JUMPstart sequence and gnd gene located upstream and downstream of the cluster, respectively. For determining the DNA sequence of the E. coli O-antigen gene cluster, one set of P...

  15. A technique for setting analytical thresholds in massively parallel sequencing-based forensic DNA analysis

    PubMed Central

    2017-01-01

    Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described. PMID:28542338

  16. A technique for setting analytical thresholds in massively parallel sequencing-based forensic DNA analysis.

    PubMed

    Young, Brian; King, Jonathan L; Budowle, Bruce; Armogida, Luigi

    2017-01-01

    Amplicon (targeted) sequencing by massively parallel sequencing (PCR-MPS) is a potential method for use in forensic DNA analyses. In this application, PCR-MPS may supplement or replace other instrumental analysis methods such as capillary electrophoresis and Sanger sequencing for STR and mitochondrial DNA typing, respectively. PCR-MPS also may enable the expansion of forensic DNA analysis methods to include new marker systems such as single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) that currently are assayable using various instrumental analysis methods including microarray and quantitative PCR. Acceptance of PCR-MPS as a forensic method will depend in part upon developing protocols and criteria that define the limitations of a method, including a defensible analytical threshold or method detection limit. This paper describes an approach to establish objective analytical thresholds suitable for multiplexed PCR-MPS methods. A definition is proposed for PCR-MPS method background noise, and an analytical threshold based on background noise is described.

  17. Spectroscopic limits to an extragalactic far-ultraviolet background.

    PubMed

    Martin, C; Hurwitz, M; Bowyer, S

    1991-10-01

    We use a spectrum of the lowest intensity diffuse far-ultraviolet background obtained from a series of observations in a number of celestial view directions to constrain the properties of the extragalactic FUV background. The mean continuum level, I_EG = 280 +/- 35 photons cm^-2 s^-1 Å^-1 sr^-1, was obtained in a direction with very low H I column density, and this represents a firm upper limit to any extragalactic background in the 1400-1900 Å band. Previous work has demonstrated that the far-ultraviolet background includes (depending on the view direction) contributions from dust-scattered Galactic light, high-ionization emission lines, two-photon emission from H II, H2 fluorescence, and the integrated light of spiral galaxies. We find no evidence in the spectrum of line or continuum features that would signify additional extragalactic components. Motivated by the observation of steep B_J and U number count distributions, we have made a detailed comparison of galaxy evolution models to optical and UV data. We find that the observations are difficult to reconcile with a dominant contribution from unclustered, starburst galaxies at low redshifts. Our measurement rules out large ionizing fluxes at z = 0, but cannot strongly constrain the QSO background light, which is expected to be 0.5%-4% of I_EG. We present improved limits on radiative lifetimes of massive neutrinos. We demonstrated with a simple model that IGM radiation is unlikely to make a significant contribution to I_EG. Since dust scattering could produce a significant part of the continuum in this lowest intensity spectrum, we carried out a series of tests to evaluate this possibility. We find that the spectrum of a nearby target with higher N_HI, when corrected for H2 fluorescence, is very similar to the spectrum obtained in the low H I view direction. This is evidence that the majority of the continuum observed at low N_HI is also dust reflection, indicating either the existence of a hitherto

  18. Foreshock and aftershocks in simple earthquake models.

    PubMed

    Kazemian, J; Tiampo, K F; Klein, W; Dominguez, R

    2015-02-27

    Many models of earthquake faults have been introduced that connect Gutenberg-Richter (GR) scaling to triggering processes. However, natural earthquake fault systems are composed of a variety of different geometries and materials and the associated heterogeneity in physical properties can cause a variety of spatial and temporal behaviors. This raises the question of how the triggering process and the structure interact to produce the observed phenomena. Here we present a simple earthquake fault model based on the Olami-Feder-Christensen and Rundle-Jackson-Brown cellular automata models with long-range interactions that incorporates a fixed percentage of stronger sites, or asperity cells, into the lattice. These asperity cells are significantly stronger than the surrounding lattice sites but eventually rupture when the applied stress reaches their higher threshold stress. The introduction of these spatial heterogeneities results in temporal clustering in the model that mimics that seen in natural fault systems along with GR scaling. In addition, we observe sequences of activity that start with a gradually accelerating number of larger events (foreshocks) prior to a main shock that is followed by a tail of decreasing activity (aftershocks). This work provides further evidence that the spatial and temporal patterns observed in natural seismicity are strongly influenced by the underlying physical properties and are not solely the result of a simple cascade mechanism.
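    To give a feel for the kind of cellular automaton described above, the following sketch implements a heavily simplified Olami-Feder-Christensen-style lattice in which a fixed fraction of sites (asperity cells) has a higher failure threshold. It is an assumption-laden toy, not the authors' model: it uses nearest-neighbour stress transfer instead of long-range interactions, resets failed sites to zero residual stress, and all parameter names and values (`asperity_fraction`, `alpha`, threshold 3.0) are hypothetical.

```python
import random


def run_ofc(size=16, steps=200, asperity_fraction=0.05, alpha=0.2, seed=1):
    """Return the number of site failures in each loading step (event sizes)."""
    rng = random.Random(seed)
    n = size * size
    # Ordinary sites fail at stress 1.0; asperity sites at a higher value (assumed 3.0).
    threshold = [3.0 if rng.random() < asperity_fraction else 1.0 for _ in range(n)]
    stress = [rng.random() for _ in range(n)]
    sizes = []
    for _ in range(steps):
        # Uniform loading: raise every site until the site closest to failure fails.
        first = min(range(n), key=lambda i: threshold[i] - stress[i])
        gap = threshold[first] - stress[first]
        stress = [s + gap for s in stress]
        stress[first] = threshold[first]  # guard against floating-point rounding
        stack = [first]
        failures = 0
        while stack:
            i = stack.pop()
            if stress[i] < threshold[i]:
                continue  # already relaxed by an earlier failure in this event
            released = stress[i]
            stress[i] = 0.0
            failures += 1
            r, c = divmod(i, size)
            # Pass a fraction alpha of the released stress to each in-lattice neighbour;
            # stress sent past the open boundary is lost.
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < size and 0 <= cc < size:
                    j = rr * size + cc
                    stress[j] += alpha * released
                    if stress[j] >= threshold[j]:
                        stack.append(j)
        sizes.append(failures)
    return sizes


if __name__ == "__main__":
    events = run_ofc()
    print(max(events), sum(events) / len(events))
```

    Because alpha is below 0.25, each failure dissipates stress and every event terminates; studying foreshock and aftershock statistics would require the long-range interactions and much larger lattices used in the paper.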

  19. CoSMoS: Conserved Sequence Motif Search in the proteome

    PubMed Central

    Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I

    2006-01-01

    Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915

  20. Electromagnetic signals are produced by aqueous nanostructures derived from bacterial DNA sequences.

    PubMed

    Montagnier, Luc; Aïssa, Jamal; Ferris, Stéphane; Montagnier, Jean-Luc; Lavallée, Claude

    2009-06-01

    A novel property of DNA is described: the capacity of some bacterial DNA sequences to induce electromagnetic waves at high aqueous dilutions. It appears to be a resonance phenomenon triggered by the ambient electromagnetic background of very low frequency waves. The genomic DNA of most pathogenic bacteria contains sequences which are able to generate such signals. This opens the way to the development of a highly sensitive detection system for chronic bacterial infections in human and animal diseases.

  1. Computational studies of sequence-specific driving forces in peptide self-assembly

    NASA Astrophysics Data System (ADS)

    Jeon, Joohyun

    Peptides are biopolymers made from various sequences of twenty different types of amino acids, connected by peptide bonds. There are practically an infinite number of possible sequences and tremendous possible combinations of peptide-peptide interactions. Recently, an increasing number of studies have shown a stark variety of peptide self-assembled nanomaterials whose detailed structures depend on their sequences and environmental factors; these have end uses in medical and bio-electronic applications, for example. To understand the underlying physics of complex peptide self-assembly processes and to delineate sequence specific effects, in this study, I use various simulation tools spanning all-atom molecular dynamics to simple lattice models and quantify the balance of interactions in the peptide self-assembly processes. In contrast to the existing view that peptides' aggregation propensities are proportional to the net sequence hydrophobicity and inversely proportional to the net charge, I show the more nuanced effects of electrostatic interactions, including the cooperative effects between hydrophobic and electrostatic interactions. Notably, I suggest rather unexpected, yet important roles of entropies in the small scale oligomerization processes. Overall, this study broadens our understanding of the role of thermodynamic driving forces in peptide self-assembly.

  2. Web Navigation Sequences Automation in Modern Websites

    NASA Astrophysics Data System (ADS)

    Montoto, Paula; Pan, Alberto; Raposo, Juan; Bellas, Fernando; López, Javier

    Most of today’s web sources are designed to be used by humans, but they do not provide suitable interfaces for software programs. That is why a growing interest has arisen in so-called web automation applications that are widely used for different purposes such as B2B integration, automated testing of web applications or technology and business watch. Previous proposals assume models for generating and reproducing navigation sequences that are not able to correctly deal with new websites using technologies such as AJAX: on one hand existing systems only allow recording simple navigation actions and, on the other hand, they are unable to detect the end of the effects caused by a user action. In this paper, we propose a set of new techniques to record and execute web navigation sequences able to deal with all the complexity existing in AJAX-based web sites. We also present an exhaustive evaluation of the proposed techniques that shows very promising results.

  3. A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation

    PubMed Central

    Eddy, Sean R.

    2008-01-01

    Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments. PMID:18516236
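
    The relationship between a bit score and its E-value described in this abstract can be illustrated with a minimal sketch. It assumes only what the abstract states (a Gumbel distribution with λ = log 2 for Viterbi bit scores, and an exponential high-scoring tail with the same λ for Forward scores). The location parameters `mu` and `tau` and the function names are hypothetical and not taken from the paper; in practice they would have to be fitted for each model.

```python
import math

LAMBDA = math.log(2)  # slope stated in the abstract for bit scores


def gumbel_pvalue(score_bits, mu):
    """P(S >= score) under a Gumbel with slope log 2 and location mu (assumed)."""
    # -expm1(-x) equals 1 - exp(-x) but keeps precision for very small x.
    return -math.expm1(-math.exp(-LAMBDA * (score_bits - mu)))


def exponential_tail_pvalue(score_bits, tau):
    """Exponential tail approximation with slope log 2 and offset tau (assumed)."""
    return min(1.0, math.exp(-LAMBDA * (score_bits - tau)))


def evalue(pvalue, n_targets):
    """Expected number of targets scoring at least this well by chance."""
    return pvalue * n_targets
```

    With λ = log 2, each additional bit of score halves the tail probability, so a score 10 bits above `tau` has a tail probability of about 1/1024.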

  4. CisSERS: Customizable in silico sequence evaluation for restriction sites

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus

    High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple FASTA-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.
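
    The kind of analysis described here (locating restriction sites in FASTA input and predicting the fragments a digest would produce) can be sketched as follows. This is not CisSERS code and does not use REBASE: the function names are hypothetical, the sequence is treated as linear, and the only enzyme fact assumed is the well-known EcoRI site GAATTC, cut after the first base.

```python
def parse_fasta(text):
    """Return {record name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def find_sites(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def fragment_lengths(seq_len, sites, cut_offset):
    """Fragment sizes of a linear molecule cut cut_offset bases into each site."""
    cuts = [0] + [site + cut_offset for site in sites] + [seq_len]
    return [end - begin for begin, end in zip(cuts, cuts[1:])]


# EcoRI recognises GAATTC and cuts between G and AATTC (offset 1).
# The site is palindromic, so scanning one strand finds every occurrence.
ECORI_SITE, ECORI_OFFSET = "GAATTC", 1
```

    The fragment sizes returned by `fragment_lengths` are the quantities a predicted gel view would display; sorting them by length gives the expected band order.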

  5. CisSERS: Customizable in silico sequence evaluation for restriction sites

    DOE PAGES

    Sharpe, Richard M.; Koepke, Tyson; Harper, Artemus; ...

    2016-04-12

    High-throughput sequencing continues to produce an immense volume of information that is processed and assembled into mature sequence data. Here, data analysis tools are urgently needed that leverage the embedded DNA sequence polymorphisms and consequent changes to restriction sites or sequence motifs in a high-throughput manner to enable biological experimentation. CisSERS was developed as a standalone open source tool to analyze sequence datasets and provide biologists with individual or comparative genome organization information in terms of presence and frequency of patterns or motifs such as restriction enzymes. Predicted agarose gel visualization of the custom analyses results was also integrated to enhance the usefulness of the software. CisSERS offers several novel functionalities, such as handling of large and multiple datasets in parallel, multiple restriction enzyme site detection and custom motif detection features, which are seamlessly integrated with real time agarose gel visualization. Using a simple FASTA-formatted file as input, CisSERS utilizes the REBASE enzyme database. Results from CisSERS enable the user to make decisions for designing genotyping by sequencing experiments, reduced representation sequencing, 3’UTR sequencing, and cleaved amplified polymorphic sequence (CAPS) molecular markers for large sample sets. CisSERS is a Java-based graphical user interface built around a Perl backbone. Several of the applications of CisSERS including CAPS molecular marker development were successfully validated using wet-lab experimentation. Here, we present the tool CisSERS and results from in-silico and corresponding wet-lab analyses demonstrating that CisSERS is a technology platform solution that facilitates efficient data utilization in genomics and genetics studies.

  6. Next generation DNA sequencing technology delivers valuable genetic markers for the genomic orphan legume species, Bituminaria bituminosa

    PubMed Central

    2011-01-01

    Background Bituminaria bituminosa is a perennial legume species from the Canary Islands and Mediterranean region that has potential as a drought-tolerant pasture species and as a source of pharmaceutical compounds. Three botanical varieties have previously been identified in this species: albomarginata, bituminosa and crassiuscula. B. bituminosa can be considered a genomic 'orphan' species with very few genomic resources available. New DNA sequencing technologies provide an opportunity to develop high quality molecular markers for such orphan species. Results 432,306 mRNA molecules were sampled from a leaf transcriptome of a single B. bituminosa plant using Roche 454 pyrosequencing, resulting in an average read length of 345 bp (149.1 Mbp in total). Sequences were assembled into 3,838 isotigs/contigs representing putatively unique gene transcripts. Gene ontology descriptors were identified for 3,419 sequences. Raw sequence reads containing simple sequence repeat (SSR) motifs were identified, and 240 primer pairs flanking these motifs were designed. Of 87 primer pairs developed this way, 75 (86.2%) successfully amplified primarily single fragments by PCR. Fragment analysis using 20 primer pairs in 79 accessions of B. bituminosa detected 130 alleles at 21 SSR loci. Genetic diversity analyses confirmed that variation at these SSR loci accurately reflected known taxonomic relationships in original collections of B. bituminosa and provided additional evidence that a division of the botanical variety bituminosa into two according to geographical origin (Mediterranean region and Canary Islands) may be appropriate. Evidence of cross-pollination was also found between botanical varieties within a B. bituminosa breeding programme. Conclusions B. bituminosa can no longer be considered a genomic orphan species, having now a large (albeit incomplete) repertoire of expressed gene sequences that can serve as a resource for future genetic studies. This experimental approach was

  7. Studies of the extreme ultraviolet/soft x-ray background

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stern, R.A.

    1978-01-01

    The results of an extensive sky survey of the extreme ultraviolet (EUV)/soft x-ray background are reported. The data were obtained with a focusing telescope designed and calibrated at U.C. Berkeley which observed EUV sources and the diffuse background as part of the Apollo-Soyuz mission in July, 1975. With a primary field-of-view of 2.3 ± 0.1° FWHM and four EUV bandpass filters (16 to 25, 20 to 73, 80 to 108, and 80 to 250 eV) the EUV telescope obtained background data included in the final observational sample for 21 discrete sky locations and 11 large angular scans, as well as for a number of shorter observations. Analysis of the data reveals an intense flux above 80 eV energy, with upper limits to the background intensity given for the lower energy filters (ca. 2 × 10⁴ and 6 × 10² ph cm⁻² sec⁻¹ ster⁻¹ eV⁻¹ at 21 and 45 eV respectively). The 80 to 108 eV flux agrees within statistical errors with the earlier results of Cash, Malina and Stern (1976): the Apollo-Soyuz average reported intensity is 4.0 ± 1.3 ph cm⁻² sec⁻¹ ster⁻¹ eV⁻¹ at ca. 100 eV, or roughly a factor of ten higher than the corresponding 250 eV intensity. The uniformity of the background flux is uncertain due to limitations in the statistical accuracy of the data; upper limits to the point-to-point standard deviation of the background intensity are ΔI/I ≲ 0.8 ± 0.4 (80 to 108 eV) and ≲ 0.4 ± 0.2 (80 to 250 eV). No evidence is found for a correlation between the telescope count rate and earth-based parameters (zenith angle, sun angle, etc.) for E ≳ 80 eV (the lower energy bandpasses are significantly affected by scattered solar radiation). Unlike some previous claims for the soft x-ray background, no simple dependence upon galactic latitude is seen.

  8. Simple Real-Time PCR and Amplicon Sequencing Method for Identification of Plasmodium Species in Human Whole Blood.

    PubMed

    Lefterova, Martina I; Budvytiene, Indre; Sandlund, Johanna; Färnert, Anna; Banaei, Niaz

    2015-07-01

    Malaria is the leading identifiable cause of fever in returning travelers. Accurate Plasmodium species identification has therapy implications for P. vivax and P. ovale, which have dormant liver stages requiring primaquine. Compared to microscopy, nucleic acid tests have improved specificity for species identification and higher sensitivity for mixed infections. Here, we describe a SYBR green-based real-time PCR assay for Plasmodium species identification from whole blood, which uses a panel of reactions to detect species-specific non-18S rRNA gene targets. A pan-Plasmodium 18S rRNA target is also amplified to allow species identification or confirmation by sequencing if necessary. An evaluation of assay accuracy, performed on 76 clinical samples (56 positives using thin smear microscopy as the reference method and 20 negatives), demonstrated clinical sensitivities of 95.2% for P. falciparum (20/21 positives detected) and 100% for the Plasmodium genus (52/52), P. vivax (20/20), P. ovale (9/9), and P. malariae (6/6). The sensitivity of the P. knowlesi-specific PCR was evaluated using spiked whole blood samples (100% [10/10 detected]). The specificities of the real-time PCR primers were 94.2% for P. vivax (49/52) and 100% for P. falciparum (51/51), P. ovale (62/62), P. malariae (69/69), and P. knowlesi (52/52). Thirty-three specimens were used to test species identification by sequencing the pan-Plasmodium 18S rRNA PCR product, with correct identification in all cases. The real-time PCR assay also identified two samples with mixed P. falciparum and P. ovale infection, which was confirmed by sequencing. The assay described here can be integrated into a malaria testing algorithm in low-prevalence areas, allowing definitive Plasmodium species identification shortly after malaria diagnosis by microscopy. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
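
    The percentages quoted in this abstract follow from the bracketed counts by the standard definitions of sensitivity and specificity. A small worked sketch, with hypothetical function names and using only counts given in the abstract:

```python
def sensitivity(true_pos, false_neg):
    """Percent of reference-positive samples the assay detected."""
    total = true_pos + false_neg
    if total <= 0:
        raise ValueError("need at least one reference-positive sample")
    return round(100.0 * true_pos / total, 1)


def specificity(true_neg, false_pos):
    """Percent of reference-negative samples the assay called negative."""
    total = true_neg + false_pos
    if total <= 0:
        raise ValueError("need at least one reference-negative sample")
    return round(100.0 * true_neg / total, 1)


# P. falciparum: 20 of 21 positives detected, so 1 false negative.
falciparum_sensitivity = sensitivity(20, 1)
# P. vivax primers: 49 of 52 negatives correct, so 3 false positives.
vivax_specificity = specificity(49, 3)
```

    These evaluate to 95.2 and 94.2 respectively, matching the figures reported above.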

  9. SeqDepot: streamlined database of biological sequences and precomputed features.

    PubMed

    Ulrich, Luke E; Zhulin, Igor B

    2014-01-15

    Assembling and/or producing integrated knowledge of sequence features continues to be an onerous and redundant task despite a large number of existing resources. We have developed SeqDepot, a novel database that focuses solely on two primary goals: (i) assimilating known primary sequences with predicted feature data and (ii) providing the most simple and straightforward means to procure and readily use this information. Access to >28.5 million sequences and 300 million features is provided through a well-documented and flexible RESTful interface that supports fetching specific data subsets, bulk queries, visualization and searching by MD5 digests or external database identifiers. We have also developed an HTML5/JavaScript web application exemplifying how to interact with SeqDepot and Perl/Python scripts for use with local processing pipelines. Freely available on the web at http://seqdepot.net/. REST access via http://seqdepot.net/api/v1. Database files and scripts may be downloaded from http://seqdepot.net/download.
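
    Searching by MD5 digest presupposes that a sequence can be reduced to a stable key. The sketch below shows one way such a key could be computed with the Python standard library. The normalisation (removing whitespace, upper-casing) and the hexadecimal output are assumptions made for illustration; the abstract does not specify how SeqDepot normalises sequences or encodes its identifiers, and no SeqDepot API call is shown.

```python
import hashlib


def sequence_digest(sequence):
    """Hex MD5 digest of a sequence after an assumed normalisation step."""
    # Assumption: whitespace is ignored and case does not matter.
    normalised = "".join(sequence.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()
```

    Because the digest depends only on the normalised residues, the same protein retrieved from two source databases maps to the same key, which is what makes digest-based lookup independent of external identifiers.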

  10. High throughput SNP discovery and genotyping in grapevine (Vitis vinifera L.) by combining a re-sequencing approach and SNPlex technology

    PubMed Central

    Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M

    2007-01-01

    Background Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the

  11. Statistical and linguistic features of DNA sequences

    NASA Technical Reports Server (NTRS)

    Havlin, S.; Buldyrev, S. V.; Goldberger, A. L.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We present evidence supporting the idea that the DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. We do not find such a long-range correlation in the coding regions of the gene. We resolve the problem of the "non-stationary" feature of the sequence of base pairs by applying a new algorithm called Detrended Fluctuation Analysis (DFA). We address the claim of Voss that there is no difference in the statistical properties of coding and noncoding regions of DNA by systematically applying the DFA algorithm, as well as standard FFT analysis, to all eukaryotic DNA sequences (33 301 coding and 29 453 noncoding) in the entire GenBank database. We describe a simple model to account for the presence of long-range power-law correlations which is based upon a generalization of the classic Levy walk. Finally, we describe briefly some recent work showing that the noncoding sequences have certain statistical features in common with natural languages. Specifically, we adapt to DNA the Zipf approach to analyzing linguistic texts, and the Shannon approach to quantifying the "redundancy" of a linguistic text in terms of a measurable entropy function. We suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information.
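
    Detrended Fluctuation Analysis, as named in this abstract, can be sketched in a few steps: turn the sequence into a walk, remove a local linear trend in windows of size n, and read the scaling exponent from the slope of log F(n) against log n. The version below is a minimal first-order illustration, not the authors' implementation. The purine/pyrimidine step rule is one common mapping and is an assumption here, as are all function names.

```python
import math


def dna_to_steps(sequence):
    """Map C/T to +1 and A/G to -1 (assumed rule); other symbols are skipped."""
    steps = []
    for base in sequence.upper():
        if base in "CT":
            steps.append(1.0)
        elif base in "AG":
            steps.append(-1.0)
    return steps


def dfa_fluctuation(steps, window):
    """RMS deviation F(n) of the walk from per-window least-squares lines."""
    n_windows = len(steps) // window
    if window < 2 or n_windows < 1:
        raise ValueError("window must be >= 2 and fit in the series")
    mean = sum(steps) / len(steps)
    profile, total = [], 0.0
    for step in steps:  # cumulative sum of mean-subtracted steps
        total += step - mean
        profile.append(total)
    t_mean = (window - 1) / 2.0
    t_var = sum((t - t_mean) ** 2 for t in range(window))
    squared = 0.0
    for w in range(n_windows):
        seg = profile[w * window:(w + 1) * window]
        y_mean = sum(seg) / window
        slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(seg)) / t_var
        for t, y in enumerate(seg):
            residual = y - (y_mean + slope * (t - t_mean))
            squared += residual * residual
    return math.sqrt(squared / (n_windows * window))


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return num / sum((x - x_mean) ** 2 for x in xs)


def dfa_exponent(steps, windows):
    """Scaling exponent: slope of log F(n) versus log n (requires F(n) > 0)."""
    log_n = [math.log(n) for n in windows]
    log_f = [math.log(dfa_fluctuation(steps, n)) for n in windows]
    return least_squares_slope(log_n, log_f)
```

    An exponent near 0.5 indicates an uncorrelated sequence, while values above 0.5 indicate the kind of long-range correlation the abstract reports for noncoding regions.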

  12. Genetic background of novel sequence types of CTX-M-8- and CTX-M-15-producing Escherichia coli and Klebsiella pneumoniae from public wastewater treatment plants in São Paulo, Brazil.

    PubMed

    Dropa, Milena; Lincopan, Nilton; Balsalobre, Livia C; Oliveira, Danielle E; Moura, Rodrigo A; Fernandes, Miriam Rodriguez; da Silva, Quézia Moura; Matté, Glavur R; Sato, Maria I Z; Matté, Maria H

    2016-03-01

    The release of extended-spectrum β-lactamase (ESBL)-producing Enterobacteriaceae to the environment is a public health issue worldwide. The aim of this study was to investigate the genetic background of genes encoding ESBLs in wastewater treatment plants (WWTPs) in São Paulo, southeastern Brazil. In 2009, during a local surveillance study, seven ESBL-producing Enterobacteriaceae strains were recovered from five WWTPs and screened for ESBL genes and mobile genetic elements. Multilocus sequence typing (MLST) was carried out, and wild plasmids were transformed into electrocompetent Escherichia coli. S1-PFGE technique was used to verify the presence of high molecular weight plasmids in wild-type strains and in bla ESBL-containing E. coli transformants. Strains harbored bla CTX-M-8, bla CTX-M-15, and/or bla SHV-28. Sequencing results showed that bla CTX-M-8 and bla CTX-M-15 genes were associated with IS26. MLST revealed new sequence types for E. coli (ST4401, ST4402, ST4403, and ST4445) and Klebsiella pneumoniae (ST1574), except for one K. pneumoniae from ST307 and Enterobacter cloacae from ST131. PCR and S1-PFGE results showed CTX-M-producing E. coli transformants carried heavy plasmids sizing 48.5-209 kb, which belonged to IncI1, IncF, and IncM1 incompatibility groups. This is the first report of CTX-M-8 and SHV-28 enzymes in environmental samples, and the present results demonstrate the plasmid-mediated spread of CTX-M-encoding genes through five WWTPs in São Paulo, Brazil, suggesting WWTPs are hotspots for the transfer of ESBL genes and confirming the urgent need to improve the management of sewage in order to minimize the dissemination of resistance genes to the environment.

  13. Sequence periodicity in nucleosomal DNA and intrinsic curvature

    PubMed Central

    2010-01-01

    Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understand their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA. PMID:20487515

  14. Analysis of Litopenaeus vannamei Transcriptome Using the Next-Generation DNA Sequencing Technique

    PubMed Central

    Li, Chaozheng; Weng, Shaoping; Chen, Yonggui; Yu, Xiaoqiang; Lü, Ling; Zhang, Haiqing; He, Jianguo; Xu, Xiaopeng

    2012-01-01

    Background Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The now available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. Methodology/Principal Findings This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using BLAST and BLAST2Go software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To primarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. Conclusions/Significance The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei. PMID:23071809

  15. Background characterization of an ultra-low background liquid scintillation counter

    DOE PAGES

    Erchinger, J. L.; Orrell, John L.; Aalseth, C. E.; ...

    2017-01-26

    The Ultra-Low Background Liquid Scintillation Counter developed by Pacific Northwest National Laboratory will expand the application of liquid scintillation counting by enabling lower detection limits and smaller sample volumes. By reducing the overall count rate of the background environment approximately 2 orders of magnitude below that of commercially available systems, backgrounds on the order of tens of counts per day over an energy range of ~3–3600 keV can be realized. Finally, initial test results of the ULB LSC show promising results for ultra-low background detection with liquid scintillation counting.

  16. Increasing Classroom Compliance: Using a High-Probability Command Sequence with Noncompliant Students

    ERIC Educational Resources Information Center

    Axelrod, Michael I.; Zank, Amber J.

    2012-01-01

    Noncompliance is one of the most problematic behaviors within the school setting. One strategy to increase compliance of noncompliant students is a high-probability command sequence (HPCS; i.e., a set of simple commands in which an individual is likely to comply immediately prior to the delivery of a command that has a lower probability of…

  17. AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

    PubMed

    Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

    2016-03-01

    Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies. © 2015 John Wiley & Sons Ltd.
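
    The three-step structure named in this abstract (demultiplexing, unique sequence clustering, erroneous sequence filtering) can be illustrated with a toy sketch. This is not AMPLISAS code: the barcode-prefix matching, the exact-match clustering and the `min_fraction` threshold are simplifying assumptions, and all names are hypothetical. The real tool's clustering and filtering rules are more elaborate than what is shown.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Step (i): assign reads to samples by barcode prefix and trim the barcode.

    barcodes maps barcode string -> sample name; unmatched reads are dropped.
    """
    by_sample = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return dict(by_sample)


def cluster_unique(sequences):
    """Step (ii): collapse identical sequences into counts (exact match only)."""
    return Counter(sequences)


def filter_low_frequency(counts, min_fraction):
    """Step (iii): keep variants at or above min_fraction of the sample's reads."""
    total = sum(counts.values())
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_fraction
    }


def genotype(reads, barcodes, min_fraction):
    """Run the three steps and return {sample: {sequence: frequency}}."""
    return {
        sample: filter_low_frequency(cluster_unique(seqs), min_fraction)
        for sample, seqs in demultiplex(reads, barcodes).items()
    }
```

    The frequency threshold stands in for error filtering on the assumption that sequencing and amplification artefacts are rarer within a sample than true alleles.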

  18. RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

    PubMed Central

    Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

    2000-01-01

    The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can

  19. Brain Activation During Anticipation of Sound Sequences

    PubMed Central

    Leaver, Amber M.; Van Lare, Jennifer; Zielinski, Brandon; Halpern, Andrea R.; Rauschecker, Josef P.

    2010-01-01

    Music consists of sound sequences that require integration over time. As we become familiar with music, associations between notes, melodies, and entire symphonic movements become stronger and more complex. These associations can become so tight that, for example, hearing the end of one album track can elicit a robust image of the upcoming track while anticipating it in total silence. Here we study this predictive “anticipatory imagery” at various stages throughout learning and investigate activity changes in corresponding neural structures using functional magnetic resonance imaging (fMRI). Anticipatory imagery (in silence) for highly familiar naturalistic music was accompanied by pronounced activity in rostral prefrontal cortex (PFC) and premotor areas. Examining changes in the neural bases of anticipatory imagery during two stages of learning conditional associations between simple melodies, however, demonstrates the importance of fronto-striatal connections, consistent with a role of the basal ganglia in “training” frontal cortex (Pasupathy and Miller, 2005). Another striking change in neural resources during learning was a shift between caudal PFC earlier to rostral PFC later in learning. Our findings regarding musical anticipation and sound sequence learning are highly compatible with studies of motor sequence learning, suggesting common predictive mechanisms in both domains. PMID:19244522

  20. Statistical mechanics of simple models of protein folding and design.

    PubMed Central

    Pande, V S; Grosberg, A Y; Tanaka, T

    1997-01-01

    It is now believed that the primary equilibrium aspects of simple models of protein folding are understood theoretically. However, current theories often resort to rather heavy mathematics to overcome some technical difficulties inherent in the problem or start from a phenomenological model. To this end, we take a new approach in this pedagogical review of the statistical mechanics of protein folding. The benefit of our approach is a drastic mathematical simplification of the theory, without resort to any new approximations or phenomenological prescriptions. Indeed, the results we obtain agree precisely with previous calculations. Because of this simplification, we are able to present here a thorough and self contained treatment of the problem. Topics discussed include the statistical mechanics of the random energy model (REM), tests of the validity of REM as a model for heteropolymer freezing, freezing transition of random sequences, phase diagram of designed ("minimally frustrated") sequences, and the degree to which errors in the interactions employed in simulations of either folding and design can still lead to correct folding behavior. PMID:9414231