Sample records for unique sequence microclones

  1. High-resolution mapping and sequence analysis of 597 cDNA clones transcribed from the 1 Mb region in human chromosome 4p16.3 containing Huntington disease gene

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hadano, S.; Ishida, Y.; Tomiyasu, H.

    1994-09-01

    To complete a transcription map of the 1 Mb region of human chromosome 4p16.3 containing the Huntington disease (HD) gene, cDNA clones are being isolated from throughout the region. Our method relies on direct screening of cDNA libraries with single-copy microclones derived from 3 YAC clones spanning 1 Mbp of the HD gene region. YAC DNAs were isolated by preparative pulsed-field gel electrophoresis and amplified by both single unique primer (SUP)-PCR and linker-ligation PCR, and 6 microclone DNA libraries were generated. Then, 8,640 microclones from these libraries were independently amplified by PCR and arrayed onto membranes. The 800-900 microclones that did not cross-hybridize with total human and yeast genomic DNA, YAC vector DNA, or ribosomal cDNA in dot hybridization (putatively carrying single-copy sequences) were pooled to make 9 probe pools. A total of approximately 1.8×10^7 plaques from human brain cDNA libraries were screened with the 9 probe pools, and 672 positive cDNA clones were obtained. So far, 597 cDNA clones have been defined and placed on a map of the 1 Mbp HD gene region by hybridization with HD region-specific cosmid contigs and YAC clones. Further characterization, including DNA sequencing and Northern blot analysis, is currently underway.

  2. The Development of Chromosome Microdissection and Microcloning Technique and its Applications in Genomic Research

    PubMed Central

    Zhou, Ruo-Nan; Hu, Zan-Min

    2007-01-01

    Chromosome microdissection and microcloning has been under development for more than 20 years. As a bridge between cytogenetics and molecular genetics, it has led to a number of applications: isolation of chromosome painting probes, construction of genetic linkage maps and physical maps, and generation of expressed sequence tags (ESTs). During those 20 years, the technique has not only benefited from other technological advances but has also cross-fertilized with other techniques. Today, it is a practical technique with extensive uses. The purpose of this article is to review the development of this technique and its application in genomic research. In addition, a new method developed by our lab for generating ESTs from specific chromosomes is introduced. With this method, chromosome microdissection and microcloning should become more valuable for advancing genomic research. PMID:18645627

  3. Clonal and microclonal mutational heterogeneity in high hyperdiploid acute lymphoblastic leukemia

    PubMed Central

    de Smith, Adam J.; Ojha, Juhi; Francis, Stephen S.; Sanders, Erica; Endicott, Alyson A.; Hansen, Helen M.; Smirnov, Ivan; Termuhlen, Amanda M.; Walsh, Kyle M.; Metayer, Catherine; Wiemels, Joseph L.

    2016-01-01

    High hyperdiploidy (HD), the most common cytogenetic subtype of B-cell acute lymphoblastic leukemia (B-ALL), is largely curable but significant treatment-related morbidity warrants investigating the biology and identifying novel drug targets. Targeted deep-sequencing of 538 cancer-relevant genes was performed in 57 HD-ALL patients lacking overt KRAS and NRAS hotspot mutations and lacking common B-ALL deletions to enrich for discovery of novel driver genes. One-third of patients harbored damaging mutations in epigenetic regulatory genes, including the putative novel driver DOT1L (n=4). Receptor tyrosine kinase (RTK)/Ras/MAPK signaling pathway mutations were found in two-thirds of patients, including novel mutations in ROS1, which mediates phosphorylation of the PTPN11-encoded protein SHP2. Mutations in FLT3 significantly co-occurred with DOT1L (p=0.04), suggesting functional cooperation in leukemogenesis. We detected an extraordinary level of tumor heterogeneity, with microclonal (mutant allele fraction <0.10) KRAS, NRAS, FLT3, and/or PTPN11 hotspot mutations evident in 31/57 (54.4%) patients. Multiple KRAS and NRAS codon 12 and 13 microclonal mutations significantly co-occurred within tumor samples (p=4.8×10⁻⁴), suggesting ongoing formation of and selection for Ras-activating mutations. Future work is required to investigate whether tumor microheterogeneity impacts clinical outcome and to elucidate the functional consequences of epigenetic dysregulation in HD-ALL, potentially leading to novel therapeutic approaches. PMID:27683039
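
The microclonal threshold quoted in this abstract (mutant allele fraction below 0.10) can be illustrated with a minimal sketch. The function names, the "clonal" label for everything at or above the threshold, and the read counts used in the test are assumptions made for illustration; they are not taken from the study.

```python
MICROCLONAL_THRESHOLD = 0.10  # threshold quoted in the abstract


def mutant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the mutant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def classify_mutation(alt_reads: int, total_reads: int,
                      threshold: float = MICROCLONAL_THRESHOLD) -> str:
    """Label a mutation 'microclonal' when its allele fraction is below the threshold.

    The 'clonal' label for the remaining cases is a simplification (assumption).
    """
    maf = mutant_allele_fraction(alt_reads, total_reads)
    return "microclonal" if maf < threshold else "clonal"


# The patient proportion reported in the abstract: 31 of 57.
patients_with_microclonal = 31 / 57
```

A real variant-calling pipeline would also need to separate low-fraction mutations from sequencing error, which this sketch does not attempt.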

  4. The Chromosome Microdissection and Microcloning Technique.

    PubMed

    Zhang, Ying-Xin; Deng, Chuan-Liang; Hu, Zan-Min

    2016-01-01

    Chromosome microdissection followed by microcloning is an efficient tool combining cytogenetics and molecular genetics that can be used for the construction of the high density molecular marker linkage map and fine physical map, the generation of probes for chromosome painting, and the localization and cloning of important genes. Here, we describe a modified technique to microdissect a single chromosome, paint individual chromosomes, and construct single-chromosome DNA libraries.

  5. Isolation and characterization of 21 novel expressed DNA sequences from the distal region of human chromosome 4p

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ishida, Yoshikazu; Hadano, Shinji; Nagayama, Tomiko

    1994-07-15

    The authors have established an approach to the isolation of expressed DNA sequences from a defined region of the human chromosome. The method relies on the direct screening of cDNA libraries using pooled single-copy microclones generated by a laser chromosome microdissection in conjunction with a single unique primer polymerase chain reaction (SUP-PCR) procedure. They applied this method to the distal region of human chromosome 4p (4p15-4pter), which contains the Huntington disease (HD) and the Wolf-Hirschhorn syndrome (WHS) loci. Twenty-one nonoverlapping and region-specific cDNA clones encoding novel genes were isolated in this manner. Ten of 21 clones were subregionally assigned to 4p16.1-4pter, and the remainder mapped to the region proximal to 4p16.1. Northern blot and reverse transcription followed by the PCR (RT-PCR) analysis revealed that 16 of these 21 clones detected transcripts in total RNA from human tissues. The method is applicable to other chromosomal regions and is a powerful approach to the isolation of region-specific cDNA clones. 44 refs., 3 figs., 3 tabs.

  6. Microdissection and molecular manipulation of single chromosomes in woody fruit trees with small chromosomes using pomelo (Citrus grandis) as a model. I. Construction of single chromosomal DNA libraries.

    PubMed

    Huang, D; Wu, W; Zhou, Y; Hu, Z; Lu, L

    2004-05-01

    Construction of single chromosomal DNA libraries by means of chromosome microdissection and microcloning will be useful for genomic research, especially for those species that have not been extensively studied genetically. Application of the technology of microdissection and microcloning to woody fruit plants has not been reported hitherto, largely due to the generally small sizes of metaphase chromosomes and the difficulty of chromosome preparation. The present study was performed to establish a method for single chromosome microdissection and microcloning in woody fruit species using pomelo as a model. The standard karyotype of a pomelo cultivar (Citrus grandis cv. Guanxi) was established based on 20 prometaphase photomicrographs. According to the standard karyotype, chromosome 1 was identified and isolated with fine glass microneedles controlled by a micromanipulator. DNA fragments ranging from 0.3 kb to 2 kb were acquired from the isolated single chromosome 1 via two rounds of PCR mediated by Sau3A linker adaptors and then cloned into T-easy vectors to generate a DNA library of chromosome 1. Approximately 30,000 recombinant clones were obtained. Evaluation based on 108 randomly selected clones showed that the sizes of the cloned inserts varied from 0.5 kb to 1.5 kb with an average of 860 bp. Our research suggests that microdissection and microcloning of single small chromosomes in woody plants is feasible.

  7. Microclonal Multiplication of Wild Cherry (Prunus Avium L.) from Shoot Tips and Root Sucker Buds

    Treesearch

    Branka Pevalek-Kozlina; Charles H. Michler; Sibila Jelaska

    1994-01-01

    The effects of different combinations and concentrations of the growth regulators 6-benzylaminopurine (BA), 6-furfurylaminopurine (KIN), N6-(2-isopentenyl) adenine (2iP), indole-3-butyric acid (IBA), indole-3-acetic acid (IAA) and α-naphthaleneacetic acid (NAA) on axillary shoot multiplication rates for wild cherry (Prunus avium...

  8. Molecular cytogenetic characterization and origin of two de novo duplication 9p cases.

    PubMed

    Tsezou, A; Kitsiou, S; Galla, A; Petersen, M B; Karadima, G; Syrrou, M; Sahlèn, S; Blennow, E

    2000-03-13

    We report on two additional cases with duplication of 9p, minor facial anomalies, and developmental delay. Using fluorescence in situ hybridization and single-copy probes, we showed that the first case was a direct duplication, whereas the second case was inverted. The extent of the direct duplication was defined as 9p12 → p24 by microdissection and microcloning of the aberrant chromosome and subsequent chromosome-specific comparative genomic hybridization. DNA polymorphism analysis with eight microsatellite markers revealed that the origin of the dup(9p) was maternal in the first case, whereas it was paternal in the second. Copyright 2000 Wiley-Liss, Inc.

  9. Putative and unique gene sequence utilization for the design of species specific probes as modeled by Lactobacillus plantarum

    USDA-ARS's Scientific Manuscript database

    The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...

  10. RECOVIR Software for Identifying Viruses

    NASA Technical Reports Server (NTRS)

    Chakravarty, Sugoto; Fox, George E.; Zhu, Dianhui

    2013-01-01

    Most single-stranded RNA (ssRNA) viruses mutate rapidly to generate a large number of strains with highly divergent capsid sequences. Determining the capsid residues or nucleotides that uniquely characterize these strains is critical in understanding the strain diversity of these viruses. RECOVIR (an acronym for "recognize viruses") software predicts the strains of some ssRNA viruses from their limited sequence data. Novel phylogenetic-tree-based databases of protein or nucleic acid residues that uniquely characterize these virus strains are created. Strains of input virus sequences (partial or complete) are predicted through residue-wise comparisons with the databases. RECOVIR uses unique characterizing residues to identify automatically strains of partial or complete capsid sequences of picorna and caliciviruses, two of the most highly diverse ssRNA virus families. Partition-wise comparisons of the database residues with the corresponding residues of more than 300 complete and partial sequences of these viruses resulted in correct strain identification for all of these sequences. This study shows the feasibility of creating databases of hitherto unknown residues uniquely characterizing the capsid sequences of two of the most highly divergent ssRNA virus families. These databases enable automated strain identification from partial or complete capsid sequences of these human and animal pathogens.

  11. Identification of a Unique Amyloid Sequence in AA Amyloidosis of a Pig Associated With Streptococcus Suis Infection.

    PubMed

    Kamiie, J; Sugahara, G; Yoshimoto, S; Aihara, N; Mineshige, T; Uetsuka, K; Shirota, K

    2017-01-01

    Here we report a pig with amyloid A (AA) amyloidosis associated with Streptococcus suis infection and identification of a unique amyloid sequence in the amyloid deposits in the tissue. Tissues from the 180-day-old underdeveloped pig contained foci of necrosis and suppurative inflammation associated with S. suis infection. Congo red stain, immunohistochemistry, and electron microscopy revealed intense AA deposition in the spleen and renal glomeruli. Mass spectrometric analysis of amyloid material extracted from the spleen showed serum AA 2 (SAA2) peptide as well as a unique peptide sequence previously reported in a pig with AA amyloidosis. The common detection of the unique amyloid sequence in the current and past cases of AA amyloidosis in pigs suggests that this amyloid sequence might play a key role in the development of porcine AA amyloidosis. An in vitro fibrillation assay demonstrated that the unique AA peptide formed typically rigid, long amyloid fibrils (10 nm wide) and the N-terminus peptide of SAA2 formed zigzagged, short fibers (7 nm wide). Moreover, the SAA2 peptide formed long, rigid amyloid fibrils in the presence of sonicated amyloid fibrils formed by the unique AA peptide. These findings indicate that the N-terminus of SAA2 as well as the AA peptide mediate the development of AA amyloidosis in pigs via cross-seeding polymerization.

  12. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    PubMed

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
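
The distinction drawn in this abstract, between collapsing reads by genomic coordinates alone and collapsing identically mapped reads only when they also share a UMI, can be sketched as follows. The read tuple layout, function names, and example reads are hypothetical; this is not the authors' pipeline, and it ignores sequencing errors within the UMI.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct molecules per mapping position using UMIs.

    `reads` is an iterable of (chrom, pos, strand, umi) tuples (assumed layout).
    Reads at the same position with the same UMI are treated as PCR duplicates.
    """
    umis_by_position = defaultdict(set)
    for chrom, pos, strand, umi in reads:
        umis_by_position[(chrom, pos, strand)].add(umi)
    return {position: len(umis) for position, umis in umis_by_position.items()}


def count_by_coordinates_only(reads):
    """Coordinate-only deduplication: every position collapses to one molecule."""
    positions = {(chrom, pos, strand) for chrom, pos, strand, _ in reads}
    return {position: 1 for position in positions}


# Illustrative reads: two PCR copies of one molecule plus an independent
# molecule that happens to map to the same coordinates.
example_reads = [
    ("chr1", 1000, "+", "ACGTAC"),
    ("chr1", 1000, "+", "ACGTAC"),
    ("chr1", 1000, "+", "TTGACA"),
    ("chr2", 500, "-", "GGCATT"),
]
```

With these example reads, coordinate-only deduplication keeps one molecule at chr1:1000 while the UMI-aware count keeps two, which is the bias the abstract describes.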

  13. Partial bisulfite conversion for unique template sequencing

    PubMed Central

    Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael

    2018-01-01

    We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. PMID:29161423
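
The idea of grouping reads by their conversion pattern can be sketched under strong simplifying assumptions: every read covers the same reference segment at full length, only C-to-T conversions on one strand are considered, and sequencing errors are ignored. The function names and the toy sequences are hypothetical and are not part of the muSeq software.

```python
from collections import defaultdict


def conversion_pattern(reference: str, read: str) -> tuple:
    """Return the positions where a reference C is read as T."""
    if len(reference) != len(read):
        raise ValueError("this sketch assumes full-length, gap-free reads")
    return tuple(
        i for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "T"
    )


def cluster_by_pattern(reference: str, reads):
    """Group reads that share an identical C-to-T conversion pattern.

    Each group is taken to descend from one initial template molecule.
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[conversion_pattern(reference, read)].append(read)
    return dict(clusters)


# Toy data: two copies of one template and one copy of another.
reference = "ACGCTCAC"
reads = ["ATGCTTAC", "ATGCTTAC", "ACGTTCAT"]
clusters = cluster_by_pattern(reference, reads)
template_count = len(clusters)
```

Handling G-to-A conversions from the opposite strand, partially overlapping fragments, and tolerance to sequencer error would all need more than exact pattern matching.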

  14. A novel, privacy-preserving cryptographic approach for sharing sequencing data

    PubMed Central

    Cassa, Christopher A; Miller, Rachel A; Mandl, Kenneth D

    2013-01-01

    Objective DNA samples are often processed and sequenced in facilities external to the point of collection. These samples are routinely labeled with patient identifiers or pseudonyms, allowing for potential linkage to identity and private clinical information if intercepted during transmission. We present a cryptographic scheme to securely transmit externally generated sequence data which does not require any patient identifiers, public key infrastructure, or the transmission of passwords. Materials and methods This novel encryption scheme cryptographically protects participant sequence data using a shared secret key that is derived from a unique subset of an individual’s genetic sequence. This scheme requires access to a subset of an individual’s genetic sequence to acquire full access to the transmitted sequence data, which helps to prevent sample mismatch. Results We validate that the proposed encryption scheme is robust to sequencing errors, population uniqueness, and sibling disambiguation, and provides sufficient cryptographic key space. Discussion Access to a set of an individual’s genotypes and a mutually agreed cryptographic seed is needed to unlock the full sequence, which provides additional sample authentication and authorization security. We present modest fixed and marginal costs to implement this transmission architecture. Conclusions It is possible for genomics researchers who sequence participant samples externally to protect the transmission of sequence data using unique features of an individual’s genetic sequence. PMID:23125421
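
As a rough illustration of deriving a shared secret from a subset of genotypes plus a mutually agreed seed, the sketch below feeds a canonical genotype string into PBKDF2 from Python's standard library. This is an assumption-laden stand-in and not the published scheme: the locus names, the string encoding, the use of the seed as the PBKDF2 salt, and the iteration count are all hypothetical, and the tolerance to sequencing errors that the abstract reports is not modeled here.

```python
import hashlib


def derive_key(genotypes: dict, loci: list, seed: bytes,
               iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key from genotypes at agreed loci and a shared seed.

    `loci` fixes the order so both parties build the same key material.
    """
    material = "|".join(f"{locus}:{genotypes[locus]}" for locus in loci)
    return hashlib.pbkdf2_hmac(
        "sha256", material.encode("utf-8"), seed, iterations, dklen=32
    )


# Hypothetical loci and genotype calls, for illustration only.
agreed_loci = ["locus_1", "locus_2", "locus_3"]
participant = {"locus_1": "AG", "locus_2": "CC", "locus_3": "TT"}
other_person = {"locus_1": "AA", "locus_2": "CC", "locus_3": "TT"}
shared_seed = b"agreed-seed"

participant_key = derive_key(participant, agreed_loci, shared_seed)
```

Because any single genotype difference changes the derived key, an exact-match design like this one would fail on a miscalled genotype; the scheme in the abstract is described as robust to such errors, so it must differ in that respect.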

  15. Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

    PubMed Central

    Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

    2007-01-01

    Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
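
The counts reported in this abstract are internally consistent, which a short arithmetic check makes explicit. All numbers are taken from the abstract; only the variable names are introduced here.

```python
# Clustering result: contigs plus singletons give the unique sequences.
contigs = 448
singletons = 1783
ests_in_contigs = 1361
unique_sequences = contigs + singletons

# ESTs assembled into contigs plus singleton ESTs give the good-quality ESTs.
good_quality_ests = ests_in_contigs + singletons

# The two GenBank accession ranges (EH669363-EH672369 and EL515444-EL515580)
# cover the same number of records, counting both endpoints.
accessions = (672369 - 669363 + 1) + (515580 - 515444 + 1)

# Share of unique sequences with high similarity to known genes.
similar_percent = round(743 / unique_sequences * 100)
```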

  16. Genome Sequence of a Canadian Vibrio parahaemolyticus Isolate with Unique Mobilizing Capacity.

    PubMed

    Bioteau, Audrey; Huguet, Kévin; Burrus, Vincent; Banerjee, Swapan

    2018-06-14

    Vibrio parahaemolyticus is a clinically significant marine bacterium implicated in gastroenteritis among consumers of raw or undercooked seafood. This report presents the whole-genome sequence of a unique strain of V. parahaemolyticus isolated from oysters harvested in Canada. © Crown copyright 2018.

  17. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    USDA-ARS's Scientific Manuscript database

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  18. Defining the healthy "core microbiome" of oral microbial communities

    PubMed Central

    2009-01-01

    Background Most studies examining the commensal human oral microbiome are focused on disease or are limited in methodology. In order to diagnose and treat diseases at an early and reversible stage, an in-depth definition of health is indispensable. The aim of this study therefore was to define the healthy oral microbiome using recent advances in sequencing technology (454 pyrosequencing). Results We sampled and sequenced microbiomes from several intraoral niches (dental surfaces, cheek, hard palate, tongue and saliva) in three healthy individuals. Within an individual oral cavity, we found over 3600 unique sequences, over 500 different OTUs or "species-level" phylotypes (sequences that clustered at 3% genetic difference) and 88 - 104 higher taxa (genus or more inclusive taxon). The predominant taxa belonged to Firmicutes (genus Streptococcus, family Veillonellaceae, genus Granulicatella), Proteobacteria (genus Neisseria, Haemophilus), Actinobacteria (genus Corynebacterium, Rothia, Actinomyces), Bacteroidetes (genus Prevotella, Capnocytophaga, Porphyromonas) and Fusobacteria (genus Fusobacterium). Each individual sample harboured on average 266 "species-level" phylotypes (SD 67; range 123 - 326) with cheek samples being the least diverse and the dental samples from approximal surfaces showing the highest diversity. Principal component analysis discriminated the profiles of the samples originating from shedding surfaces (mucosa of tongue, cheek and palate) from the samples that were obtained from solid surfaces (teeth). There was a large overlap in the higher taxa, "species-level" phylotypes and unique sequences among the three microbiomes: 84% of the higher taxa, 75% of the OTUs and 65% of the unique sequences were present in at least two of the three microbiomes. The three individuals shared 1660 of 6315 unique sequences. These 1660 sequences (the "core microbiome") contributed 66% of the reads. The overlapping OTUs contributed to 94% of the reads, while nearly all reads (99.8%) belonged to the shared higher taxa. Conclusions We obtained the first insight into the diversity and uniqueness of individual oral microbiomes at a resolution of next-generation sequencing. We showed that a major proportion of bacterial sequences of unrelated healthy individuals is identical, supporting the concept of a core microbiome at health. PMID:20003481
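
The overlap statistics in this abstract (sequences shared by all three individuals, sequences present in at least two, and the share of reads they account for) reduce to set operations. The sketch below uses toy sequence identifiers and read counts that are purely illustrative, and the function names are hypothetical.

```python
from collections import Counter


def shared_by_all(samples):
    """Sequences present in every sample (the 'core' in this sketch)."""
    return set.intersection(*samples)


def present_in_at_least(samples, minimum):
    """Sequences present in at least `minimum` of the samples."""
    occurrences = Counter(seq for sample in samples for seq in sample)
    return {seq for seq, n in occurrences.items() if n >= minimum}


def read_share(sequences, read_counts):
    """Fraction of all reads that belong to the given sequences."""
    total = sum(read_counts.values())
    return sum(n for seq, n in read_counts.items() if seq in sequences) / total


# Toy data for three individuals (not taken from the study).
individual_a = {"s1", "s2", "s3", "s4"}
individual_b = {"s1", "s2", "s5"}
individual_c = {"s1", "s3", "s6"}
samples = [individual_a, individual_b, individual_c]
toy_read_counts = {"s1": 60, "s2": 15, "s3": 10, "s4": 5, "s5": 5, "s6": 5}

core = shared_by_all(samples)
core_read_share = read_share(core, toy_read_counts)
```

In the study itself, 1660 of 6315 unique sequences (about 26%) were shared by all three individuals yet contributed 66% of the reads, the same pattern this toy example shows: a small shared set can dominate read abundance.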

  19. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique "electronic fingerprints" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic states of the nucleobases shift depending on the pH, with the most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of the beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  20. Partial bisulfite conversion for unique template sequencing.

    PubMed

    Kumar, Vijay; Rosenbaum, Julie; Wang, Zihua; Forcier, Talitha; Ronemus, Michael; Wigler, Michael; Levy, Dan

    2018-01-25

    We introduce a new protocol, mutational sequencing or muSeq, which uses sodium bisulfite to randomly deaminate unmethylated cytosines at a fixed and tunable rate. The muSeq protocol marks each initial template molecule with a unique mutation signature that is present in every copy of the template, and in every fragmented copy of a copy. In the sequenced read data, this signature is observed as a unique pattern of C-to-T or G-to-A nucleotide conversions. Clustering reads with the same conversion pattern enables accurate count and long-range assembly of initial template molecules from short-read sequence data. We explore count and low-error sequencing by profiling 135 000 restriction fragments in a PstI representation, demonstrating that muSeq improves copy number inference and significantly reduces sporadic sequencer error. We explore long-range assembly in the context of cDNA, generating contiguous transcript clusters greater than 3,000 bp in length. The muSeq assemblies reveal transcriptional diversity not observable from short-read data alone. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Unique Trichomonas vaginalis gene sequences identified in multinational regions of Northwest China.

    PubMed

    Liu, Jun; Feng, Meng; Wang, Xiaolan; Fu, Yongfeng; Ma, Cailing; Cheng, Xunjia

    2017-07-24

    Trichomonas vaginalis (T. vaginalis) is a flagellated protozoan parasite that infects humans worldwide. This study determined the sequence of the 18S ribosomal RNA gene of T. vaginalis infecting both females and males in Xinjiang, China. Samples from 73 females and 28 males were collected and confirmed for infection with T. vaginalis. A total of 110 sequences were identified when the T. vaginalis 18S ribosomal RNA gene was sequenced. These sequences were used to prepare a phylogenetic network. The rooted network comprised three large clades and several independent branches. Most of the Xinjiang sequences were in one group. Preliminary results suggest that Xinjiang T. vaginalis isolates might be genetically unique, as indicated by the sequence of their 18S ribosomal RNA gene. Low migration rate of local people in this province may contribute to a genetic conservativeness of T. vaginalis. The unique genetic feature of our isolates may suggest a different clinical presentation of trichomoniasis, including metronidazole susceptibility, T. vaginalis virus or Mycoplasma co-infection characteristics. The transmission and evolution of Xinjiang T. vaginalis is of interest and should be studied further. More attention should be given to T. vaginalis infection in both females and males in Xinjiang.

  2. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry

    USDA-ARS's Scientific Manuscript database

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties and proper sequence identification is essential for functional evaluation and performance of recombinant antibo...

  3. Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

    PubMed Central

    Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

    2015-01-01

    Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410
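
The difference between total and unique sequences in a repertoire, and the CDR3 length distribution mentioned in this abstract, can be illustrated with a small counting sketch. The function name and the example amino acid strings are hypothetical and are not data from the study.

```python
from collections import Counter


def repertoire_summary(cdr3_sequences):
    """Summarize a list of CDR3 sequences, one entry per sequenced read.

    Returns the total read count, the number of unique sequences, and the
    length distribution computed over the unique sequences.
    """
    clonotypes = Counter(cdr3_sequences)
    lengths = Counter(len(sequence) for sequence in clonotypes)
    return {
        "total": sum(clonotypes.values()),
        "unique": len(clonotypes),
        "length_distribution": dict(lengths),
    }


# Illustrative reads: one sequence observed twice, another observed once.
toy_reads = ["CASSLGQGYEQYF", "CASSLGQGYEQYF", "CASSPGTEAFF"]
summary = repertoire_summary(toy_reads)
```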

  4. Swallow Event Sequencing: Comparing Healthy Older and Younger Adults.

    PubMed

    Herzberg, Erica G; Lazarus, Cathy L; Steele, Catriona M; Molfenter, Sonja M

    2018-04-23

    Previous research has established that a great deal of variation exists in the temporal sequence of swallowing events for healthy adults. Yet, the impact of aging on swallow event sequence is not well understood. Kendall et al. (Dysphagia 18(2):85-91, 2003) suggested there are 4 obligatory paired-event sequences in swallowing. We directly compared adherence to these sequences, as well as event latencies, and quantified the percentage of unique sequences in two samples of healthy adults: young (< 45) and old (> 65). The 8 swallowing events that contribute to the sequences were reliably identified from videofluoroscopy in a sample of 23 healthy seniors (10 male, mean age 74.7) and 20 healthy young adults (10 male, mean age 31.5) with no evidence of penetration-aspiration or post-swallow residue. Chi-square analyses compared the proportions of obligatory pairs and unique sequences by age group. Compared to the older subjects, younger subjects had significantly lower adherence to two obligatory sequences: Upper Esophageal Sphincter (UES) opening occurs before (or simultaneous with) the bolus arriving at the UES and UES maximum distention occurs before maximum pharyngeal constriction. The associated latencies were significantly different between age groups as well. Further, significantly fewer unique swallow sequences were observed in the older group (61%) compared with the young (82%) (χ² = 31.8; p < 0.001). Our findings suggest that paired swallow event sequences may not be robust across the age continuum and that variation in swallow sequences appears to decrease with aging. These findings provide normative references for comparisons to older individuals with dysphagia.

  5. Novel numerical and graphical representation of DNA sequences and proteins.

    PubMed

    Randić, M; Novic, M; Vikić-Topić, D; Plavsić, D

    2006-12-01

    We have introduced novel numerical and graphical representations of DNA, which offer a simple and unique characterization of DNA sequences. The numerical representation of a DNA sequence is given as a sequence of real numbers derived from a unique graphical representation of the standard genetic code. There is no loss of information on the primary structure of a DNA sequence associated with this numerical representation. The novel representations are illustrated with the coding sequences of the first exon of beta-globin gene of half a dozen species in addition to human. The method can be extended to proteins as is exemplified by humanin, a 24-aa peptide that has recently been identified as a specific inhibitor of neuronal cell death induced by familial Alzheimer's disease mutant genes.

  6. Real-Time PCR Assay for a Unique Chromosomal Sequence of Bacillus anthracis

    DTIC Science & Technology

    2004-12-01

    ... 13061; Neisseria lactamica 23970; Bacillus coagulans ... NEG; Bacillus coagulans 7050, NEG, NEG; Bacillus cereus 13472, NEG, NEG; Bacillus licheniformis 12759, NEG, NEG; Bacillus cereus 13824, NEG, NEG; Bacillus ... Assay for a Unique Chromosomal Sequence of Bacillus anthracis. Elizabeth Bode, William Hurtle, and David Norwood. United States Army Medical ...

  7. Draft Genome Sequence of the Spore-Forming Probiotic Strain Bacillus coagulans Unique IS-2

    PubMed Central

    Upadrasta, Aditya; Pitta, Swetha

    2016-01-01

    Bacillus coagulans Unique IS-2 is a potential spore-forming probiotic that is commercially available on the market. The draft genome sequence presented here provides deep insight into the beneficial features of this strain for its safe use as a probiotic for various human and animal health applications. PMID:27103709

  8. Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes

    PubMed Central

    Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
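
    The counting rule described in this record (abundance equals the number of distinct barcode pairs seen for a cDNA, not the number of reads) can be illustrated with a minimal sketch. The tuple layout, the function name count_molecules, and the toy reads are assumptions made for illustration; barcode error correction, which the authors describe, is not modeled here.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct (left, right) barcode pairs observed per cDNA.

    `reads` is an iterable of (left_barcode, cdna, right_barcode) tuples,
    a hypothetical in-memory stand-in for parsed paired-end reads.
    """
    barcodes_per_cdna = defaultdict(set)
    for left, cdna, right in reads:
        # PCR duplicates carry the same barcode pair, so the set absorbs them.
        barcodes_per_cdna[cdna].add((left, right))
    return {cdna: len(pairs) for cdna, pairs in barcodes_per_cdna.items()}

reads = [
    ("AAC", "geneA", "GTT"),
    ("AAC", "geneA", "GTT"),  # same barcode pair: counted once
    ("CCA", "geneA", "TGA"),
    ("GGT", "geneB", "ACA"),
]
digital_counts = count_molecules(reads)
```

    Here geneA has three reads but only two distinct barcode pairs, so its digital count is two.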

  9. Sequencing Needs for Viral Diagnostics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, S N; Lam, M; Mulakken, N J

    2004-01-26

    We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives ("near neighbors") that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
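
    The two selection criteria named in this record (conserved among target strains, unique relative to other species) can be sketched with exact k-mer sets. This is not the authors' pipeline: the function names, the exact-match k-mer model, and the toy sequences are assumptions, and reverse complements and mismatches are ignored.

```python
def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def signature_candidates(target_strains, other_genomes, k):
    """k-mers present in every target strain and absent from all other genomes."""
    conserved = set.intersection(*(kmers(s, k) for s in target_strains))
    background = set().union(*(kmers(g, k) for g in other_genomes))
    return conserved - background

target_strains = ["ACGTACGGA", "TTACGGACC"]   # hypothetical strains of the target
near_neighbors = ["GGTACGTT"]                 # hypothetical near-neighbor genome
candidates = signature_candidates(target_strains, near_neighbors, k=4)
```

    In the toy data, TACG is conserved in both target strains but also occurs in the near neighbor, so it is rejected; adding near-neighbor genomes can only shrink the candidate set, and adding target strains can only shrink the conserved set.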

  10. Sequencing Adventure Activities: A New Perspective.

    ERIC Educational Resources Information Center

    Bisson, Christian

    Sequencing in adventure education involves putting activities in an order appropriate to the needs of the group. Contrary to the common assumption that each adventure sequence is unique, a review of literature concerning five sequencing models reveals a certain universality. These models present sequences that move through four phases: group…

  11. The neXtProt peptide uniqueness checker: a tool for the proteomics community.

    PubMed

    Schaeffer, Mathieu; Gateau, Alain; Teixeira, Daniel; Michel, Pierre-André; Zahn-Zabal, Monique; Lane, Lydie

    2017-11-01

    The neXtProt peptide uniqueness checker allows scientists to define which peptides can be used to validate the existence of human proteins, i.e. map uniquely versus multiply to human protein sequences taking into account isobaric substitutions, alternative splicing and single amino acid variants. The pepx program is available at https://github.com/calipho-sib/pepx and can be launched from the command line or through a cgi web interface. Indexing requires a sequence file in FASTA format. The peptide uniqueness checker tool is freely available on the web at https://www.nextprot.org/tools/peptide-uniqueness-checker and from the neXtProt API at https://api.nextprot.org/. Contact: lydie.lane@sib.swiss. © The Author(s) 2017. Published by Oxford University Press.
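
    The idea of mapping a peptide uniquely versus multiply while allowing isobaric substitutions can be sketched as follows. This is not the pepx implementation: it handles only the isoleucine/leucine pair (which have identical mass), ignores splicing and variants, and the protein names and sequences are invented for illustration.

```python
def normalize(seq):
    """Collapse the isobaric residues I and L into a single symbol."""
    return seq.replace("I", "L")

def matching_proteins(peptide, proteins):
    """Names of proteins containing the peptide, treating I and L as equal."""
    pep = normalize(peptide)
    return sorted(name for name, seq in proteins.items() if pep in normalize(seq))

# Hypothetical protein entries; a real run would read them from a FASTA file.
proteins = {"P1": "MKTLLAGR", "P2": "MKTILAGR", "P3": "MSSPEPTK"}

multi = matching_proteins("TLLAG", proteins)    # maps to P1 and, via I/L, to P2
unique = matching_proteins("SPEPT", proteins)   # maps to P3 only
```

    A peptide would be treated as usable for validating a protein only when the returned list has exactly one entry.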

  12. A Unique Sequence of Financial Accounting Courses Featuring Team Teaching, Linked Courses, Challenging Assignments, and Instruments for Evaluation and Assessment

    ERIC Educational Resources Information Center

    Lundblad, Heidemarie; Wilson, Barbara A.

    2008-01-01

    The Department of Accounting at California State University Northridge (CSUN) has developed a unique sequence of courses designed to ensure that accounting students are trained not only in technical accounting, but also acquire critical thinking, research and communication skills. The courses have proven effective and have embedded assessment…

  13. VISA--Vector Integration Site Analysis server: a web-based server to rapidly identify retroviral integration sites from next-generation sequencing.

    PubMed

    Hocum, Jonah D; Battrell, Logan R; Maynard, Ryan; Adair, Jennifer E; Beard, Brian C; Rawlings, David J; Kiem, Hans-Peter; Miller, Daniel G; Trobridge, Grant D

    2015-07-07

    Analyzing the integration profile of retroviral vectors is a vital step in determining their potential genotoxic effects and developing safer vectors for therapeutic use. Identifying retroviral vector integration sites is also important for retroviral mutagenesis screens. We developed VISA, a vector integration site analysis server, to analyze next-generation sequencing data for retroviral vector integration sites. Sequence reads that contain a provirus are mapped to the human genome, sequence reads that cannot be localized to a unique location in the genome are filtered out, and then unique retroviral vector integration sites are determined based on the alignment scores of the remaining sequence reads. VISA offers a simple web interface to upload sequence files and results are returned in a concise tabular format to allow rapid analysis of retroviral vector integration sites.

  14. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background: Cotton (Gossypium hirsutum L.) is one of the world’s most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870

  15. Points of View: A Survey of Survey Courses--Are They Effective? A Unique Approach? Four Semesters of Biology Core Curriculum

    ERIC Educational Resources Information Center

    Batzli, Janet M.

    2005-01-01

    ''Why four semesters? How does this track differ from the two-semester course sequence?'' These are the most common questions students have when they learn about the Biology Core Curriculum (Biocore), a unique four-semester honors biology sequence at University of Wisconsin-Madison (UW-Madison). Biocore was first taught at University of Wisconsin…

  16. The complete chloroplast genome sequence of Epipremnum aureum and its comparative analysis among eight Araceae species

    PubMed Central

    Han, Limin; Chen, Chen; Wang, Zhezhi

    2018-01-01

    Epipremnum aureum is an important foliage plant in the Araceae family. In this study, we have sequenced the complete chloroplast genome of E. aureum by using Illumina Hiseq sequencing platforms. This genome is a double-stranded circular DNA sequence of 164,831 bp that contains 35.8% GC. The two inverted repeats (IRa and IRb; 26,606 bp) are spaced by a small single-copy region (22,868 bp) and a large single-copy region (88,751 bp). The chloroplast genome has 131 (113 unique) functional genes, including 86 (79 unique) protein-coding genes, 37 (30 unique) tRNA genes, and eight (four unique) rRNA genes. Tandem repeats comprise the majority of the 43 long repetitive sequences. In addition, 111 simple sequence repeats are present, with mononucleotides being the most common type and di- and tetranucleotides being infrequent events. Positive selection pressure on rps12 in the E. aureum chloroplast has been demonstrated via synonymous and nonsynonymous substitution rates and selection pressure sites analyses. Ycf15 and infA are pseudogenes in this species. We constructed a Maximum Likelihood phylogenetic tree based on the complete chloroplast genomes of 38 species from 13 families. Those results strongly indicated that E. aureum is positioned as the sister of Colocasia esculenta within the Araceae family. This work may provide information for further study of the molecular phylogenetic relationships within Araceae, as well as molecular markers and breeding novel varieties by chloroplast genetic-transformation of E. aureum in particular. PMID:29529038

  17. [Identification and phylogenetic application of unique nucleotide sequence of nad7 intron2 in Rhodiola (Crassulaceae) species].

    PubMed

    Deng, Ke-Jun; Yang, Zu-Jun; Liu, Cheng; Zhao, Wei; Liu, Chang; Feng, Juan; Ren, Zheng-Long

    2007-03-01

    The genetic characteristics of 9 populations of Rhodiola crenulata, R. fastigiata and R. sachalinensis (Crassulaceae) from Sichuan and Jilin Provinces of China were investigated using conserved primers for nad7 intron 2. All PCR products were about 800 bp long, shorter than those of other Crassulaceae plants, and were used as molecular markers to identify the Rhodiola species. Sequencing of the products indicated that the 53 bp of exon and 738 bp of intron exhibited only 9 nucleotide variations. BLAST searches of the nad7 sequences against GenBank and phylogenetic analysis showed that the sequences of the Rhodiola species clustered independently and were shorter than all registered sequences of higher plants. The results suggest that the Rhodiola species have a unique sequence in this gene region, which might be related to their special growth conditions.

  18. PuLSE: Quality control and quantification of peptide sequences explored by phage display libraries.

    PubMed

    Shave, Steven; Mann, Stefan; Koszela, Joanna; Kerr, Alastair; Auer, Manfred

    2018-01-01

    The design of highly diverse phage display libraries is based on the assumption that DNA bases are incorporated at similar rates within the randomized sequence. As library complexity increases and expected copy numbers of unique sequences decrease, the exploration of library space becomes sparser and the presence of truly random sequences becomes critical. We present the program PuLSE (Phage Library Sequence Evaluation) as a tool for assessing randomness and therefore diversity of phage display libraries. PuLSE runs on a collection of sequence reads in the fastq file format and generates tables profiling the library in terms of unique DNA sequence counts and positions, translated peptide sequences, and normalized 'expected' occurrences from base to residue codon frequencies. The output allows at-a-glance quantitative quality control of a phage library in terms of sequence coverage both at the DNA base and translated protein residue level, which has been missing from toolsets and literature. The open source program PuLSE is available in two formats, a C++ source code package for compilation and integration into existing bioinformatics pipelines and precompiled binaries for ease of use.
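
    The first profiling step mentioned in this record, tallying unique DNA sequences from fastq reads, can be sketched in a few lines. This is not PuLSE itself (which is a C++ program); the function name and the in-memory example are assumptions, and the sketch assumes the common four-lines-per-record fastq layout without wrapped sequences.

```python
from collections import Counter
import io

def count_unique_sequences(handle):
    """Tally each distinct read sequence in a 4-line-per-record fastq stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record is the sequence
            counts[line.strip()] += 1
    return counts

# Tiny in-memory fastq used in place of a real file.
fastq = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nTTGA\n+\nIIII\n"
)
counts = count_unique_sequences(fastq)
```

    Comparing such observed copy numbers with the copy numbers expected under uniform base incorporation is the kind of check the abstract describes.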

  19. Equivalent Indels – Ambiguous Functional Classes and Redundancy in Databases

    PubMed Central

    Assmus, Jens; Kleffe, Jürgen; Schmitt, Armin O.; Brockmann, Gudrun A.

    2013-01-01

    There is considerable interest in studying sequenced variations. However, while the positions of substitutions are uniquely identifiable by sequence alignment, the location of insertions and deletions still poses problems. Each insertion and deletion causes a change of sequence. Yet, due to low complexity or repetitive sequence structures, the same indel can sometimes be annotated in different ways. Two indels which differ in allele sequence and position can be one and the same, i.e. the alternative sequence of the whole chromosome is identical in both cases and, therefore, the two deletions are biologically equivalent. In such a case, it is impossible to identify the exact position of an indel merely based on sequence alignment. Thus, variation entries in a mutation database are not necessarily uniquely defined. We prove the existence of a contiguous region around an indel in which all deletions of the same length are biologically identical. Databases often show only one of several possible locations for a given variation. Furthermore, different database entries can represent equivalent variation events. We identified 1,045,590 such problematic entries of insertions and deletions out of 5,860,408 indel entries in the current human database of Ensembl. Equivalent indels are found in sequence regions of different functions like exons, introns or 5' and 3' UTRs. One and the same variation can be assigned to several different functional classifications of which only one is correct. We implemented an algorithm that determines for each indel database entry its complete set of equivalent indels which is uniquely characterized by the indel itself and a given interval of the reference sequence. PMID:23658777
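
    The contiguous region of equivalent deletions described in this record can be illustrated for the deletion case with a short sketch. This is not the authors' algorithm or code: the function name, the 0-based coordinates, and the toy reference are assumptions. It relies on the observation that shifting a deletion by one base leaves the resulting sequence unchanged exactly when the base leaving the deleted window equals the base entering it.

```python
def equivalent_deletion_starts(ref, start, length):
    """All 0-based start positions whose deletion of `length` bases gives the
    same sequence as deleting ref[start:start + length]."""
    left = start
    # Shift left while the base entering the window matches the one leaving it.
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    right = start
    # Shift right under the mirrored condition.
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return list(range(left, right + 1))

ref = "GACACACT"
starts = equivalent_deletion_starts(ref, start=3, length=2)
results = {ref[:s] + ref[s + 2:] for s in starts}
```

    In the repetitive toy reference, a 2-base deletion annotated at position 3 is indistinguishable from the same-length deletions at positions 1 through 5, which is why a single database entry may stand for several equivalent annotations.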

  20. De novo assembly, characterization and functional annotation of pineapple fruit transcriptome through massively parallel sequencing.

    PubMed

    Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

    2012-01-01

    Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanism and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%) which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple.

  1. De Novo Assembly, Characterization and Functional Annotation of Pineapple Fruit Transcriptome through Massively Parallel Sequencing

    PubMed Central

    Ong, Wen Dee; Voo, Lok-Yung Christopher; Kumar, Vijay Subbiah

    2012-01-01

    Background: Pineapple (Ananas comosus var. comosus) is an important tropical non-climacteric fruit with high commercial potential. Understanding the mechanism and processes underlying fruit ripening would enable scientists to improve quality traits such as flavor, texture, appearance and fruit sweetness. Although the pineapple is an important fruit, insufficient transcriptomic or genomic information is available in public databases. Application of high throughput transcriptome sequencing to profile the pineapple fruit transcripts is therefore needed. Methodology/Principal Findings: To facilitate this, we have performed transcriptome sequencing of ripe yellow pineapple fruit flesh using Illumina technology. About 4.7 million Illumina paired-end reads were generated and assembled using the Velvet de novo assembler. The assembly produced 28,728 unique transcripts with a mean length of approximately 200 bp. Sequence similarity search against the non-redundant NCBI database identified a total of 16,932 unique transcripts (58.93%) with significant hits. Out of these, 15,507 unique transcripts were assigned to gene ontology terms. Functional annotation against the Kyoto Encyclopedia of Genes and Genomes pathway database identified 13,598 unique transcripts (47.33%) which were mapped to 126 pathways. The assembly revealed many transcripts that were previously unknown. Conclusions: The unique transcripts derived from this work have rapidly increased the number of pineapple fruit mRNA transcripts now available in public databases. This information can be further utilized in gene expression, genomics and other functional genomics studies in pineapple. PMID:23091603

  2. Mosaic Graphs and Comparative Genomics in Phage Communities

    PubMed Central

    Belcaid, Mahdi; Bergeron, Anne

    2010-01-01

    Comparing the genomes of two closely related viruses often produces mosaics where nearly identical sequences alternate with sequences that are unique to each genome. When several closely related genomes are compared, the unique sequences are likely to be shared with third genomes, leading to virus mosaic communities. Here we present comparative analysis of sets of Staphylococcus aureus phages that share large identical sequences with up to three other genomes, and with different partners along their genomes. We introduce mosaic graphs to represent these complex recombination events, and use them to illustrate the breadth and depth of sequence sharing: some genomes are almost completely made up of shared sequences, while genomes that share very large identical sequences can adopt alternate functional modules. Mosaic graphs also allow us to identify breakpoints that could eventually be used for the construction of recombination networks. These findings have several implications on phage metagenomics assembly, on the horizontal gene transfer paradigm, and more generally on the understanding of the composition and evolutionary dynamics of virus communities. PMID:20874413

  3. Production of Supra-regular Spatial Sequences by Macaque Monkeys.

    PubMed

    Jiang, Xinjian; Long, Tenghai; Cao, Weicong; Li, Junru; Dehaene, Stanislas; Wang, Liping

    2018-06-18

    Understanding and producing embedded sequences in language, music, or mathematics, is a central characteristic of our species. These domains are hypothesized to involve a human-specific competence for supra-regular grammars, which can generate embedded sequences that go beyond the regular sequences engendered by finite-state automata. However, is this capacity truly unique to humans? Using a production task, we show that macaque monkeys can be trained to produce time-symmetrical embedded spatial sequences whose formal description requires supra-regular grammars or, equivalently, a push-down stack automaton. Monkeys spontaneously generalized the learned grammar to novel sequences, including longer ones, and could generate hierarchical sequences formed by an embedding of two levels of abstract rules. Compared to monkeys, however, preschool children learned the grammars much faster using a chunking strategy. While supra-regular grammars are accessible to nonhuman primates through extensive training, human uniqueness may lie in the speed and learning strategy with which they are acquired. Copyright © 2018 Elsevier Ltd. All rights reserved.

  4. Unique core genomes of the bacterial family vibrionaceae: insights into niche adaptation and speciation.

    PubMed

    Kahlke, Tim; Goesmann, Alexander; Hjerde, Erik; Willassen, Nils Peder; Haugen, Peik

    2012-05-10

    The criteria for defining bacterial species and even the concept of bacterial species itself are under debate, and the discussion is apparently intensifying as more genome sequence data is becoming available. However, it is still unclear how the new advances in genomics should be used most efficiently to address this question. In this study we identify genes that are common to any group of genomes in our dataset, to determine whether genes specific to a particular taxon exist and to investigate their potential role in adaptation of bacteria to their specific niche. These genes were named unique core genes. Additionally, we investigate the existence and importance of unique core genes that are found in isolates of phylogenetically non-coherent groups. These groups of isolates, which share a genetic feature without sharing a closest common ancestor, are termed genophyletic groups. The bacterial family Vibrionaceae was used as the model, and we compiled and compared genome sequences of 64 different isolates. Using the software orthoMCL we determined clusters of homologous genes among the investigated genome sequences. We used multilocus sequence analysis to build a host phylogeny and mapped the numbers of unique core genes of all distinct groups of isolates onto the tree. The results show that unique core genes are more likely to be found in monophyletic groups of isolates. Genophyletic groups of isolates, in contrast, are less common, especially for large groups of isolates. The subsequent annotation of unique core genes that are present in genophyletic groups indicates a high degree of horizontally transferred genes. Finally, the annotation of the unique core genes of Vibrio cholerae revealed genes involved in aerotaxis and biosynthesis of the iron-chelator vibriobactin. The presented work indicates that genes specific for any taxon inside the bacterial family Vibrionaceae exist. These unique core genes encode conserved metabolic functions that can shed light on the adaptation of a species to its ecological niche. Additionally, our study suggests that unique core genes can be used to aid classification of bacteria and contribute to a bacterial species definition on a genomic level. Furthermore, these genes may be of importance in clinical diagnostics and drug development.
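
    One plausible reading of "unique core genes" in this record is: gene clusters present in every isolate of a group and in no isolate outside it. Under that assumption, the selection reduces to set operations on a presence table. The function name, isolate labels, and cluster IDs below are invented for illustration and do not come from the study.

```python
def unique_core_genes(presence, group):
    """Gene clusters found in all isolates of `group` and in none outside it.

    `presence` maps isolate name -> set of homologous-cluster IDs
    (for example, as produced by a clustering tool).
    """
    inside = [genes for iso, genes in presence.items() if iso in group]
    outside = [genes for iso, genes in presence.items() if iso not in group]
    core = set.intersection(*inside)        # shared by every group member
    elsewhere = set().union(*outside)       # seen in any non-member
    return core - elsewhere

presence = {
    "isolate_A1": {"c1", "c2", "c3"},
    "isolate_A2": {"c1", "c2", "c4"},
    "isolate_B1": {"c1", "c4"},
}
specific = unique_core_genes(presence, group={"isolate_A1", "isolate_A2"})
```

    The same function applies to any group of isolates, whether or not the group is monophyletic on the host phylogeny.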

  5. In vitro resolution of the dimer bridge of the minute virus of mice (MVM) genome supports the modified rolling hairpin model for MVM replication.

    PubMed

    Liu, Q; Yong, C B; Astell, C R

    1994-06-01

    Previous characterization of the terminal sequences of the minute virus of mice (MVM) genome demonstrated that the right hand palindrome contains two sequences, each the inverted complement of the other. However, the left hand palindrome was shown to exist as a unique sequence [Astell et al., J. Virol. 54: 179-185 (1985)]. The modified rolling hairpin (MRH) model for MVM replication provided an explanation of how the right hand palindrome could undergo hairpin transfer to generate two sequences, while the left end palindrome within the dimer bridge could undergo asymmetric resolution and retain the unique left end sequence. This report describes in vitro resolution of the wild-type dimer bridge sequence of MVM using recombinant (baculovirus) expressed NS-1 and a replication extract from LA9 cells. The resolution products are consistent with those predicted by the MRH model, providing support for this replication mechanism. In addition, mutant dimer bridge clones were constructed and used in the resolution assay. The mutant structures included removal of the asymmetry in the hairpin stem, inversion of the sequence at the initiating nick site, and a 2-bp deletion within one stem of the dimer bridge. In all cases, the mutant dimer bridge structures are resolved; however, the resolution pattern observed with the mutant dimer bridge compared with the wild-type dimer bridge is shifted toward symmetrical resolution. These results suggest that sequences within the left hand hairpin (and hence dimer bridge sequence) are responsible for asymmetric resolution and conservation of the unique sequence within the left hand palindrome of the MVM genome.

  6. Unique Variants in OPN1LW Cause Both Syndromic and Nonsyndromic X-Linked High Myopia Mapped to MYP1.

    PubMed

    Li, Jiali; Gao, Bei; Guan, Liping; Xiao, Xueshan; Zhang, Jianguo; Li, Shiqiang; Jiang, Hui; Jia, Xiaoyun; Yang, Jianhua; Guo, Xiangming; Yin, Ye; Wang, Jun; Zhang, Qingjiong

    2015-06-01

    MYP1 is a locus for X-linked syndromic and nonsyndromic high myopia. Recently, unique haplotypes in OPN1LW were found to be responsible for X-linked syndromic high myopia mapped to MYP1. The current study tests whether such variants in OPN1LW are also responsible for X-linked nonsyndromic high myopia mapped to MYP1. The proband of the family previously mapped to MYP1 was initially analyzed using whole-exome sequencing and whole-genome sequencing. Additional probands with early-onset high myopia were analyzed using whole-exome sequencing. Variants in OPN1LW were selected and confirmed by Sanger sequencing. Long-range and second PCR were used to determine the haplotype and the first gene of the red-green gene array. Candidate variants were further validated in family members and controls. The unique LVAVA haplotype in OPN1LW was detected in the family with X-linked nonsyndromic high myopia mapped to MYP1. In addition, this haplotype and a novel frameshift mutation (c.617_620dup, p.Phe208Argfs*51) in OPN1LW were detected in two other families with X-linked high myopia. The unique haplotype cosegregated with high myopia in the two families, with maximum LOD scores of 3.34 and 2.31 at θ = 0. OPN1LW with the variants in these families was the first gene in the red-green gene array and was not present in 247 male controls. Reevaluation of the clinical data in both families with the unique haplotype suggested nonsyndromic high myopia. Our study confirms the findings that unique variants in OPN1LW are responsible for both syndromic and nonsyndromic X-linked high myopia mapped to MYP1.

  7. Analysis of the transcriptome of Panax notoginseng root uncovers putative triterpene saponin-biosynthetic genes and genetic markers

    PubMed Central

    2011-01-01

    Background: Panax notoginseng (Burk) F.H. Chen is an important medicinal plant of the Araliaceae family. Triterpene saponins are the bioactive constituents in P. notoginseng. However, available genomic information regarding this plant is limited. Moreover, details of triterpene saponin biosynthesis in the Panax species are largely unknown. Results: Using the 454 pyrosequencing technology, a one-quarter GS FLX Titanium run resulted in 188,185 reads with an average length of 410 bases for P. notoginseng root. These reads were processed and assembled by 454 GS De Novo Assembler software into 30,852 unique sequences. A total of 70.2% of unique sequences were annotated by Basic Local Alignment Search Tool (BLAST) similarity searches against public sequence databases. The Kyoto Encyclopedia of Genes and Genomes (KEGG) assignment discovered 41 unique sequences representing 11 genes involved in triterpene saponin backbone biosynthesis in the 454-EST dataset. In particular, the transcript encoding dammarenediol synthase (DS), which is the first committed enzyme in the biosynthetic pathway of major triterpene saponins, is highly expressed in the root of four-year-old P. notoginseng. It is worth emphasizing that the candidate cytochrome P450 (Pn02132 and Pn00158) and UDP-glycosyltransferase (Pn00082) genes most likely to be involved in hydroxylation or glycosylation of aglycones for triterpene saponin biosynthesis were discovered from 174 cytochrome P450s and 242 glycosyltransferases by phylogenetic analysis, respectively. Putative transcription factors were detected in 906 unique sequences, including Myb, homeobox, WRKY, basic helix-loop-helix (bHLH), and other family proteins. Additionally, a total of 2,772 simple sequence repeats (SSRs) were identified in 2,361 unique sequences, among which di-nucleotide motifs were the most abundant. Conclusion: This study is the first to present a large-scale EST dataset for P. notoginseng root acquired by next-generation sequencing (NGS) technology. The candidate genes involved in triterpene saponin biosynthesis, including the putative CYP450s and UGTs, were obtained in this study. Additionally, the identification of SSRs provided numerous genetic markers for molecular breeding and genetics applications in this species. These data will provide information on gene discovery, transcriptional regulation and marker-assisted selection for P. notoginseng. The dataset establishes an important foundation for studies aimed at ensuring adequate drug resources for this species. PMID:22369100
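
    Simple sequence repeat (SSR) identification, as mentioned in this record, amounts to finding short motifs repeated in tandem. The sketch below uses a regular expression with a back-reference; the function name, the minimum-repeat threshold, and the toy sequence are assumptions, since the abstract does not state the search criteria or the tool used.

```python
import re

def find_ssrs(seq, motif_len, min_repeats):
    """Return (start, motif, repeat_count) for each tandem repeat of a motif of
    length `motif_len` occurring at least `min_repeats` times in a row."""
    # ([ACGT]{n}) captures one motif; \1{m,} requires m further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if motif_len > 1 and len(set(motif)) == 1:
            continue  # a run such as AAAA is a mononucleotide repeat, not AA x2
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits

seq = "TTGAGAGAGAGACCT"
dinucleotide_ssrs = find_ssrs(seq, motif_len=2, min_repeats=4)
```

    Running the function for motif lengths 1 through 6 and tallying hits by motif length would give a per-class count of the kind summarized in the abstract.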

  8. Animal selection for whole genome sequencing by quantifying the unique contribution of homozygous haplotypes sequenced

    USDA-ARS's Scientific Manuscript database

    Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...

  9. Using the self-select paradigm to delineate the nature of speech motor programming.

    PubMed

    Wright, David L; Robin, Don A; Rhee, Jooyhun; Vaculin, Amber; Jacks, Adam; Guenther, Frank H; Fox, Peter T

    2009-06-01

    The authors examined the involvement of 2 speech motor programming processes identified by S. T. Klapp (1995, 2003) during the articulation of utterances differing in syllable and sequence complexity. According to S. T. Klapp, 1 process, INT, resolves the demands of the programmed unit, whereas a second process, SEQ, oversees the serial order demands of longer sequences. A modified reaction time paradigm was used to assess INT and SEQ demands. Specifically, syllable complexity was dependent on syllable structure, whereas sequence complexity involved either repeated or unique syllables within an utterance. INT execution was slowed when articulating single syllables in the form CCCV compared to simpler CV syllables. Planning unique syllables within a multisyllabic utterance rather than repetitions of the same syllable slowed INT but not SEQ. The INT speech motor programming process, important for mental syllabary access, is sensitive to changes in both syllable structure and the number of unique syllables in an utterance.

  10. Quantitative statistical analysis of cis-regulatory sequences in ABA/VP1- and CBF/DREB1-regulated genes of Arabidopsis.

    PubMed

    Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R

    2005-09-01

    We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
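
    The idea of scoring oligo sequences for overrepresentation in a coregulated promoter set can be sketched as follows. This is not the MotifFinder program itself: the hypergeometric test, the function names, and the toy promoters are all assumptions made for illustration.

```python
from math import comb


def kmers_in(seq, k):
    """Set of distinct k-mers present in one promoter sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hypergeom_tail(x, n_total, n_with, n_drawn):
    """P(X >= x) when n_drawn promoters are drawn from n_total, n_with carrying the k-mer."""
    upper = min(n_with, n_drawn)
    favourable = sum(
        comb(n_with, i) * comb(n_total - n_with, n_drawn - i)
        for i in range(x, upper + 1)
    )
    return favourable / comb(n_total, n_drawn)


def enriched_kmers(coregulated, others, k=6, alpha=0.01):
    """Return (kmer, promoters_hit, p_value) for k-mers overrepresented in coregulated."""
    co_sets = [kmers_in(s, k) for s in coregulated]
    other_sets = [kmers_in(s, k) for s in others]
    n_total = len(co_sets) + len(other_sets)
    results = []
    for kmer in set().union(*co_sets):
        x = sum(kmer in s for s in co_sets)
        n_with = x + sum(kmer in s for s in other_sets)
        p = hypergeom_tail(x, n_total, n_with, len(co_sets))
        if p <= alpha:
            results.append((kmer, x, p))
    return sorted(results, key=lambda r: r[2])


# Toy data: four coregulated promoters sharing an ACGT-core element, eight others.
coregulated = ["TTACGTGGCA", "GGACGTGTCC", "CCACGTGGAT", "ATACGTGTTA"]
others = ["TTTTTTTTTT", "GGGGGGGGGG", "CCCCCCCCCC", "AAAAAAAAAA",
          "TGTGTGTGTG", "CACACACACA", "GAGAGAGAGA", "TCTCTCTCTC"]
hits = enriched_kmers(coregulated, others, k=5)
```

    In the toy data only the 5-mer shared by all four coregulated promoters passes the threshold; k-mers that differ in the bases flanking the core occur in too few promoters to reach significance, which mirrors the kind of flanking-nucleotide dissection the abstract describes.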

  11. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    PubMed Central

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  12. Novel application of the MSSCP method in biodiversity studies.

    PubMed

    Tomczyk-Żak, Karolina; Kaczanowski, Szymon; Górecka, Magdalena; Zielenkiewicz, Urszula

    2012-02-01

    Analysis of 16S rRNA sequence diversity is widely performed for characterizing the biodiversity of microbial samples. The number of determined sequences has a considerable impact on complete results. Although the cost of mass sequencing is decreasing, it is often still too high for individual projects. We applied the multi-temperature single-strand conformational polymorphism (MSSCP) method to decrease the number of analysed sequences. This was a novel application of this method. As a control, the same sample was analysed using random sequencing. In this paper, we adapted the MSSCP technique for screening of unique sequences of the 16S rRNA gene library and bacterial strains isolated from biofilms growing on the walls of an ancient gold mine in Poland and determined whether the results obtained by both methods differed and whether random sequencing could be replaced by MSSCP. Although it was biased towards the detection of rare sequences in the samples, the qualitative results of MSSCP were not different than those of random sequencing. Unambiguous discrimination of unique clones and strains creates an opportunity to effectively estimate the biodiversity of natural communities, especially in populations which are numerous but species poor. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Unique LCR variations among lineages of HPV16, 18 and 45 isolates from women with normal cervical cytology in Ghana.

    PubMed

    Awua, Adolf K; Adanu, Richard M K; Wiredu, Edwin K; Afari, Edwin A; Zubuch, Vanessa A; Asmah, Richard H; Severini, Alberto

    2017-04-21

    In addition to being useful for classification, sequence variations of human papillomavirus (HPV) genotypes have been implicated in differential oncogenic potential and a differential association with the different histological forms of invasive cervical cancer. These associations have also been indicated for HPV genotype lineages and sub-lineages. In order to better understand the potential implications of lineage variation in the occurrence of cervical cancers in Ghana, we studied the lineages of the three most prevalent HPV genotypes among women with normal cytology as a baseline for further studies. Of previously collected self- and health personnel-collected cervical specimens, 54 that were positive for HPV16, 18 and 45 were selected, and the long control region (LCR) of each HPV genotype was separately amplified by a nested PCR. DNA sequences of 41 isolates obtained with the forward and reverse primers by Sanger sequencing were analysed. Nucleotide sequence variations of the HPV16 genotypes were observed at 30 positions within the LCR (7460 - 7840). Of these, 19 were the known variations for the lineages B and C (African lineages), while the other 11 positions had variations unique to the HPV16 isolates of this study. For the HPV18 isolates, the variations were at 35 positions, 22 of which were known variations of African lineages and the other 13 were unique variations observed for the isolates obtained in this study (at positions 7799 and 7813). HPV45 isolates had variations at 35 positions and 2 (positions 7114 and 97) were unique to the isolates of this study. This study provides the first data on the lineages of HPV 16, 18 and 45 isolates from Ghana. Although the study did not obtain full genome sequence data for a comprehensive comparison with known lineages, these genotypes were predominantly of the African lineages and had some unique sequence variations at positions that suggest potential oncogenic implications. These data will be useful for comparison with lineages of these genotypes from women with cervical lesions and all the forms of invasive cervical cancer.

  14. DSAP: deep-sequencing small RNA analysis pipeline.

    PubMed

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
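
    Steps (i) and (ii) of the workflow above can be sketched on a tab-delimited input of tags and copy numbers. This is an illustration only, not DSAP's code: the adaptor string, the minimum length, and the 80% single-base threshold used to flag poly-A/T/C/G/N tags are hypothetical values.

```python
import csv
import io
from collections import Counter

ADAPTOR = "TCGTATGCCG"  # hypothetical 3' adaptor prefix, for illustration only


def trim_adaptor(seq, adaptor=ADAPTOR):
    """Cut the read at the first occurrence of the adaptor, if any."""
    idx = seq.find(adaptor)
    return seq[:idx] if idx != -1 else seq


def is_homopolymer_like(seq, max_fraction=0.8):
    """Flag empty tags and tags dominated by a single base (poly-A/T/C/G/N)."""
    return not seq or max(seq.count(b) for b in "ACGTN") / len(seq) >= max_fraction


def clean_and_cluster(handle, min_len=15):
    """Step (i): clean each tag. Step (ii): merge identical cleaned tags, summing copies."""
    clusters = Counter()
    for seq, copies in csv.reader(handle, delimiter="\t"):
        tag = trim_adaptor(seq.upper())
        if len(tag) < min_len or is_homopolymer_like(tag):
            continue
        clusters[tag] += int(copies)
    return clusters


example_input = io.StringIO(
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t120\n"
    "TGAGGTAGTAGGTTGTATAGTT\t30\n"
    "AAAAAAAAAAAAAAAAAAAATCGTATGCCG\t500\n"
    "ACGTTCGTATGCCG\t7\n"
)
clusters = clean_and_cluster(example_input)
```

    The two reads that reduce to the same 22-nt tag are merged into one cluster with their copy numbers summed, while the poly-A tag and the 4-nt fragment are discarded; the resulting clusters are what a later matching step against ncRNA or miRNA references would consume.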

  15. Abamectin, pymetrozine and azadirachtin sequence as a unique solution to control the leafminer Liriomyza trifolii (Burgess) (Diptera: Agromyzidae) infesting garden beans (Phaseolus vulgaris L.) in Egypt.

    PubMed

    Saad, A S A; Massoud, M A; Abdel-Megeed, A A M; Hamid, N A; Mourad, A K K; Barakat, A S T

    2007-01-01

    Field trials were conducted to determine the performance of three different sequences as a unique solution for the control of the leaf miner Liriomyza trifolii (Burgess) (Diptera: Agromyzidae) infesting garden beans (Phaseolus vulgaris L.) during the two successive seasons of 2004 and 2005. Furthermore, during the evaluation period, the side effect against the ectoparasite Diglyphus isaea (Walker) (Hymenoptera: Eulophidae) was taken into consideration. Meanwhile, the comparative evaluation of the pesticides alone showed that abamectin and azadirachtin were highly effective against Liriomyza trifolii, while carbosulfan, pymetrozine and thiamethoxam proved to have a moderate effect. Moreover, carbosulfan showed a harmful effect on the larvae of the ectoparasite Diglyphus isaea (Walker), while abamectin and azadirachtin had a moderate effect. Thiamethoxam and the detergent (Masrol 410) had a slight effect in this respect. The most effective of the sequences against Liriomyza trifolii (Burgess) was abamectin, pymetrozine and azadirachtin, with a slight harmful effect on Diglyphus isaea (Walker). However, the sequence of azadirachtin, pymetrozine and abamectin had a moderate effect on Liriomyza trifolii (Burgess) and exhibited a slight toxic effect on Diglyphus isaea (Walker). In contrast, the sequence of carbosulfan, thiamethoxam and pymetrozine was the least effective and had a slight effect on Diglyphus isaea (Walker). From this study, it was concluded that the abamectin, pymetrozine and azadirachtin sequence has proved to be a unique solution for the control of the leaf miner Liriomyza trifolii (Burgess) infesting garden beans (Phaseolus vulgaris L.) in Egypt.

  16. Image Encryption Algorithm Based on Hyperchaotic Maps and Nucleotide Sequences Database

    PubMed Central

    2017-01-01

    Image encryption technology is one of the main means to ensure the safety of image information. Using the characteristics of chaos, such as randomness, regularity, ergodicity, and initial value sensitiveness, combined with the unique space conformation of DNA molecules and their unique information storage and processing ability, an efficient method for image encryption based on the chaos theory and a DNA sequence database is proposed. In this paper, digital image encryption employs a process of transforming the image pixel gray value by using chaotic sequence scrambling image pixel location and establishing superchaotic mapping, which maps quaternary sequences and DNA sequences, and by combining with the logic of the transformation between DNA sequences. The bases are replaced under the displaced rules by using DNA coding in a certain number of iterations that are based on the enhanced quaternary hyperchaotic sequence; the sequence is generated by Chen chaos. The cipher feedback mode and chaos iteration are employed in the encryption process to enhance the confusion and diffusion properties of the algorithm. Theoretical analysis and experimental results show that the proposed scheme not only demonstrates excellent encryption but also effectively resists chosen-plaintext attack, statistical attack, and differential attack. PMID:28392799

  17. Je, a versatile suite to handle multiplexed NGS libraries with unique molecular identifiers.

    PubMed

    Girardot, Charles; Scholtalbers, Jelle; Sauer, Sajoscha; Su, Shu-Yi; Furlong, Eileen E M

    2016-10-08

    The yield obtained from next generation sequencers has increased almost exponentially in recent years, making sample multiplexing common practice. While barcodes (known sequences of fixed length) primarily encode the sample identity of sequenced DNA fragments, barcodes made of random sequences (Unique Molecular Identifiers, or UMIs) are often used to distinguish between PCR duplicates and transcript abundance in, for example, single-cell RNA sequencing (scRNA-seq). In paired-end sequencing, different barcodes can be inserted at each fragment end to either increase the number of multiplexed samples in the library or to use one of the barcodes as UMI. Alternatively, UMIs can be combined with the sample barcodes into composite barcodes, or with standard Illumina® indexing. Subsequent analysis must take read duplicates and sample identity into account, by identifying UMIs. Existing tools do not support these complex barcoding configurations and custom code development is frequently required. Here, we present Je, a suite of tools that accommodates complex barcoding strategies, extracts UMIs and filters read duplicates taking UMIs into account. Using Je on publicly available scRNA-seq and iCLIP data containing UMIs, the number of unique reads increased by up to 36 %, compared to when UMIs are ignored. Je is implemented in JAVA and uses the Picard API. Code, executables and documentation are freely available at http://gbcs.embl.de/Je. Je can also be easily installed in Galaxy through the Galaxy toolshed.
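
    Why the number of unique reads rises when UMIs are taken into account can be shown with a small sketch. It is written in Python for illustration and is not Je's Java implementation; the read dictionaries and the function name `filter_duplicates` are hypothetical.

```python
def filter_duplicates(reads, use_umi=True):
    """Keep the first read seen for each duplicate key.

    Without UMIs the key is the mapping position only; with UMIs, reads at the
    same position but carrying different UMIs are kept as distinct molecules.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"])
        if use_umi:
            key += (read["umi"],)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Four reads mapping to the same position; two share a UMI (a PCR duplicate).
reads = [
    {"id": "r1", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "AACG"},
    {"id": "r2", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "AACG"},
    {"id": "r3", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "TTGC"},
    {"id": "r4", "chrom": "chr1", "pos": 1000, "strand": "+", "umi": "GGAT"},
]
kept_with_umi = filter_duplicates(reads, use_umi=True)
kept_without_umi = filter_duplicates(reads, use_umi=False)
```

    Position-only filtering collapses all four reads into one, whereas the UMI-aware key retains three, which is the effect the abstract reports at scale.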

  18. Proteins without unique 3D structures: biotechnological applications of intrinsically unstable/disordered proteins.

    PubMed

    Uversky, Vladimir N

    2015-03-01

    Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are functional proteins or regions that do not have unique 3D structures under functional conditions. Therefore, from the viewpoint of their lack of stable 3D structure, IDPs/IDPRs are inherently unstable. As much as structure and function of normal ordered globular proteins are determined by their amino acid sequences, the lack of unique 3D structure in IDPs/IDPRs and their disorder-based functionality are also encoded in the amino acid sequences. Because of their specific sequence features and distinctive conformational behavior, these intrinsically unstable proteins or regions have several applications in biotechnology. This review introduces some of the most characteristic features of IDPs/IDPRs (such as peculiarities of amino acid sequences of these proteins and regions, their major structural features, and peculiar responses to changes in their environment) and describes how these features can be used in biotechnology, for example for the proteome-wide analysis of the abundance of extended IDPs, for recombinant protein isolation and purification, as polypeptide nanoparticles for drug delivery, as solubilization tools, and as thermally sensitive carriers of active peptides and proteins. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Using the Self-Select Paradigm to Delineate the Nature of Speech Motor Programming

    PubMed Central

    Wright, David L.; Robin, Don A.; Rhee, Jooyhun; Vaculin, Amber; Jacks, Adam; Guenther, Frank H.; Fox, Peter T.

    2015-01-01

    Purpose The authors examined the involvement of 2 speech motor programming processes identified by S. T. Klapp (1995, 2003) during the articulation of utterances differing in syllable and sequence complexity. According to S. T. Klapp, 1 process, INT, resolves the demands of the programmed unit, whereas a second process, SEQ, oversees the serial order demands of longer sequences. Method A modified reaction time paradigm was used to assess INT and SEQ demands. Specifically, syllable complexity was dependent on syllable structure, whereas sequence complexity involved either repeated or unique syllables within an utterance. Results INT execution was slowed when articulating single syllables in the form CCCV compared to simpler CV syllables. Planning unique syllables within a multisyllabic utterance rather than repetitions of the same syllable slowed INT but not SEQ. Conclusions The INT speech motor programming process, important for mental syllabary access, is sensitive to changes in both syllable structure and the number of unique syllables in an utterance. PMID:19474396

  20. RUCS: rapid identification of PCR primers for unique core sequences.

    PubMed

    Thomsen, Martin Christen Frølund; Hasman, Henrik; Westh, Henrik; Kaya, Hülya; Lund, Ole

    2017-12-15

    Designing PCR primers to target a specific selection of whole genome sequenced strains can be a long, arduous and sometimes impractical task. Such tasks would benefit greatly from an automated tool to both identify unique targets, and to validate the vast number of potential primer pairs for the targets in silico. Here we present RUCS, a program that will find PCR primer pairs and probes for the unique core sequences of a positive genome dataset complement to a negative genome dataset. The resulting primer pairs and probes are in addition to simple selection also validated through a complex in silico PCR simulation. We compared our method, which identifies the unique core sequences, against an existing tool called ssGeneFinder, and found that our method was 6.5-20 times more sensitive. We used RUCS to design primer pairs that would target a set of genomes known to contain the mcr-1 colistin resistance gene. Three of the predicted pairs were chosen for experimental validation using PCR and gel electrophoresis. All three pairs successfully produced an amplicon with the target length for the samples containing mcr-1 and no amplification products were produced for the negative samples. The novel methods presented in this manuscript can reduce the time needed to identify target sequences, and provide a quick virtual PCR validation to eliminate time wasted on ambiguously binding primers. Source code is freely available on https://bitbucket.org/genomicepidemiology/rucs. Web service is freely available on https://cge.cbs.dtu.dk/services/RUCS. Contact: mcft@cbs.dtu.dk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
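
    The notion of unique core sequences (present in every positive genome, absent from every negative genome) can be illustrated with k-mer sets. This is a simplified sketch under stated assumptions and not the RUCS implementation: the canonical k-mer representation, the function names, and the toy genomes are all chosen here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Represent a k-mer and its reverse complement by the same string."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, rc)


def kmer_set(genome, k):
    """All canonical k-mers of one genome sequence."""
    genome = genome.upper()
    return {canonical(genome[i:i + k]) for i in range(len(genome) - k + 1)}


def unique_core_kmers(positives, negatives, k=20):
    """k-mers shared by all positive genomes and found in no negative genome."""
    core = set.intersection(*(kmer_set(g, k) for g in positives))
    for genome in negatives:
        core -= kmer_set(genome, k)
    return core


# Toy example with very short sequences and k = 4.
positives = ["GATTACAGG", "CCGATTACA"]
negatives = ["TTTGATTTT"]
targets = unique_core_kmers(positives, negatives, k=4)
```

    Primer candidates would then be chosen from regions covered by the surviving k-mers; that selection and the in silico PCR validation described in the abstract are outside the scope of this sketch.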

  1. Giraffe genome sequence reveals clues to its unique morphology and physiology

    PubMed Central

    Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

    2016-01-01

    The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions. PMID:27187213

  2. DAMe: a toolkit for the initial processing of datasets with PCR replicates of double-tagged amplicons for DNA metabarcoding analyses.

    PubMed

    Zepeda-Mendoza, Marie Lisandra; Bohmann, Kristine; Carmona Baez, Aldo; Gilbert, M Thomas P

    2016-05-03

    DNA metabarcoding is an approach for identifying multiple taxa in an environmental sample using specific genetic loci and taxa-specific primers. When combined with high-throughput sequencing it enables the taxonomic characterization of large numbers of samples in a relatively time- and cost-efficient manner. One recent laboratory development is the addition of 5'-nucleotide tags to both primers producing double-tagged amplicons and the use of multiple PCR replicates to filter erroneous sequences. However, there is currently no available toolkit for the straightforward analysis of datasets produced in this way. We present DAMe, a toolkit for the processing of datasets generated by double-tagged amplicons from multiple PCR replicates derived from an unlimited number of samples. Specifically, DAMe can be used to (i) sort amplicons by tag combination, (ii) evaluate PCR replicates dissimilarity, and (iii) filter sequences derived from sequencing/PCR errors, chimeras, and contamination. This is attained by calculating the following parameters: (i) sequence content similarity between the PCR replicates from each sample, (ii) reproducibility of each unique sequence across the PCR replicates, and (iii) copy number of the unique sequences in each PCR replicate. We showcase the insights that can be obtained using DAMe prior to taxonomic assignment, by applying it to two real datasets that vary in their complexity regarding number of samples, sequencing libraries, PCR replicates, and used tag combinations. Finally, we use a third mock dataset to demonstrate the impact and importance of filtering the sequences with DAMe. DAMe allows the user-friendly manipulation of amplicons derived from multiple samples with PCR replicates built in a single or multiple sequencing libraries. 
It allows the user to: (i) collapse amplicons into unique sequences and sort them by tag combination while retaining the sample identifier and copy number information, (ii) identify sequences carrying unused tag combinations, (iii) evaluate the comparability of PCR replicates of the same sample, and (iv) filter tagged amplicons from a number of PCR replicates using parameters of minimum length, copy number, and reproducibility across the PCR replicates. This enables an efficient analysis of complex datasets, and ultimately increases the ease of handling datasets from large-scale studies.
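
    The filtering parameters named above (copy number within each PCR replicate and reproducibility across replicates) can be sketched as follows. This is not DAMe's code; the function name, the default thresholds, and the toy replicates are assumptions for illustration.

```python
from collections import Counter


def filter_sequences(replicates, min_copies=2, min_replicates=2):
    """Keep unique sequences that are sufficiently abundant in enough PCR replicates.

    replicates: one list of amplicon sequences per PCR replicate of a sample.
    Returns {sequence: summed copy number over the replicates that support it}.
    """
    support = Counter()  # number of replicates in which the sequence passes min_copies
    total = Counter()    # copies summed over those supporting replicates
    for reads in replicates:
        for seq, copies in Counter(reads).items():  # collapse to unique sequences
            if copies >= min_copies:
                support[seq] += 1
                total[seq] += copies
    return {seq: total[seq] for seq, n in support.items() if n >= min_replicates}


# Three PCR replicates of one sample (toy sequences).
rep1 = ["AAGT"] * 5 + ["CCGA"] * 3 + ["TTTT"]
rep2 = ["AAGT"] * 4 + ["CCGA"]
rep3 = ["AAGT"] * 2 + ["GGCA"] * 6
retained = filter_sequences([rep1, rep2, rep3])
```

    Here a sequence that is abundant in only one replicate is dropped along with singletons, so only the sequence reproduced across replicates is retained for taxonomic assignment.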

  3. Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52

    PubMed Central

    2018-01-01

    We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. PMID:29748396

  4. Novel Insights into Tree Biology and Genome Evolution as Revealed Through Genomics.

    PubMed

    Neale, David B; Martínez-García, Pedro J; De La Torre, Amanda R; Montanari, Sara; Wei, Xiao-Xin

    2017-04-28

    Reference genome sequences are the key to the discovery of genes and gene families that determine traits of interest. Recent progress in sequencing technologies has enabled a rapid increase in genome sequencing of tree species, allowing the dissection of complex characters of economic importance, such as fruit and wood quality and resistance to biotic and abiotic stresses. Although the number of reference genome sequences for trees lags behind those for other plant species, it is not too early to gain insight into the unique features that distinguish trees from nontree plants. Our review of the published data suggests that, although many gene families are conserved among herbaceous and tree species, some gene families, such as those involved in resistance to biotic and abiotic stresses and in the synthesis and transport of sugars, are often expanded in tree genomes. As the genomes of more tree species are sequenced, comparative genomics will further elucidate the complexity of tree genomes and how this relates to traits unique to trees.

  5. Generation of a total of 6483 expressed sequence tags from 60 day-old bovine whole fetus and fetal placenta.

    PubMed

    Oishi, M; Gohma, H; Lejukole, H Y; Taniguchi, Y; Yamada, T; Suzuki, K; Shinkai, H; Uenishi, H; Yasue, H; Sasaki, Y

    2004-05-01

    Expressed sequence tags (ESTs) generated based on characterization of clones isolated randomly from cDNA libraries are used to study gene expression profiles in specific tissues and to provide useful information for characterizing tissue physiology. In this study, two directionally cloned cDNA libraries were constructed from 60 day-old bovine whole fetus and fetal placenta. We have characterized 5357 and 1126 clones, and then identified 3464 and 795 unique sequences for the fetus and placenta cDNA libraries: 1851 and 504 showed homology to already identified genes, and 1613 and 291 showed no significant matches to any of the sequences in DNA databases, respectively. Further, we found 94 unique sequences overlapping in both the fetus and the placenta, leading to a catalog of 4165 genes expressed in 60 day-old fetus and placenta. The catalog is used to examine the expression profiles of genes in 60 day-old bovine fetus and placenta.
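
    The counts reported above are mutually consistent, which a short check makes explicit. The variable names are chosen here for illustration; the numbers are those given in the abstract.

```python
# Unique sequences per library, as reported in the abstract.
fetus_unique = 3464
placenta_unique = 795
shared = 94  # unique sequences found in both libraries

# Each library total is the sum of its known-gene and no-match sequences.
assert 1851 + 1613 == fetus_unique
assert 504 + 291 == placenta_unique

# Sequences present in both libraries are counted only once in the catalog.
catalog = fetus_unique + placenta_unique - shared
print(catalog)  # 4165
```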

  6. UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy

    PubMed Central

    2017-01-01

    Unique Molecular Identifiers (UMIs) are random oligonucleotide barcodes that are increasingly used in high-throughput sequencing experiments. Through a UMI, identical copies arising from distinct molecules can be distinguished from those arising through PCR amplification of the same molecule. However, bioinformatic methods to leverage the information from UMIs have yet to be formalized. In particular, sequencing errors in the UMI sequence are often ignored or else resolved in an ad hoc manner. We show that errors in the UMI sequence are common and introduce network-based methods to account for these errors when identifying PCR duplicates. Using these methods, we demonstrate improved quantification accuracy both under simulated conditions and real iCLIP and single-cell RNA-seq data sets. Reproducibility between iCLIP replicates and single-cell RNA-seq clustering are both improved using our proposed network-based method, demonstrating the value of properly accounting for errors in UMIs. These methods are implemented in the open source UMI-tools software package. PMID:28100584
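
    The network idea can be illustrated with a deliberately simplified sketch: UMIs observed at one mapping position are linked when they differ by a single base, and each connected group is counted as one original molecule. This is not the UMI-tools source code and does not reproduce its specific methods; the function names and the toy counts are assumptions.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def group_umis(umi_counts, max_dist=1):
    """Group UMIs seen at one position into connected components.

    umi_counts: {umi: read count}. UMIs within max_dist mismatches are linked.
    Each returned group starts with its most abundant UMI.
    """
    umis = list(umi_counts)
    neighbours = {u: set() for u in umis}
    for a, b in combinations(umis, 2):
        if hamming(a, b) <= max_dist:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen, groups = set(), []
    # Visit UMIs from most to least abundant so each group is seeded by its likely true UMI.
    for umi in sorted(umis, key=umi_counts.get, reverse=True):
        if umi in seen:
            continue
        seen.add(umi)
        stack, group = [umi], []
        while stack:
            node = stack.pop()
            group.append(node)
            for other in neighbours[node] - seen:
                seen.add(other)
                stack.append(other)
        groups.append(group)
    return groups


# Toy counts: ATTC and ATGC look like sequencing errors derived from ATTG.
umi_counts = {"ATTG": 100, "ATTC": 3, "ATGC": 1, "CCGA": 40}
groups = group_umis(umi_counts)
```

    Treating every distinct UMI as a separate molecule would report four molecules here, while the grouped estimate is two, which shows how ignoring UMI errors inflates counts.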

  7. Highly conserved intragenic HSV-2 sequences: Results from next-generation sequencing of HSV-2 UL and US regions from genital swabs collected from 3 continents.

    PubMed

    Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M

    2017-10-01

    Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10^7 log10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (UL_US) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated UL_US segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Molecular identification and characterization of clustered regularly interspaced short palindromic repeats (CRISPRs) in a urease-positive thermophilic Campylobacter sp. (UPTC).

    PubMed

    Tasaki, E; Hirayama, J; Tazumi, A; Hayashi, K; Hara, Y; Ueno, H; Moore, J E; Millar, B C; Matsuda, M

    2012-02-01

    A novel clustered regularly-interspaced short palindromic repeats (CRISPRs) locus [7,500 base pairs (bp) in length] occurred in the urease-positive thermophilic Campylobacter (UPTC) Japanese isolate, CF89-12. The 7,500 bp gene locus consisted of the 5'-methylaminomethyl-2-thiouridylate methyltransferase gene, putative (P) CRISPR associated (p-Cas), putative open reading frames, Cas1 and Cas2, leader sequence region (146 bp), 12 CRISPRs consensus sequence repeats (each 36 bp) separated by a non-repetitive unique spacer region of similar length (26-31 bp) and the phosphatidyl glycerophosphatase A gene. When the CRISPRs loci in the UPTC CF89-12 and five C. jejuni isolates were compared with one another, these six isolates contained p-Cas, Cas1 and Cas2 within the loci. Four to 12 CRISPRs consensus sequence repeats separated by a non-repetitive unique spacer region occurred in six isolates and the nucleotide sequences of those repeats gave approximately 92-100% similarity with each other. However, no sequence similarity occurred in the unique spacer regions among these isolates. The putative σ(70) transcriptional promoter and the hypothetical ρ-independent terminator structures for the CRISPRs and Cas were detected. No in vivo transcription of p-Cas, Cas1 and Cas2 was confirmed in the UPTC cells.

  9. Structural analysis of a set of proteins resulting from a bacterial genomics project.

    PubMed

    Badger, J; Sauder, J M; Adams, J M; Antonysamy, S; Bain, K; Bergseid, M G; Buchanan, S G; Buchanan, M D; Batiyenko, Y; Christopher, J A; Emtage, S; Eroshkina, A; Feil, I; Furlong, E B; Gajiwala, K S; Gao, X; He, D; Hendle, J; Huber, A; Hoda, K; Kearins, P; Kissinger, C; Laubert, B; Lewis, H A; Lin, J; Loomis, K; Lorimer, D; Louie, G; Maletic, M; Marsh, C D; Miller, I; Molinari, J; Muller-Dieckmann, H J; Newman, J M; Noland, B W; Pagarigan, B; Park, F; Peat, T S; Post, K W; Radojicic, S; Ramos, A; Romero, R; Rutter, M E; Sanderson, W E; Schwinn, K D; Tresser, J; Winhoven, J; Wright, T A; Wu, L; Xu, J; Harris, T J R

    2005-09-01

    The targets of the Structural GenomiX (SGX) bacterial genomics project were proteins conserved in multiple prokaryotic organisms with no obvious sequence homolog in the Protein Data Bank of known structures. The outcome of this work was 80 structures, covering 60 unique sequences and 49 different genes. Experimental phase determination from proteins incorporating Se-Met was carried out for 45 structures with most of the remainder solved by molecular replacement using members of the experimentally phased set as search models. An automated tool was developed to deposit these structures in the Protein Data Bank, along with the associated X-ray diffraction data (including refined experimental phases) and experimentally confirmed sequences. BLAST comparisons of the SGX structures with structures that had appeared in the Protein Data Bank over the intervening 3.5 years since the SGX target list had been compiled identified homologs for 49 of the 60 unique sequences represented by the SGX structures. This result indicates that, for bacterial structures that are relatively easy to express, purify, and crystallize, the structural coverage of gene space is proceeding rapidly. More distant sequence-structure relationships between the SGX and PDB structures were investigated using PDB-BLAST and Combinatorial Extension (CE). Only one structure, SufD, has a truly unique topology compared to all folds in the PDB. Copyright 2005 Wiley-Liss, Inc.

  10. Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.

    PubMed Central

    Barnes, W M; Bevan, M

    1983-01-01

    A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723

  11. Application of combinatorial biocatalysis for a unique ring expansion of dihydroxymethylzearalenone

    USDA-ARS's Scientific Manuscript database

    Combinatorial biocatalysis was applied to generate a diverse set of dihydroxymethylzearalenone derivatives with modified ring structure. In one chemoenzymatic reaction sequence, dihydroxymethylzearalenone was first subjected to a unique enzyme-catalyzed oxidative ring opening reaction that creates ...

  12. A cricket Gene Index: a genomic resource for studying neurobiology, speciation, and molecular evolution

    PubMed Central

    Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L

    2007-01-01

    Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution. The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168

  13. Hydraulic fracturing and the Crooked Lake Sequences: Insights gleaned from regional seismic networks

    NASA Astrophysics Data System (ADS)

    Schultz, Ryan; Stern, Virginia; Novakovic, Mark; Atkinson, Gail; Gu, Yu Jeffrey

    2015-04-01

    Within central Alberta, Canada, a new sequence of earthquakes has been recognized as of 1 December 2013 in a region of previous seismic quiescence near Crooked Lake, ~30 km west of the town of Fox Creek. We utilize a cross-correlation detection algorithm to detect more than 160 events to the end of 2014, which are temporally distinguished into five subsequences. This observation is corroborated by the uniqueness of waveforms clustered by subsequence. The Crooked Lake Sequences have come under scrutiny due to their strong temporal correlation (>99.99%) to the timing of hydraulic fracturing operations in the Duvernay Formation. We assert that individual subsequences are related to fracturing stimulation and, despite adverse initial station geometry, double-difference techniques allow us to spatially relate each cluster back to a unique horizontal well. Overall, we find that seismicity in the Crooked Lake Sequences is consistent with first-order observations of hydraulic fracturing induced seismicity.
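
The cross-correlation detection mentioned in this abstract can be illustrated with a minimal template-matching sketch. It is not the authors' algorithm: the function names, the single-trace input, and the 0.8 threshold are assumptions made here for illustration, and real detectors work on filtered, multi-station waveforms.

```python
import math


def normalized_cross_correlation(template, window):
    """Pearson-style correlation between a template and an equally long window."""
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    numerator = sum((t - mean_t) * (w - mean_w) for t, w in zip(template, window))
    energy_t = sum((t - mean_t) ** 2 for t in template)
    energy_w = sum((w - mean_w) ** 2 for w in window)
    denominator = math.sqrt(energy_t * energy_w)
    # A flat window has no shape to compare, so report zero correlation.
    return numerator / denominator if denominator > 0 else 0.0


def detect_events(trace, template, threshold=0.8):
    """Slide the template along the trace; return (start_index, correlation) hits."""
    n = len(template)
    hits = []
    for start in range(len(trace) - n + 1):
        cc = normalized_cross_correlation(template, trace[start:start + n])
        if cc >= threshold:  # threshold is a hypothetical choice
            hits.append((start, cc))
    return hits
```

Because the correlation is normalized, a repeat of the template waveform is detected regardless of its amplitude, which is what makes such a detector useful for small events resembling a known one.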

  14. Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

    PubMed Central

    2010-01-01

    Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84 were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644

  15. Synthetic oligonucleotide probes deduced from amino acid sequence data. Theoretical and practical considerations.

    PubMed

    Lathe, R

    1985-05-05

    Synthetic probes deduced from amino acid sequence data are widely used to detect cognate coding sequences in libraries of cloned DNA segments. The redundancy of the genetic code dictates that a choice must be made between (1) a mixture of probes reflecting all codon combinations, and (2) a single longer "optimal" probe. The second strategy is examined in detail. The frequency of sequences matching a given probe by chance alone can be determined and also the frequency of sequences closely resembling the probe and contributing to the hybridization background. Gene banks cannot be treated as random associations of the four nucleotides, and probe sequences deduced from amino acid sequence data occur more often than predicted by chance alone. Probe lengths must be increased to confer the necessary specificity. Examination of hybrids formed between unique homologous probes and their cognate targets reveals that short stretches of perfect homology occurring by chance make a significant contribution to the hybridization background. Statistical methods for improving homology are examined, taking human coding sequences as an example, and considerations of codon utilization and dinucleotide frequencies yield an overall homology of greater than 82%. Recommendations for probe design and hybridization are presented, and the choice between using multiple probes reflecting all codon possibilities and a unique optimal probe is discussed.
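
The chance-match frequency discussed in this abstract can be illustrated under the simplest possible model, the one the abstract itself cautions against: a library treated as a random association of four equally frequent nucleotides. The sketch below is an illustration under that assumption, not Lathe's calculation; the function name and the example library size are hypothetical.

```python
def expected_chance_matches(probe_length: int, library_bases: int, strands: int = 2) -> float:
    """Expected number of perfect matches to one probe in random sequence.

    Assumes each base occurs with probability 1/4 and positions are
    independent, so a given probe matches at any position with
    probability (1/4) ** probe_length.
    """
    if probe_length <= 0:
        raise ValueError("probe_length must be positive")
    positions = max(library_bases - probe_length + 1, 0) * strands
    return positions * 0.25 ** probe_length


if __name__ == "__main__":
    # Hypothetical library of 3e9 bases, both strands searched.
    for length in (14, 17, 20):
        print(length, expected_chance_matches(length, 3_000_000_000))
```

Since real coding sequences are not random, a figure computed this way underestimates the true background, which is consistent with the abstract's recommendation to increase probe length.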

  16. Rotary pin-in-maze discriminator

    DOEpatents

    Benavides, Gilbert L.

    1997-01-01

    A discriminator apparatus and method that discriminates between a unique signal and any other (incorrect) signal. The unique signal is a sequence of events; each event can assume one of two possible event states. Given the unique signal, a maze wheel is allowed to rotate fully in one direction. Given an incorrect signal, both the maze wheel and a pin wheel lock in position.
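
The accept-or-lock behavior described in this abstract can be mimicked in software as a small state machine. This is only an analogy to the mechanical device: the class name, the 0/1 event encoding, and the choice that a locked or unlocked state is final are assumptions made for this sketch.

```python
class SequenceDiscriminator:
    """Toy model: accepts exactly one sequence of two-state events."""

    def __init__(self, unique_signal):
        self.unique_signal = tuple(unique_signal)
        if not self.unique_signal or any(e not in (0, 1) for e in self.unique_signal):
            raise ValueError("unique_signal must be a non-empty sequence of 0/1 events")
        self.position = 0    # how many correct events have been received
        self.locked = False  # set once any incorrect event arrives

    @property
    def state(self):
        if self.locked:
            return "locked"
        if self.position == len(self.unique_signal):
            return "unlocked"
        return "in progress"

    def feed(self, event):
        """Process one event and return the resulting state."""
        if self.state != "in progress":
            return self.state  # final states do not change (assumption)
        if event == self.unique_signal[self.position]:
            self.position += 1
        else:
            self.locked = True
        return self.state
```

Feeding the events 1, 0, 1, 1 to `SequenceDiscriminator([1, 0, 1, 1])` ends in the "unlocked" state, while any deviation along the way ends in "locked".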

  17. Genome sequence of an aflatoxigenic pathogen of Argentinian peanut, Aspergillus arachidicola

    USDA-ARS's Scientific Manuscript database

    In this study we sequenced the genome of the A. arachidicola Type strain (CBS 117610) and found its genome size to be 38.9 Mb, and its number of predicted genes to be 12,091, which are values comparable to those in other sequenced Aspergilli. Of its predicted genes, 691 were identified as unique to ...

  18. Complete Genome Sequences of Bacillus Phages Janet and OTooleKemple52.

    PubMed

    Kent, Brenna; Raymond, Thomas; Mosier, Philip D; Johnson, Allison A

    2018-05-10

    We report here the genome sequences of two novel Bacillus cereus group-infecting bacteriophages, Janet and OTooleKemple52. These bacteriophages are double-stranded DNA-containing Myoviridae isolated from soil samples. While their genomes share a high degree of sequence identity with one another, their host preferences are unique. Copyright © 2018 Kent et al.

  19. Microgravity

    NASA Image and Video Library

    1998-12-01

    Type II restriction enzymes, such as EcoRI endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of sequence specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals

  20. Protein Crystal EcoRI Endonuclease-DNA Complex

    NASA Technical Reports Server (NTRS)

    1998-01-01

    Type II restriction enzymes, such as EcoRI endonuclease, present a unique advantage for the study of sequence-specific recognition because they leave a record of where they have been in the form of the cleaved ends of the DNA sites where they were bound. The differential behavior of a sequence-specific protein at sites of differing base sequence is the essence of sequence specificity; the core question is how these proteins discriminate between different DNA sequences, especially when the two sequences are very similar. Principal Investigator: Dan Carter/New Century Pharmaceuticals

  1. Transposon Variants and Their Effects on Gene Expression in Arabidopsis

    PubMed Central

    Wang, Xi; Weigel, Detlef; Smith, Lisa M.

    2013-01-01

    Transposable elements (TEs) make up the majority of many plant genomes. Their transcription and transposition is controlled through siRNAs and epigenetic marks including DNA methylation. To dissect the interplay of siRNA–mediated regulation and TE evolution, and to examine how TE differences affect nearby gene expression, we investigated genome-wide differences in TEs, siRNAs, and gene expression among three Arabidopsis thaliana accessions. Both TE sequence polymorphisms and presence of linked TEs are positively correlated with intraspecific variation in gene expression. The expression of genes within 2 kb of conserved TEs is more stable than that of genes next to variant TEs harboring sequence polymorphisms. Polymorphism levels of TEs and closely linked adjacent genes are positively correlated as well. We also investigated the distribution of 24-nt-long siRNAs, which mediate TE repression. TEs targeted by uniquely mapping siRNAs are on average farther from coding genes, apparently because they more strongly suppress expression of adjacent genes. Furthermore, siRNAs, and especially uniquely mapping siRNAs, are enriched in TE regions missing in other accessions. Thus, targeting by uniquely mapping siRNAs appears to promote sequence deletions in TEs. Overall, our work indicates that siRNA–targeting of TEs may influence removal of sequences from the genome and hence evolution of gene expression in plants. PMID:23408902

  2. Computational approaches for identification of conserved/unique binding pockets in the A chain of ricin

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ecale Zhou, C L; Zemla, A T; Roe, D

    2005-01-29

    Specific and sensitive ligand-based protein detection assays that employ antibodies or small molecules such as peptides or aptamers require that the corresponding surface region of the protein be accessible and that there be minimal cross-reactivity with non-target proteins. To reduce the time and cost of laboratory screening efforts for diagnostic reagents, we developed new methods for evaluating and selecting protein surface regions for ligand targeting. We devised combined structure- and sequence-based methods for identifying 3D epitopes and binding pockets on the surface of the A chain of ricin that are conserved with respect to a set of ricin A chains and unique with respect to other proteins. We (1) used structure alignment software to detect structural deviations and extracted from this analysis the residue-residue correspondence, (2) devised a method to compare corresponding residues across sets of ricin structures and structures of closely related proteins, (3) devised a sequence-based approach to determine residue infrequency in local sequence context, and (4) modified a pocket-finding algorithm to identify surface crevices in close proximity to residues determined to be conserved/unique based on our structure- and sequence-based methods. In applying this combined informatics approach to ricin A we identified a conserved/unique pocket in close proximity to (but not overlapping) the active site that is suitable for bi-dentate ligand development. These methods are generally applicable to identification of surface epitopes and binding pockets for development of diagnostic reagents, therapeutics, and vaccines.

  3. DNA Barcoding in the Cycadales: Testing the Potential of Proposed Barcoding Markers for Species Identification of Cycads

    PubMed Central

    Sass, Chodon; Little, Damon P.; Stevenson, Dennis Wm.; Specht, Chelsea D.

    2007-01-01

    Barcodes are short segments of DNA that can be used to uniquely identify an unknown specimen to species, particularly when diagnostic morphological features are absent. These sequences could offer a new forensic tool in plant and animal conservation—especially for endangered species such as members of the Cycadales. Ideally, barcodes could be used to positively identify illegally obtained material even in cases where diagnostic features have been purposefully removed or to release confiscated organisms into the proper breeding population. In order to be useful, a DNA barcode sequence must not only easily PCR amplify with universal or near-universal reaction conditions and primers, but also contain enough variation to generate unique identifiers at either the species or population levels. Chloroplast regions suggested by the Plant Working Group of the Consortium for the Barcode of Life (CBoL), and two alternatives, the chloroplast psbA-trnH intergenic spacer and the nuclear ribosomal internal transcribed spacer (nrITS), were tested for their utility in generating unique identifiers for members of the Cycadales. Ease of amplification and sequence generation with universal primers and reaction conditions was determined for each of the seven proposed markers. While none of the proposed markers provided unique identifiers for all species tested, nrITS showed the most promise in terms of variability, although sequencing difficulties remain a drawback. We suggest a workflow for DNA barcoding, including database generation and management, which will ultimately be necessary if we are to succeed in establishing a universal DNA barcode for plants. PMID:17987130
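
The criterion that a marker should "generate unique identifiers" for every species can be checked mechanically once one sequence per species is available. The sketch below is a simplified illustration, not the authors' workflow: it assumes exactly one barcode sequence per species and treats only identical sequences as collisions; the function name and input layout are hypothetical.

```python
from collections import defaultdict


def species_without_unique_barcode(barcodes):
    """Return species whose barcode sequence is shared with another species.

    `barcodes` maps a species name to its marker sequence (one per species,
    an assumption of this sketch).
    """
    by_sequence = defaultdict(set)
    for species, sequence in barcodes.items():
        by_sequence[sequence.upper()].add(species)
    return sorted(
        species
        for group in by_sequence.values()
        if len(group) > 1
        for species in group
    )
```

An empty result would mean the marker separates every species in the set; a non-empty one lists the species the marker cannot tell apart.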

  4. MySSP: Non-stationary evolutionary sequence simulation, including indels

    PubMed Central

    Rosenberg, Michael S.

    2007-01-01

    MySSP is a new program for the simulation of DNA sequence evolution across a phylogenetic tree. Although many programs are available for sequence simulation, MySSP is unique in its inclusion of indels, flexibility in allowing for non-stationary patterns, and output of ancestral sequences. Some of these features can individually be found in existing programs, but not all have previously been available in a single package. PMID:19325855

  5. Degree sequence in message transfer

    NASA Astrophysics Data System (ADS)

    Yamuna, M.

    2017-11-01

    Message encryption is always an issue in current communication scenario. Methods are being devised using various domains. Graphs satisfy numerous unique properties which can be used for message transfer. In this paper, I propose a message encryption method based on degree sequence of graphs.

  6. Full genome sequence of Rocio virus reveal substantial variations from the prototype Rocio virus SPH 34675 sequence.

    PubMed

    Setoh, Yin Xiang; Amarilla, Alberto A; Peng, Nias Y; Slonchak, Andrii; Periasamy, Parthiban; Figueiredo, Luiz T M; Aquino, Victor H; Khromykh, Alexander A

    2018-01-01

    Rocio virus (ROCV) is an arbovirus belonging to the genus Flavivirus, family Flaviviridae. We present an updated sequence of ROCV strain SPH 34675 (GenBank: AY632542.4), the only available full genome sequence prior to this study. Using next-generation sequencing of the entire genome, we reveal substantial sequence variation from the prototype sequence, with 30 nucleotide differences amounting to 14 amino acid changes, as well as significant changes to predicted 3'UTR RNA structures. Our results present an updated and corrected sequence of a potential emerging human-virulent flavivirus uniquely indigenous to Brazil (GenBank: MF461639).

  7. Rotary pin-in-maze discriminator

    DOEpatents

    Benavides, G.L.

    1997-05-06

    A discriminator apparatus and method that discriminates between a unique signal and any other (incorrect) signal are disclosed. The unique signal is a sequence of events; each event can assume one of two possible event states. Given the unique signal, a maze wheel is allowed to rotate fully in one direction. Given an incorrect signal, both the maze wheel and a pin wheel lock in position. 4 figs.

  8. nuID: a universal naming scheme of oligonucleotides for Illumina, Affymetrix, and other microarrays

    PubMed Central

    Du, Pan; Kibbe, Warren A; Lin, Simon M

    2007-01-01

    Background Oligonucleotide probes that are sequence identical may have different identifiers between manufacturers and even between different versions of the same company's microarray; and sometimes the same identifier is reused and represents a completely different oligonucleotide, resulting in ambiguity and potentially mis-identification of the genes hybridizing to that probe. Results We have devised a unique, non-degenerate encoding scheme that can be used as a universal representation to identify an oligonucleotide across manufacturers. We have named the encoded representation 'nuID', for nucleotide universal identifier. Inspired by the fact that the raw sequence of the oligonucleotide is the true definition of identity for a probe, the encoding algorithm uniquely and non-degenerately transforms the sequence itself into a compact identifier (a lossless compression). In addition, we added a redundancy check (checksum) to validate the integrity of the identifier. These two steps, encoding plus checksum, result in an nuID, which is a unique, non-degenerate, permanent, robust and efficient representation of the probe sequence. For commercial applications that require the sequence identity to be confidential, we have an encryption schema for nuID. We demonstrate the utility of nuIDs for the annotation of Illumina microarrays, and we believe it has universal applicability as a source-independent naming convention for oligomers. Reviewers This article was reviewed by Itai Yanai, Rong Chen (nominated by Mark Gerstein), and Gregory Schuler (nominated by David Lipman). PMID:17540033
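
The two steps named in this abstract, a lossless sequence encoding plus a checksum, can be sketched as follows. This is not the published nuID algorithm: the 64-character alphabet, the three-bases-per-character packing, the padding rule, and the modulo-21 checksum are all assumptions chosen here to make a small, reversible example.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_."
BASES = "ACGT"


def encode(sequence: str) -> str:
    """Pack three bases per character and prefix a checksum character."""
    seq = sequence.upper()
    if any(base not in BASES for base in seq):
        raise ValueError("sequence may contain only A, C, G, T")
    pad = (-len(seq)) % 3            # number of 'A' bases appended (0, 1 or 2)
    padded = seq + "A" * pad
    values = [
        BASES.index(padded[i]) * 16 + BASES.index(padded[i + 1]) * 4 + BASES.index(padded[i + 2])
        for i in range(0, len(padded), 3)
    ]
    checksum = sum(values) % 21      # 0..20, so checksum * 3 + pad stays below 64
    return ALPHABET[checksum * 3 + pad] + "".join(ALPHABET[v] for v in values)


def decode(identifier: str) -> str:
    """Recover the original sequence, raising ValueError on a checksum mismatch."""
    checksum, pad = divmod(ALPHABET.index(identifier[0]), 3)
    values = [ALPHABET.index(ch) for ch in identifier[1:]]
    if sum(values) % 21 != checksum:
        raise ValueError("checksum mismatch: identifier is corrupted")
    bases = "".join(BASES[v // 16] + BASES[(v // 4) % 4] + BASES[v % 4] for v in values)
    return bases[: len(bases) - pad]
```

Because the identifier is derived only from the sequence, two vendors encoding the same probe would obtain the same identifier, and the checksum catches many, though not all, transcription errors.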

  9. Cloning of the anhidrotic ectodermal dysplasia gene: Identification of cDNAs associated with CpG islands mapped near translocation breakpoint in two female patients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Srivastava, A.K.; Schlessinger, D.; Kere, J.

    1994-09-01

    The gene for the X chromosomal developmental disorder anhidrotic ectodermal dysplasia (EDA) has been mapped to Xq12-q13 by linkage analysis and is expressed in a few females with chromosomal translocations involving band Xq12-q13. A yeast artificial chromosome (YAC) contig (2.0 Mb) spanning two translocation breakpoints has been assembled by sequence-tagged site (STS)-based chromosomal walking. The two translocation breakpoints (X:autosome translocations from the affected female patients) have been mapped less than 60 kb apart within a YAC contig. Unique probes and intragenic STSs (mapped between the two translocations) have been developed and a somatic cell hybrid carrying the translocated X chromosome from the AK patient has been analyzed by isolating unique probes that span the breakpoint. Several STSs made from intragenic sequences have been found to be conserved in mouse, hamster and monkey, but we have detected no mRNAs in a number of tissues tested. However, a probe and STS developed from the DNA spanning the AK breakpoint is conserved in mouse, hamster and monkey, and we have detected expressed sequences in skin cells and cDNA libraries. In addition, unique sequences have been obtained from two CpG islands in the region that maps proximal to the breakpoints. cDNAs containing these sequences are being studied as candidates for the gene affected in the etiology of EDA.

  10. Recombinatorial biases and convergent recombination determine interindividual TCRβ sharing in murine thymocytes.

    PubMed

    Li, Hanjie; Ye, Congting; Ji, Guoli; Wu, Xiaohui; Xiang, Zhe; Li, Yuanyue; Cao, Yonghao; Liu, Xiaolong; Douek, Daniel C; Price, David A; Han, Jiahuai

    2012-09-01

    Overlap of TCR repertoires among individuals provides the molecular basis for public T cell responses. By deep-sequencing the TCRβ repertoires of CD4+CD8+ thymocytes from three individual mice, we observed that a substantial degree of TCRβ overlap, comprising ∼10-15% of all unique amino acid sequences and ∼5-10% of all unique nucleotide sequences across any two individuals, is already present at this early stage of T cell development. The majority of TCRβ sharing between individual thymocyte repertoires could be attributed to the process of convergent recombination, with additional contributions likely arising from recombinatorial biases; the role of selection during intrathymic development was negligible. These results indicate that the process of TCR gene recombination is the major determinant of clonotype sharing between individuals.
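
The overlap percentages in this abstract are fractions of unique sequences shared between two repertoires. The abstract does not state how the percentages were normalized, so the sketch below assumes one common choice, shared unique sequences divided by all unique sequences in either repertoire (the Jaccard index); the function name and the toy data are hypothetical.

```python
def shared_fraction(repertoire_a, repertoire_b):
    """Fraction of unique sequences found in both repertoires (Jaccard index)."""
    unique_a, unique_b = set(repertoire_a), set(repertoire_b)
    union = unique_a | unique_b
    if not union:
        return 0.0
    return len(unique_a & unique_b) / len(union)


if __name__ == "__main__":
    # Toy clonotype lists; repeated reads collapse to unique sequences.
    mouse_1 = ["CASSL", "CASSL", "CASRG", "CAWSL"]
    mouse_2 = ["CASSL", "CASGD"]
    print(shared_fraction(mouse_1, mouse_2))
```

Applying the same function to amino acid and to nucleotide sequences separately would give the two kinds of figures reported; several nucleotide sequences can encode one amino acid sequence, which is why the amino acid overlap can be the larger of the two.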

  11. The Physics and Mathematics of MRI

    NASA Astrophysics Data System (ADS)

    Ansorge, Richard; Graves, Martin

    2016-10-01

    Magnetic Resonance Imaging is a very important clinical imaging tool. It combines different fields of physics and engineering in a uniquely complex way. MRI is also surprisingly versatile: 'pulse sequences' can be designed to yield many different types of contrast. This versatility is unique to MRI. This short book gives an in-depth account of the methods used for the operation and construction of modern MRI systems, as well as the principles of sequence design and many examples of applications. An important additional feature of this book is the detailed discussion of the mathematical principles used in building optimal MRI systems and for sequence design. The mathematical discussion is very suitable for undergraduates attending medical physics courses. It is also more complete than usually found in alternative books for physical scientists or more clinically orientated works.

  12. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis

    PubMed Central

    2012-01-01

    Background Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them are assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns are captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand. PMID:22276739
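
The arithmetic behind pair-barcoding is that 4 forward and 8 reverse barcodes give 4 × 8 = 32 distinguishable samples from only 12 barcode sequences. The sketch below shows that lookup under stated assumptions: the barcode sequences are invented, and the read layout (forward barcode at the start, reverse barcode at the end of one string) is a simplification that does not reflect how SOLiD reads are actually delivered.

```python
from itertools import product

# Hypothetical barcode sets; the real sequences are not given in the abstract.
FORWARD = ["ACGT", "CATG", "GTCA", "TGAC"]
REVERSE = ["AACC", "CCAA", "GGTT", "TTGG", "AGAG", "CTCT", "GAGA", "TCTC"]

# Every (forward, reverse) pair identifies one sample, numbered from 1.
SAMPLE_OF = {pair: number for number, pair in enumerate(product(FORWARD, REVERSE), start=1)}


def assign_sample(read, forward_length=4, reverse_length=4):
    """Return the sample number for a read, or None if either barcode is unknown."""
    forward = read[:forward_length]
    reverse = read[-reverse_length:]
    return SAMPLE_OF.get((forward, reverse))
```

Reads for which `assign_sample` returns `None` correspond to the share of reads that could not be assigned to both barcodes.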

  13. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.

    PubMed

    Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong

    2012-01-25

    Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them are assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns are captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand.

  14. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Torella, JP; Lienert, F; Boehm, CR

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  15. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  16. Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

    PubMed Central

    Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

    2010-01-01

    Olive (Olea europaea L.), which originated in the Near East region, is an important source of edible oil. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs showed no homology to the database, 2024 ESTs had homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI GenBank. A total of 635 unique EST gene sequences, not previously described in the Olea family, were identified by over 80% homology to genes of known function in other species. Only 3.1% of the total ESTs showed similarity with the olive data existing in NCBI. The EST data and consensus sequences generated here were submitted to NCBI as a valuable resource for functional genomic studies of olive. PMID:21197085

  17. The first genome sequence of a metatherian herpesvirus: Macropodid herpesvirus 1.

    PubMed

    Vaz, Paola K; Mahony, Timothy J; Hartley, Carol A; Fowler, Elizabeth V; Ficorilli, Nino; Lee, Sang W; Gilkerson, James R; Browning, Glenn F; Devlin, Joanne M

    2016-01-22

    While many placental herpesvirus genomes have been fully sequenced, the complete genome of a marsupial herpesvirus has not been described. Here we present the first genome sequence of a metatherian herpesvirus, Macropodid herpesvirus 1 (MaHV-1). The MaHV-1 viral genome was sequenced using an Illumina MiSeq sequencer, de novo assembly was performed and the genome was annotated. The MaHV-1 genome was 140 kbp in length and clustered phylogenetically with the primate simplexviruses, sharing 67% nucleotide sequence identity with Human herpesviruses 1 and 2. The MaHV-1 genome contained 66 predicted open reading frames (ORFs) homologous to those in other herpesvirus genomes, but lacked homologues of UL3, UL4, UL56 and glycoprotein J. This is the first alphaherpesvirus genome that has been found to lack the UL3 and UL4 homologues. We identified six novel ORFs and confirmed their transcription by RT-PCR. This is the first genome sequence of a herpesvirus that infects metatherians, a taxonomically unique mammalian clade. Members of the Simplexvirus genus are remarkably conserved, so the absence of ORFs otherwise retained in eutherian and avian alphaherpesviruses contributes to our understanding of the Alphaherpesvirinae. Further study of metatherian herpesvirus genetics and pathogenesis provides a unique approach to understanding herpesvirus-mammalian interactions.

  18. DNABIT Compress - Genome compression algorithm.

    PubMed

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-22

    Data compression is concerned with how information is organized in data. Efficient storage means removing redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm for DNA sequences, "DNABIT Compress", based on a novel scheme of assigning binary bits to smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences of larger genomes. Significantly better compression results show that the "DNABIT Compress" algorithm is the best among the other compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (unique bit codes) to fragments of DNA sequence (exact repeats, reverse repeats) is also a unique concept introduced in this algorithm for the first time in DNA compression. The proposed algorithm achieves a compression ratio as low as 1.58 bits/base, whereas the best existing methods could not achieve a ratio below 1.72 bits/base.
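
    For context on the bits/base figures quoted in this abstract, the sketch below shows plain fixed-width 2-bit packing, in which every base costs exactly 2 bits. This is not the DNABIT Compress algorithm, whose bit codes for repeats are not specified here; the code table and function names are assumptions chosen for illustration. The ratios reported above (1.58 and 1.72 bits/base) are improvements on this 2 bits/base baseline.

```python
# Assumed code table for illustration; any fixed 2-bit assignment works.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i holds the base whose code is i


def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte.

    Returns the packed bytes and the original length, which is needed
    to discard padding when unpacking.
    """
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out), len(seq)


def unpack(data, length):
    """Invert pack(): decode four bases per byte and trim the padding."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:length])
```

    A fixed-width code like this ignores repeats entirely, which is why schemes that assign codes to repeated fragments can get below 2 bits/base.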

  19. Fabrication of a New Lineage of Artificial Luciferases from Natural Luciferase Pools.

    PubMed

    Kim, Sung Bae; Nishihara, Ryo; Citterio, Daniel; Suzuki, Koji

    2017-09-11

    The fabrication of artificial luciferases (ALucs) with unique optical properties has a fundamental impact on bioassays and molecular imaging. In this study, we developed a new lineage of ALucs with unique substrate preferences by extracting consensus amino acids from the alignment of 25 copepod luciferase sequences available in natural luciferase pools. The primary sequence was first created with a sequence logo generator, resulting in a total of 11 sibling sequences. Phylogenetic analysis shows that the newly fabricated ALucs form an independent branch, genetically isolated from the natural luciferases and from a prior series of ALucs produced by our laboratory using a smaller basis set. The new lineage of ALucs was strongly luminescent in living mammalian cells, with specific substrate selectivity for native coelenterazine. A single-residue-level comparison of the C-terminal sequences of the new ALucs reveals that some amino acids in the C-terminal ends greatly influence the optical intensities but have limited effect on the color variance. The success of this approach provides guidance on how to engineer and functionalize marine luciferases for bioluminescence imaging and assays.
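
    The step of extracting consensus amino acids from an alignment can be sketched as a per-column majority vote. This is an assumption for illustration only: the abstract says the primary sequence came from a sequence logo generator and does not describe the exact rule, and the function name and gap symbol below are hypothetical.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the most common non-gap residue in each alignment column.

    `aligned` is a list of equal-length strings. A column made up only
    of gaps yields the gap symbol.
    """
    if len({len(s) for s in aligned}) > 1:
        raise ValueError("aligned sequences must have equal length")
    result = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)
```

    With three toy sequences such as "MKT-L", "MRTAL" and "MKSAL", the vote gives "MKTAL". Columns with tied counts need an explicit tie-breaking rule, which this sketch leaves to the order in which residues are first seen.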

  20. Evidence for Interspecies Gene Transfer in the Evolution of 2,4-Dichlorophenoxyacetic Acid Degraders

    PubMed Central

    McGowan, Catherine; Fulthorpe, Roberta; Wright, Alice; Tiedje, J. M.

    1998-01-01

    Small-subunit ribosomal DNA (SSU rDNA) from 20 phenotypically distinct strains of 2,4-dichlorophenoxyacetic acid (2,4-D)-degrading bacteria was partially sequenced, yielding 18 unique strains belonging to members of the alpha, beta, and gamma subgroups of the class Proteobacteria. To understand the origin of 2,4-D degradation in this diverse collection, the first gene in the 2,4-D pathway, tfdA, was sequenced. The sequences fell into three unique classes found in various members of the beta and gamma subgroups of Proteobacteria. None of the α-Proteobacteria yielded tfdA PCR products. A comparison of the dendrogram of the tfdA genes with that of the SSU rDNA genes demonstrated incongruency in phylogenies, and hence 2,4-D degradation must have originated from gene transfer between species. Only those strains with tfdA sequences highly similar to the tfdA sequence of strain JMP134 (tfdA class I) transferred all the 2,4-D genes and conferred the 2,4-D degradation phenotype to a Burkholderia cepacia recipient. PMID:9758850

  1. Short-term memory stores organized by information domain.

    PubMed

    Noyce, Abigail L; Cestero, Nishmar; Shinn-Cunningham, Barbara G; Somers, David C

    2016-04-01

    Vision and audition have complementary affinities, with vision excelling in spatial resolution and audition excelling in temporal resolution. Here, we investigated the relationships among the visual and auditory modalities and spatial and temporal short-term memory (STM) using change detection tasks. We created short sequences of visual or auditory items, such that each item within a sequence arose at a unique spatial location at a unique time. On each trial, two successive sequences were presented; subjects attended to either space (the sequence of locations) or time (the sequence of inter-item intervals) and reported whether the patterns of locations or intervals were identical. Each subject completed blocks of unimodal trials (both sequences presented in the same modality) and crossmodal trials (Sequence 1 visual, Sequence 2 auditory, or vice versa) for both spatial and temporal tasks. We found a strong interaction between modality and task: Spatial performance was best on unimodal visual trials, whereas temporal performance was best on unimodal auditory trials. The order of modalities on crossmodal trials also mattered, suggesting that perceptual fidelity at encoding is critical to STM. Critically, no cost was attributable to crossmodal comparison: In both tasks, performance on crossmodal trials was as good as or better than on the weaker unimodal trials. STM representations of space and time can guide change detection in either the visual or the auditory modality, suggesting that the temporal or spatial organization of STM may supersede sensory-specific organization.

  2. Development of a PCR-based marker utilizing a deletion mutation in the dihydroflavonol 4-reductase (DFR) gene responsible for the lack of anthocyanin production in yellow onions (Allium cepa).

    PubMed

    Kim, Sunggil; Yoo, Kil Sun; Pike, Leonard M

    2005-02-01

    Bulb color in onions (Allium cepa) is an important trait, but the mechanism of color inheritance is poorly understood at the molecular level. A previous study showed that inactivation of the dihydroflavonol 4-reductase (DFR) gene at the transcriptional level resulted in a lack of anthocyanin production in yellow onions. The objectives of the present study were the identification of the critical mutations in the DFR gene (DFR-A) and the development of a PCR-based marker for allelic selection. We report the isolation of two additional DFR homologs (DFR-B and DFR-C). No unique sequences were identified in either DFR homolog, even in the untranslated region (UTR). Both genes shared more than 95% nucleotide sequence identity with the DFR-A gene. To obtain a unique sequence from each gene, we isolated the promoter regions. Sequences of the DFR-A and DFR-B promoters differed completely from one another, except for an approximately 100-bp sequence adjacent to the 5'UTR. It was possible to specifically amplify only the DFR-A gene using primers designed to anneal to the unique promoter region. The sequences of yellow and red DFR-A alleles were the same except for a single base-pair change in the promoter and an approximately 800-bp deletion within the 3' region of the yellow DFR-A allele. This deletion was used to develop a co-dominant PCR-based marker that segregated perfectly with color phenotypes in the F2 population. These results indicate that a deletion mutation in the yellow DFR-A gene results in the lack of anthocyanin production in yellow onions.

  3. Prevalence and genome characteristics of canine astrovirus in southwest China.

    PubMed

    Li, Mingxiang; Yan, Nan; Ji, Conghui; Wang, Min; Zhang, Bin; Yue, Hua; Tang, Cheng

    2018-05-30

    The aim of this study was to investigate canine astrovirus (CaAstV) infection in southwest China. We collected 107 faecal samples from domestic dogs with obvious diarrhoea. Forty-two diarrhoeic samples (39.3 %) were positive for CaAstV by RT-PCR, and 41/42 samples showed co-infection with canine coronavirus (CCoV), canine parvovirus-2 (CPV-2) and canine distemper virus (CDV). Phylogenetic analysis based on 26 CaAstV partial ORF1a and ORF1b sequences revealed that most CaAstV strains showed unique evolutionary features. Interestingly, putative recombination events were observed among four of the five complete ORF2 sequences cloned in this study, and three of the five complete ORF2 sequences formed a single unique group, suggesting that these strains could be a novel genotype. We successfully sequenced the complete genome of one CaAstV strain (designated 2017/44/CHN), which was 6628 nt in length. The features of this genome include putative recombination events in the ORF1a, ORF1b and ORF2 genes, while the ORF2 gene had a continuous insertion of 7 aa in region II compared with the other complete ORF2 sequences available in GenBank. Phylogenetic analysis showed that 2017/44/CHN formed a single group based on genome sequences, suggesting that this strain might be a novel genotype. The results of this study revealed that CaAstV circulates widely in diarrhoeic dogs in southwest China and exhibits unique evolutionary events. To the best of our knowledge, this is the first report of recombination events in CaAstV, and it contributes to further understanding of the genetic evolution of CaAstV.

  4. Complete nucleotide sequence and genome structure of a Japanese isolate of hibiscus latent Fort Pierce virus, a unique tobamovirus that contains an internal poly(A) region in its 3' end.

    PubMed

    Yoshida, Tetsuya; Kitazawa, Yugo; Komatsu, Ken; Neriya, Yutaro; Ishikawa, Kazuya; Fujita, Naoko; Hashimoto, Masayoshi; Maejima, Kensaku; Yamaji, Yasuyuki; Namba, Shigetou

    2014-11-01

    In this study, we detected a Japanese isolate of hibiscus latent Fort Pierce virus (HLFPV-J), a member of the genus Tobamovirus, in a hibiscus plant in Japan and determined the complete sequence and organization of its genome. HLFPV-J has four open reading frames (ORFs), each of which shares more than 98 % nucleotide sequence identity with those of other HLFPV isolates. Moreover, HLFPV-J contains a unique internal poly(A) region of variable length, ranging from 44 to 78 nucleotides, in its 3'-untranslated region (UTR), as is the case with hibiscus latent Singapore virus (HLSV), another hibiscus-infecting tobamovirus. The length of the HLFPV-J genome was 6431 nucleotides, including the shortest internal poly(A) region. The sequence identities of ORFs 1, 2, 3 and 4 of HLFPV-J to other tobamoviruses were 46.6-68.7, 49.9-70.8, 31.0-70.8 and 39.4-70.1 %, respectively, at the nucleotide level and 39.8-75.0, 43.6-77.8, 19.2-70.4 and 31.2-74.2 %, respectively, at the amino acid level. The 5'- and 3'-UTRs of HLFPV-J showed 24.3-58.6 and 13.0-79.8 % identity, respectively, to other tobamoviruses. In particular, when compared to other tobamoviruses, each ORF and UTR of HLFPV-J showed the highest sequence identity to those of HLSV. Phylogenetic analysis showed that HLFPV-J, other HLFPV isolates and HLSV constitute a malvaceous-plant-infecting tobamovirus cluster. These results indicate that the genomic structure of HLFPV-J has unique features similar to those of HLSV. To our knowledge, this is the first report of the complete genome sequence of HLFPV.

  5. Targeted Capture and High-Throughput Sequencing Using Molecular Inversion Probes (MIPs).

    PubMed

    Cantsilieris, Stuart; Stessman, Holly A; Shendure, Jay; Eichler, Evan E

    2017-01-01

    Molecular inversion probes (MIPs) in combination with massively parallel DNA sequencing represent a versatile, yet economical tool for targeted sequencing of genomic DNA. Several thousand genomic targets can be selectively captured using long oligonucleotides containing unique targeting arms and universal linkers. The ability to append sequencing adaptors and sample-specific barcodes allows large-scale pooling and subsequent high-throughput sequencing at relatively low cost per sample. Here, we describe a "wet bench" protocol detailing the capture and subsequent sequencing of >2000 genomic targets from 192 samples, representative of a single lane on the Illumina HiSeq 2000 platform.

  6. Comparison and quantitative verification of mapping algorithms for whole genome bisulfite sequencing

    USDA-ARS's Scientific Manuscript database

    Coupling bisulfite conversion with next-generation sequencing (Bisulfite-seq) enables genome-wide measurement of DNA methylation, but poses unique challenges for mapping. However, despite a proliferation of Bisulfite-seq mapping tools, no systematic comparison of their genomic coverage and quantitat...

  7. The Pizza Problem: A Solution with Sequences

    ERIC Educational Resources Information Center

    Shafer, Kathryn G.; Mast, Caleb J.

    2008-01-01

    This article addresses the issues of coaching and assessing. A preservice middle school teacher's unique solution to the Pizza problem was not what the professor expected. The student's solution strategy, based on sequences and a reinvention of Pascal's triangle, is explained in detail. (Contains 8 figures.)

  8. Complete genome sequence of the acetylene-fermenting Pelobacter sp. strain SFB93

    USGS Publications Warehouse

    Sutton, John M.; Baesman, Shaun; Fierst, Janna L.; Poret-Peterson, Amisha T.; Oremland, Ronald S.; Dunlap, Darren S.; Akob, Denise M.

    2017-01-01

    Acetylene fermentation is a rare metabolism that was previously reported as being unique to Pelobacter acetylenicus. Here, we report the genome sequence of Pelobacter sp. strain SFB93, an acetylene-fermenting bacterium isolated from sediments collected in San Francisco Bay, CA.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chauhan, Archana; Layton, Alice; Williams, Daniel W

    Pseudomonas fluorescens strain HK44 (DSM 6700) is a genetically engineered lux-based bioluminescent bioreporter. Here we report the draft genome sequence of strain HK44. Annotation of ~6.1 Mb sequence indicates that 30% of the traits are unique and distributed over 5 genomic islands, a prophage and two plasmids.

  10. Re-analysis of human immunodeficiency virus type 1 isolates from Cyprus and Greece, initially designated 'subtype I', reveals a unique complex A/G/H/K/? mosaic pattern.

    PubMed

    Paraskevis, D; Magiorkinis, M; Vandamme, A M; Kostrikis, L G; Hatzakis, A

    2001-03-01

    Human immunodeficiency virus type 1 (HIV-1) has been classified into three main groups and 11 distinct subtypes. Moreover, several circulating recombinant forms (CRFs) of HIV-1 have been recently documented to have spread widely causing extensive HIV-1 epidemics. A subtype, initially designated I (CRF04_cpx), was documented in Cyprus and Greece and was found to comprise regions of sequence derived from subtypes A and G as well as regions of unclassified sequence. Re-analysis of the three full-length CRF04_cpx sequences that were available revealed a mosaic genomic organization of unique complexity comprising regions of sequence from at least five distinct subtypes, A, G, H, K and unclassified regions. These strains account for approximately 2% of the total HIV-1-infected population in Greece, thus providing evidence of the great capability of HIV-1 to recombine and produce highly divergent strains which can be spread successfully through different infection routes.

  11. Digital Biological Converter

    DTIC Science & Technology

    2013-06-28

    of cuts that each fragment should be cut into so the fragments are no greater than a specific length threshold. Additionally, vector sequences and...restriction sites are attached to each fragment while ensuring the restriction sites are unique to each sequence. The vector sequences serve as hooks...for assembly into vector for cloning purposes, and also as primer binding domains for PCR amplification. The restriction sites are added to

  12. Transcriptome analysis of Bupleurum chinense focusing on genes involved in the biosynthesis of saikosaponins

    PubMed Central

    2011-01-01

    Background: Bupleurum chinense DC. is a widely used traditional Chinese medicinal plant. Saikosaponins are the major bioactive constituents of B. chinense, but relatively little is known about saikosaponin biosynthesis. The 454 pyrosequencing technology provides a promising opportunity for finding novel genes that participate in plant metabolism. Consequently, this technology may help to identify the candidate genes involved in the saikosaponin biosynthetic pathway. Results: One-quarter of the 454 pyrosequencing runs produced a total of 195,088 high-quality reads, with an average read length of 356 bases (NCBI SRA accession SRA039388). A de novo assembly generated 24,037 unique sequences (22,748 contigs and 1,289 singletons), 12,649 (52.6%) of which were annotated against three public protein databases using a basic local alignment search tool (E-value ≤1e-10). All unique sequences were compared with NCBI expressed sequence tags (ESTs) (237) and encoding sequences (44) from the Bupleurum genus, and with a Sanger-sequenced EST dataset (3,111). The 23,173 (96.4%) unique sequences obtained in the present study represent novel Bupleurum genes. The ESTs of genes related to saikosaponin biosynthesis were found to encode known enzymes that catalyze the formation of the saikosaponin backbone; 246 cytochrome P450 (P450s) and 102 glycosyltransferases (GTs) unique sequences were also found in the 454 dataset. Full length cDNAs of 7 P450s and 7 uridine diphosphate GTs (UGTs) were verified by reverse transcriptase polymerase chain reaction or by cloning using 5' and/or 3' rapid amplification of cDNA ends. Two P450s and three UGTs were identified as the most likely candidates involved in saikosaponin biosynthesis. This finding was based on the coordinate up-regulation of their expression with β-AS in methyl jasmonate-treated adventitious roots and on their similar expression patterns with β-AS in various B. chinense tissues. Conclusions: A collection of high-quality ESTs for B. chinense obtained by 454 pyrosequencing is provided here for the first time. These data should aid further research on the functional genomics of B. chinense and other Bupleurum species. The candidate genes for enzymes involved in saikosaponin biosynthesis, especially the P450s and UGTs, that were revealed provide a substantial foundation for follow-up research on the metabolism and regulation of the saikosaponins. PMID:22047182

  13. Identification of Entamoeba polecki with Unique 18S rRNA Gene Sequences from Celebes Crested Macaques and Pigs in Tangkoko Nature Reserve, North Sulawesi, Indonesia.

    PubMed

    Tuda, Josef; Feng, Meng; Imada, Mihoko; Kobayashi, Seiki; Cheng, Xunjia; Tachibana, Hiroshi

    2016-09-01

    Unique species of macaques are distributed across Sulawesi Island, Indonesia, and the details of Entamoeba infections in these macaques are unknown. A total of 77 stool samples from Celebes crested macaques (Macaca nigra) and 14 stool samples from pigs were collected in Tangkoko Nature Reserve, North Sulawesi, and the prevalence of Entamoeba infection was examined by PCR. Entamoeba polecki was detected in 97% of the macaques and all of the pigs, but no other Entamoeba species were found. The nucleotide sequence of the 18S rRNA gene in E. polecki from M. nigra was unique and showed highest similarity with E. polecki subtype (ST) 4. This is the first case of identification of E. polecki ST4 from wild nonhuman primates. The sequence of the 18S rRNA gene in E. polecki from pigs was also unique and showed highest similarity with E. polecki ST1. These results suggest that the diversity of the 18S rRNA gene in E. polecki is associated with differences in host species and geographic localization, and that there has been no transmission of E. polecki between macaques and pigs in the study area. © 2016 The Author(s) Journal of Eukaryotic Microbiology © 2016 International Society of Protistologists.

  14. Identification of a precursor genomic segment that provided a sequence unique to glycophorin B and E genes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Onda, M.; Kudo, S.; Fukuda, M.

    Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located at the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.

  15. Molecular Evolution and Functional Diversification of Replication Protein A1 in Plants

    PubMed Central

    Aklilu, Behailu B.; Culligan, Kevin M.

    2016-01-01

    Replication protein A (RPA) is a heterotrimeric, single-stranded DNA binding complex required for eukaryotic DNA replication, repair, and recombination. RPA is composed of three subunits, RPA1, RPA2, and RPA3. In contrast to single RPA subunit genes generally found in animals and yeast, plants encode multiple paralogs of RPA subunits, suggesting subfunctionalization. Genetic analysis demonstrates that five Arabidopsis thaliana RPA1 paralogs (RPA1A to RPA1E) have unique and overlapping functions in DNA replication, repair, and meiosis. We hypothesize here that RPA1 subfunctionalities will be reflected in major structural and sequence differences among the paralogs. To address this, we analyzed amino acid and nucleotide sequences of RPA1 paralogs from 25 complete genomes representing a wide spectrum of plants and unicellular green algae. We find here that the plant RPA1 gene family is divided into three general groups termed RPA1A, RPA1B, and RPA1C, which likely arose from two progenitor groups in unicellular green algae. In the family Brassicaceae the RPA1B and RPA1C groups have further expanded to include two unique sub-functional paralogs RPA1D and RPA1E, respectively. In addition, RPA1 groups have unique domains, motifs, cis-elements, gene expression profiles, and pattern of conservation that are consistent with proposed functions in monocot and dicot species, including a novel C-terminal zinc-finger domain found only in plant RPA1C-like sequences. These results allow for improved prediction of RPA1 subunit functions in newly sequenced plant genomes, and potentially provide a unique molecular tool to improve classification of Brassicaceae species. PMID:26858742

  16. Recombination and Population Mosaic of a Multifunctional Viral Gene, Adeno-Associated Virus cap

    PubMed Central

    Takeuchi, Yasuhiro; Myers, Richard; Danos, Olivier

    2008-01-01

    Homologous recombination is a dominant force in evolution and results in genetic mosaics. To detect evidence of recombination events and assess the biological significance of genetic mosaics, genome sequences for various viral populations of reasonably large size are now available in the GenBank. We studied a multi-functional viral gene, the adeno-associated virus (AAV) cap gene, which codes for three capsid proteins, VP1, VP2 and VP3. VP1-3 share a common C-terminal domain corresponding to VP3, which forms the viral core structure, while the VP1 unique N-terminal part contains an enzymatic domain with phospholipase A2 activity. Our recombinant detection program (RecI) revealed five novel recombination events, four of which have their cross-over points in the N-terminal, VP1 and VP2 unique region. Comparison of phylogenetic trees for different cap gene regions confirmed discordant phylogenies for the recombinant sequences. Furthermore, differences in the phylogenetic tree structures for the VP1 unique (VP1u) region and the rest of cap highlighted the mosaic nature of cap gene in the AAV population: two dominant forms of VP1u sequences were identified and these forms are linked to diverse sequences in the rest of cap gene. This observation together with the finding of frequent recombination in the VP1 and 2 unique regions suggests that this region is a recombination hot spot. Recombination events in this region preserve protein blocks of distinctive functions and contribute to convergence in VP1u and divergence of the rest of cap. Additionally the possible biological significance of two dominant VP1u forms is inferred. PMID:18286191

  17. A Repeat Look at Repeating Patterns

    ERIC Educational Resources Information Center

    Markworth, Kimberly A.

    2016-01-01

    A "repeating pattern" is a cyclical repetition of an identifiable core. Children in the primary grades usually begin pattern work with fairly simple patterns, such as AB, ABC, or ABB patterns. The unique letters represent unique elements, whereas the sequence of letters represents the core that is repeated. Based on color, shape,…

  18. MRO Sequence Checking Tool

    NASA Technical Reports Server (NTRS)

    Fisher, Forest; Gladden, Roy; Khanampornpan, Teerapat

    2008-01-01

    The MRO Sequence Checking Tool program, mro_check, automates significant portions of the MRO (Mars Reconnaissance Orbiter) sequence checking procedure. Though MRO has similar checks to the ODY's (Mars Odyssey) Mega Check tool, the checks needed for MRO are unique to the MRO spacecraft. The MRO sequence checking tool automates the majority of the sequence validation procedure and checklists that are used to validate the sequences generated by MRO MPST (mission planning and sequencing team). The tool performs more than 50 different checks on the sequence. The automation varies from summarizing data about the sequence needed for visual verification of the sequence, to performing automated checks on the sequence and providing a report for each step. To allow for the addition of new checks as needed, this tool is built in a modular fashion.

  19. Restricted transfer of learning between unimanual and bimanual finger sequences

    PubMed Central

    Bai, Wenjun

    2016-01-01

    When training bimanual skills, such as playing piano, people sometimes practice each hand separately and at a later stage combine the movements of the two hands. This poses the critical question of whether motor skills can be acquired by separately practicing each subcomponent or should be trained as a whole. In the present study, we addressed this question by training human subjects for 4 days in a unimanual or bimanual version of the discrete sequence production task. Both groups were then tested on trained and untrained sequences on both unimanual and bimanual versions of the task. Surprisingly, we found no evidence of transfer from trained unimanual to bimanual or from trained bimanual to unimanual sequences. In half the participants, we also investigated whether cuing the sequences on the left and right hand with unique letters would change transfer. With these cues, untrained sequences that shared some components with the trained sequences were performed more quickly than sequences that did not. However, the amount of this transfer was limited to ∼10% of the overall sequence-specific learning gains. These results suggest that unimanual and bimanual sequences are learned in separate representations. Making participants aware of the interrelationship between sequences can induce some transferrable component, although the main component of the skill remains unique to unimanual or bimanual execution. NEW & NOTEWORTHY Studies in reaching movement demonstrated that approximately half of motor learning can transfer across unimanual and bimanual contexts, suggesting that neural representations for unimanual and bimanual movements are fairly overlapping at the level of elementary movement. In this study, we show that little or no transfer occurred across unimanual and bimanual sequential finger movements. This result suggests that bimanual sequences are represented at a level of the motor hierarchy that integrates movements of both hands. PMID:27974447

  20. Molecular Dynamics Simulations of the 136 Unique Tetranucleotide Sequences of DNA Oligonucleotides. I. Research Design and Results on d(CpG) Steps

    PubMed Central

    Beveridge, David L.; Barreiro, Gabriela; Byun, K. Suzie; Case, David A.; Cheatham, Thomas E.; Dixit, Surjit B.; Giudice, Emmanuel; Lankas, Filip; Lavery, Richard; Maddocks, John H.; Osman, Roman; Seibert, Eleanore; Sklenar, Heinz; Stoll, Gautier; Thayer, Kelly M.; Varnai, Péter; Young, Matthew A.

    2004-01-01

    We describe herein a computationally intensive project aimed at carrying out molecular dynamics (MD) simulations including water and counterions on B-DNA oligomers containing all 136 unique tetranucleotide base sequences. This initiative was undertaken by an international collaborative effort involving nine research groups, the “Ascona B-DNA Consortium” (ABC). Calculations were carried out on the 136 cases imbedded in 39 DNA oligomers with repeating tetranucleotide sequences, capped on both ends by GC pairs and each having a total length of 15 nucleotide pairs. All MD simulations were carried out using a well-defined protocol, the AMBER suite of programs, and the parm94 force field. Phase I of the ABC project involves a total of ∼0.6 μs of simulation for systems containing ∼24,000 atoms. The resulting trajectories involve 600,000 coordinate sets and represent ∼400 gigabytes of data. In this article, the research design, details of the simulation protocol, informatics issues, and the organization of the results into a web-accessible database are described. Preliminary results from 15-ns MD trajectories are presented for the d(CpG) step in its 10 unique sequence contexts, and issues of stability and convergence, the extent of quasiergodic problems, and the possibility of long-lived conformational substates are discussed. PMID:15326025
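
    The counts quoted in this abstract (136 unique tetranucleotides, 10 unique contexts for the d(CpG) step) follow from treating a duplex sequence and its reverse complement as the same object. The sketch below is an illustrative check of that arithmetic only, not part of the ABC simulation protocol; the function names are hypothetical.

```python
from itertools import product

# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sequence read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def unique_duplex_sequences(k: int) -> list[str]:
    """Enumerate k-mers, counting a k-mer and its reverse complement once."""
    canonical = set()
    for bases in product("ACGT", repeat=k):
        seq = "".join(bases)
        # Keep the lexicographically smaller of the two strand readings.
        canonical.add(min(seq, reverse_complement(seq)))
    return sorted(canonical)


tetranucleotides = unique_duplex_sequences(4)
dinucleotide_steps = unique_duplex_sequences(2)
# Tetranucleotides whose central step is CpG, i.e. the contexts of d(CpG).
cpg_contexts = [s for s in tetranucleotides if s[1:3] == "CG"]

print(len(tetranucleotides), len(dinucleotide_steps), len(cpg_contexts))
```

    Of the 4^4 = 256 tetranucleotides, 16 are their own reverse complement, so the number of distinct duplex sequences is (256 + 16) / 2 = 136; the same argument applied to the 16 XCGY contexts gives (16 + 4) / 2 = 10.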

  1. A low molecular weight artificial RNA of unique size with multiple probe target regions

    NASA Technical Reports Server (NTRS)

    Pitulle, C.; Dsouza, L.; Fox, G. E.

    1997-01-01

    Artificial RNAs (aRNAs) containing novel sequence segments embedded in a deletion mutant of Vibrio proteolyticus 5S rRNA have previously been shown to be expressed from a plasmid borne growth rate regulated promoter in E. coli. These aRNAs accumulate to high levels and their detection is a promising tool for studies in molecular microbial ecology and in environmental monitoring. Herein a new construct is described which illustrates the versatility of detection that is possible with aRNAs. This 3xPen aRNA construct carries a 72 nucleotide insert with three copies of a unique 17 base probe target sequence. This aRNA is 160 nucleotides in length and again accumulates to high levels in the E. coli cytoplasm without incorporating into ribosomes. The 3xPen aRNA illustrates two improvements in detection. First, by appropriate selection of insert size, we obtained an aRNA which provides a unique and hence, easily quantifiable peak, on a high resolution gel profile of low molecular weight RNAs. Second, the existence of multiple probe targets results in a nearly commensurate increase in signal when detection is by hybridization. These aRNAs are naturally amplified and carry sequence segments that are not found in known rRNA sequences. It thus may be possible to detect them directly. An experimental step involving RT-PCR or PCR amplification of the gene could therefore be avoided.

  2. Exploring the Presence of microDNAs in Prostate Cancer Cell Lines, Tissue, and Sera of Prostate Cancer Patients and its Possible Application as Biomarker

    DTIC Science & Technology

    2016-04-01

    Sequence tags were mapped on the human reference genome using the Novoalign software. Only those...ends of the linear islands to create a novel junctional sequence that does not exist in the genome. Thus the PE-sequence of a fragment that breaks at... genome (Fig. 3b). Those PE-tags where one tag maps uniquely to an island and the other remains unmapped, but passes the sequence quality filter,

  3. Genomic sequence for the aflatoxigenic filamentous fungus Aspergillus nomius

    USDA-ARS's Scientific Manuscript database

    The genome of the A. nomius type strain was sequenced using a personal genome machine. Annotation of the genes was undertaken, followed by gene ontology and an investigation into the number of secondary metabolite clusters. Comparative studies with other Aspergillus species involved shared/unique ge...

  4. Evaluation of ribosomal RNA removal protocols for Salmonella RNA-Seq projects

    USDA-ARS's Scientific Manuscript database

    Next generation sequencing is a powerful technology and its application to sequencing entire RNA populations of food-borne pathogens will provide valuable insights. A problem unique to prokaryotic RNA-Seq is the massive abundance of ribosomal RNA. Unlike eukaryotic messenger RNA (mRNA), bacterial ...

  5. Application of circular consensus sequencing and network analysis to characterize the bovine IgG repertoire

    USDA-ARS's Scientific Manuscript database

    Background: Vertebrate immune systems generate diverse repertoires of antibodies capable of mediating response to a variety of antigens. Next generation sequencing methods provide unique approaches to a number of immuno-based research areas including antibody discovery and engineering, disease surve...

  6. A new endonuclease recognizing the deoxynucleotide sequence CCNNGG from the cyanobacterium Synechocystis 6701.

    PubMed

    Calléja, F; Tandeau de Marsac, N; Coursin, T; van Ormondt, H; de Waard, A

    1985-09-25

    A new sequence-specific endonuclease from the cyanobacterium Synechocystis species PCC 6701 has been purified and characterized. This enzyme, SecI, is unique in recognizing the nucleotide sequence: 5' -CCNNGG-3' 3' -GGNNCC-5' and cleaves it at the position indicated by the symbol. Two other restriction endonucleases, SecII and SecIII, found in this organism are isoschizomers of MspI and MstII, respectively.
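
    As an illustration of what a degenerate recognition sequence such as CCNNGG means in practice, the following sketch locates candidate sites in a DNA string, with N standing for any of the four bases. It is a minimal example under that assumption; the function name and test sequence are hypothetical, and the sketch does not model where within the site the enzyme cuts.

```python
import re

# CCNNGG with N = any base. The lookahead lets overlapping sites be reported.
SITE = re.compile(r"(?=(CC[ACGT]{2}GG))")


def find_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for every CCNNGG occurrence."""
    return [(m.start(), m.group(1)) for m in SITE.finditer(dna.upper())]


sites = find_sites("AACCATGGTTCCGGGG")
print(sites)
```

    Because the reverse complement of any CCNNGG site is itself a CCNNGG site at the same position, scanning one strand is enough to find every site in the duplex.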

  7. Transcriptomic analysis of Siberian ginseng (Eleutherococcus senticosus) to discover genes involved in saponin biosynthesis.

    PubMed

    Hwang, Hwan-Su; Lee, Hyoshin; Choi, Yong Eui

    2015-03-14

    Eleutherococcus senticosus, Siberian ginseng, is a highly valued woody medicinal plant belonging to the family Araliaceae. E. senticosus produces a rich variety of saponins such as oleanane-type, noroleanane-type, 29-hydroxyoleanan-type, and lupane-type saponins. Genomic or transcriptomic approaches have not been used to investigate the saponin biosynthetic pathway in this plant. In this study, de novo sequencing was performed to select candidate genes involved in the saponin biosynthetic pathway. A half-plate 454 pyrosequencing run produced 627,923 high-quality reads with an average sequence length of 422 bases. De novo assembly generated 72,811 unique sequences, including 15,217 contigs and 57,594 singletons. Approximately 48,300 (66.3%) unique sequences were annotated using BLAST similarity searches. All of the mevalonate pathway genes for saponin biosynthesis starting from acetyl-CoA were isolated. Moreover, 206 reads of cytochrome P450 (CYP) and 145 reads of uridine diphosphate glycosyltransferase (UGT) sequences were isolated. Based on methyl jasmonate (MeJA) treatment and real-time PCR (qPCR) analysis, 3 CYPs and 3 UGTs were finally selected as candidate genes involved in the saponin biosynthetic pathway. The identified sequences associated with saponin biosynthesis will facilitate the study of the functional genomics of saponin biosynthesis and genetic engineering of E. senticosus.

  8. Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

    PubMed Central

    Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

    2015-01-01

    We report complete sequences of chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genomes sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphism in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692

  9. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    PubMed

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  10. Comparison of CNVs in Buffalo with other species

    USDA-ARS's Scientific Manuscript database

    Using a read-depth (RD) and a hybrid read-pair, split-read (RAPTR-SV) CNV detection method, we identified over 1425 unique CNVs in 14 water buffalo individuals compared to the cattle genome sequence. Total variable sequence of the CNV regions (CNVR) from the RD method approached 59 megabases (~ 2% of...

  11. A matrix-based approach to solving the inverse Frobenius-Perron problem using sequences of density functions of stochastically perturbed dynamical systems

    NASA Astrophysics Data System (ADS)

    Nie, Xiaokai; Coca, Daniel

    2018-01-01

    The paper introduces a matrix-based approach to estimate the unique one-dimensional discrete-time dynamical system that generated a given sequence of probability density functions whilst subjected to an additive stochastic perturbation with known density.

  12. A matrix-based approach to solving the inverse Frobenius-Perron problem using sequences of density functions of stochastically perturbed dynamical systems.

    PubMed

    Nie, Xiaokai; Coca, Daniel

    2018-01-01

    The paper introduces a matrix-based approach to estimate the unique one-dimensional discrete-time dynamical system that generated a given sequence of probability density functions whilst subjected to an additive stochastic perturbation with known density.

  13. A Unique (3+2) Annulation Reaction between Meldrum's Acid and Nitrones: Mechanistic Insight by ESI-IMS-MS and DFT Studies.

    PubMed

    Lespes, Nicolas; Pair, Etienne; Maganga, Clisy; Bretier, Marie; Tognetti, Vincent; Joubert, Laurent; Levacher, Vincent; Hubert-Roux, Marie; Afonso, Carlos; Loutelier-Bourhis, Corinne; Brière, Jean-François

    2018-03-15

    The fragile intermediates of the domino process leading to an isoxazolidin-5-one, triggered by unique reactivity between Meldrum's acid and an N-benzyl nitrone in the presence of a Brønsted base, were determined thanks to the softness and accuracy of electrospray ionization mass spectrometry coupled to ion mobility spectrometry (ESI-IMS-MS). The combined DFT study shed light on the overall organocatalytic sequence that starts with a stepwise (3+2) annulation reaction that is followed by a decarboxylative protonation sequence encompassing a stereoselective pathway issue. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. On continuous user authentication via typing behavior.

    PubMed

    Roth, Joseph; Liu, Xiaoming; Metaxas, Dimitris

    2014-10-01

    We hypothesize that an individual computer user has a unique and consistent habitual pattern of hand movements, independent of the text, while typing on a keyboard. As a result, this paper proposes a novel biometric modality named typing behavior (TB) for continuous user authentication. Given a webcam pointing toward a keyboard, we develop real-time computer vision algorithms to automatically extract hand movement patterns from the video stream. Unlike the typical continuous biometrics, such as keystroke dynamics (KD), TB provides a reliable authentication with a short delay, while avoiding explicit key-logging. We collect a video database where 63 unique subjects type static text and free text for multiple sessions. For one typing video, the hands are segmented in each frame and a unique descriptor is extracted based on the shape and position of hands, as well as their temporal dynamics in the video sequence. We propose a novel approach, named bag of multi-dimensional phrases, to match the cross-feature and cross-temporal pattern between a gallery sequence and probe sequence. The experimental results demonstrate a superior performance of TB when compared with KD, which, together with our ultrareal-time demo system, warrant further investigation of this novel vision application and biometric modality.

  15. The unique C- and N-terminal sequences of Metallothionein isoform 3 mediate growth inhibition and Vectorial active transport in MCF-7 cells.

    PubMed

    Voels, Brent; Wang, Liping; Sens, Donald A; Garrett, Scott H; Zhang, Ke; Somji, Seema

    2017-05-25

    The 3rd isoform of the metallothionein (MT3) gene family has been shown to be overexpressed in most ductal breast cancers. A previous study has shown that the stable transfection of MCF-7 cells with the MT3 gene inhibits cell growth. The goal of the present study was to determine the role of the unique C-terminal and N-terminal sequences of MT3 on phenotypic properties and gene expression profiles of MCF-7 cells. MCF-7 cells were transfected with various metallothionein gene constructs which contain the insertion or the removal of the unique MT3 C- and N-terminal domains. Global gene expression analysis was performed on the MCF-7 cells containing the various constructs and the expression of the unique C- and N-terminal domains of MT3 was correlated to phenotypic properties of the cells. The results of the present study demonstrate that the C-terminal sequence of MT3, in the absence of the N-terminal sequence, induces dome formation in MCF-7 cells, which in cell cultures is the phenotypic manifestation of a cell's ability to perform vectorial active transport. Global gene expression analysis demonstrated that the increased expression of the GAGE gene family correlated with dome formation. Expression of the C-terminal domain induced GAGE gene expression, whereas the N-terminal domain inhibited it, and the inhibitory effect of the N-terminal domain was dominant over the effect of the C-terminal domain of MT3. Transfection with the metallothionein 1E gene increased the expression of GAGE genes. In addition, both the C- and the N-terminal sequences of the MT3 gene had growth inhibitory properties, which correlated to an increased expression of the interferon alpha-inducible protein 6. Our study shows that the C-terminal domain of MT3 confers dome formation in MCF-7 cells and the presence of this domain induces expression of the GAGE family of genes. The differential effects of MT3 and metallothionein 1E on the expression of GAGE genes suggest unique roles of these genes in the development and progression of breast cancer. The finding that interferon alpha-inducible protein 6 expression is associated with the ability of MT3 to inhibit growth needs further investigation.

  16. Myelin protein zero gene sequencing diagnoses Charcot-Marie-Tooth Type 1B disease

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Su, Y.; Zhang, H.; Madrid, R.

    1994-09-01

    Charcot-Marie-Tooth disease (CMT), the most common genetic neuropathy, affects about 1 in 2600 people in Norway and is found worldwide. CMT Type 1 (CMT1) has slow nerve conduction with demyelinated Schwann cells. Autosomal dominant CMT Type 1B (CMT1B) results from mutations in the myelin protein zero gene which directs the synthesis of more than half of all Schwann cell protein. This gene was mapped to the chromosome 1q22-1q23.1 borderline by fluorescence in situ hybridization. The first 7 of 7 reported CMT1B mutations are unique. Thus the most effective means to identify CMT1B mutations in at-risk family members and fetuses is to sequence the entire coding sequence in dominant or sporadic CMT patients without the CMT1A duplication. Of the 19 primers used in 16 pairs to uniquely amplify the entire MPZ coding sequence, 6 primer pairs were used to amplify and sequence the 6 exons. The DyeDeoxy Terminator cycle sequencing method used with four different color fluorescent labels was superior to manual sequencing because it sequences more bases unambiguously from extracted genomic DNA samples within 24 hours. This protocol was used to test 28 CMT and Dejerine-Sottas patients without CMT1A gene duplication. Sequencing MPZ gene-specific amplified fragments identified 9 polymorphic sites within the 6 exons that encode the 248 amino acid MPZ protein. The large number of major CMT1B mutations identified by single strand sequencing are being verified by reverse strand sequencing and when possible, by restriction enzyme analysis. This protocol can be used to distinguish CMT1B patients from other CMT phenotypes and to determine the CMT1B status of relatives both presymptomatically and prenatally.

  17. A public HTLV-1 molecular epidemiology database for sequence management and data mining.

    PubMed

    Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior

    2012-01-01

    It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data were obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system. The webpage interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.

  18. Unique autosomal recessive variant of palmoplantar keratoderma associated with hearing loss not caused by known mutations*

    PubMed Central

    Hegazi, Moustafa Abdelaal; Manou, Sommen; Sakr, Hazem; Camp, Guy Van

    2017-01-01

    Inherited palmoplantar keratodermas are rare genodermatoses that are conventionally regarded as autosomal dominant in inheritance, with extensive clinical and genetic heterogeneity. This is the first report of a unique autosomal recessive inherited palmoplantar keratoderma-sensorineural hearing loss syndrome, observed in 3 siblings of a large consanguineous family. The patients presented unique clinical features that were different from other known inherited palmoplantar keratoderma-hearing loss syndromes. Mutations in GJB2 or GJB6 and the mitochondrial A7445G mutation, known to be the major causes of diverse inherited palmoplantar keratoderma-hearing loss syndromes, were not detected by Sanger sequencing. Moreover, the pathogenic mutation could not be identified using whole exome sequencing. Other known inherited palmoplantar keratoderma syndromes were excluded based on both clinical criteria and genetic analysis. PMID:29267478

  19. Quantitative profiling of immune repertoires for minor lymphocyte counts using unique molecular identifiers.

    PubMed

    Egorov, Evgeny S; Merzlyak, Ekaterina M; Shelenkov, Andrew A; Britanova, Olga V; Sharonov, George V; Staroverov, Dmitriy B; Bolotin, Dmitriy A; Davydov, Alexey N; Barsova, Ekaterina; Lebedev, Yuriy B; Shugay, Mikhail; Chudakov, Dmitriy M

    2015-06-15

    Emerging high-throughput sequencing methods for the analyses of complex structure of TCR and BCR repertoires give a powerful impulse to adaptive immunity studies. However, there are still essential technical obstacles for performing a truly quantitative analysis. Specifically, it remains challenging to obtain comprehensive information on the clonal composition of small lymphocyte populations, such as Ag-specific, functional, or tissue-resident cell subsets isolated by sorting, microdissection, or fine needle aspirates. In this study, we report a robust approach based on unique molecular identifiers that allows profiling Ag receptors for several hundred to thousand lymphocytes while preserving qualitative and quantitative information on clonal composition of the sample. We also describe several general features regarding the data analysis with unique molecular identifiers that are critical for accurate counting of starting molecules in high-throughput sequencing applications. Copyright © 2015 by The American Association of Immunologists, Inc.
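
    The idea behind counting with unique molecular identifiers (UMIs) can be sketched as follows: reads that share a UMI are assumed to derive from one starting molecule, so they are collapsed to a single consensus call before clonotypes are counted. This is a simplified illustration, not the authors' pipeline; the function name, the majority-vote consensus, and the toy reads are all assumptions, and real workflows also handle UMI sequencing errors and minimum read thresholds.

```python
from collections import Counter, defaultdict


def count_molecules(reads):
    """Collapse (umi, clonotype) read pairs into per-clonotype molecule counts.

    Each UMI contributes exactly one molecule, assigned to the clonotype
    seen most often among the reads carrying that UMI.
    """
    votes_by_umi = defaultdict(Counter)
    for umi, clonotype in reads:
        votes_by_umi[umi][clonotype] += 1

    molecules = Counter()
    for votes in votes_by_umi.values():
        consensus, _ = votes.most_common(1)[0]
        molecules[consensus] += 1
    return molecules


# Hypothetical reads: the UMI "AAAC" was amplified three times, once with an error.
reads = [
    ("AAAC", "CASSL"), ("AAAC", "CASSL"), ("AAAC", "CASSP"),
    ("GGTA", "CASSL"),
    ("TTCG", "CASSQ"), ("TTCG", "CASSQ"),
]
read_counts = Counter(clonotype for _, clonotype in reads)
molecule_counts = count_molecules(reads)
print(dict(read_counts), dict(molecule_counts))
```

    In this toy input, raw read counts report three clonotypes, one of which is an amplification artifact, while the UMI-collapsed counts report two clonotypes backed by three starting molecules.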

  20. Deep sequencing and in silico analysis of small RNA library reveals novel miRNA from leaf Persicaria minor transcriptome.

    PubMed

    Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan

    2018-03-01

    In the current era, the majority of microRNAs (miRNAs) are discovered through computational approaches, which are mostly confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on previously well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence which represented 11 unique putative miRNA sequences. The PsRobot target prediction tool was deployed to identify the target transcripts of putative novel miRNAs. Most of the predicted target transcripts (mRNAs) were known to be involved in plant development and stress responses. Gene ontology showed that the majority of the putative novel miRNA targets were involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Out of 11 unique putative miRNAs, 7 miRNAs were validated through semi-quantitative PCR. These novel miRNA discoveries in P. minor may help expand and update the current public miRNA database.

  1. DNABIT Compress – Genome compression algorithm

    PubMed Central

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-01

    Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, “DNABIT Compress”, for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non-repetitive DNA sequences. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genomes. Significantly better compression results show that the “DNABIT Compress” algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (genomes), our new DNABIT Compress algorithm significantly improves on the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve a compression ratio as low as 1.58 bits/base, where the existing best methods could not achieve a ratio less than 1.72 bits/base. PMID:21383923
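
    For context on the bits-per-base figures, the sketch below shows the plain fixed-width baseline of 2 bits per base, against which ratios such as 1.58 or 1.72 bits/base represent a gain. This is not the DNABIT Compress algorithm, whose bit-code assignments are not given in the abstract; the code table and function names here are assumptions chosen for illustration.

```python
# Fixed 2-bit code per base (an arbitrary but conventional assignment).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads bases in order.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


sequence = "ACGTTGCAAC"
packed = pack(sequence)
print(len(packed), unpack(packed, len(sequence)))
```

    The original length has to be stored alongside the packed bytes, because the padding bits of the last byte are indistinguishable from real bases.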

  2. The Genome Sequencer FLX System--longer reads, more applications, straight forward bioinformatics and more complete data sets.

    PubMed

    Droege, Marcus; Hill, Brendon

    2008-08-31

    The Genome Sequencer FLX System (GS FLX), powered by 454 Sequencing, is a next-generation DNA sequencing technology featuring a unique mix of long reads, exceptional accuracy, and ultra-high throughput. It has been proven to be the most versatile of all currently available next-generation sequencing technologies, supporting many high-profile studies in over seven applications categories. GS FLX users have pursued innovative research in de novo sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics, and RNA analysis. 454 Sequencing is a powerful tool for human genetics research, having recently re-sequenced the genome of an individual human, currently re-sequencing the complete human exome and targeted genomic regions using the NimbleGen sequence capture process, and detected low-frequency somatic mutations linked to cancer.

  3. The Comprehensive Antibiotic Resistance Database

    PubMed Central

    McArthur, Andrew G.; Waglechner, Nicholas; Nizam, Fazmin; Yan, Austin; Azad, Marisa A.; Baylay, Alison J.; Bhullar, Kirandeep; Canova, Marc J.; De Pascale, Gianfranco; Ejim, Linda; Kalan, Lindsay; King, Andrew M.; Koteva, Kalinka; Morar, Mariya; Mulvey, Michael R.; O'Brien, Jonathan S.; Pawlowski, Andrew C.; Piddock, Laura J. V.; Spanogiannopoulos, Peter; Sutherland, Arlene D.; Tang, Irene; Taylor, Patricia L.; Thaker, Maulik; Wang, Wenliang; Yan, Marie; Yu, Tennison

    2013-01-01

    The field of antibiotic drug discovery and the monitoring of new antibiotic resistance elements have yet to fully exploit the power of the genome revolution. Despite the fact that the first genomes sequenced of free living organisms were those of bacteria, there have been few specialized bioinformatic tools developed to mine the growing amount of genomic data associated with pathogens. In particular, there are few tools to study the genetics and genomics of antibiotic resistance and how it impacts bacterial populations, ecology, and the clinic. We have initiated development of such tools in the form of the Comprehensive Antibiotic Research Database (CARD; http://arpcard.mcmaster.ca). The CARD integrates disparate molecular and sequence data, provides a unique organizing principle in the form of the Antibiotic Resistance Ontology (ARO), and can quickly identify putative antibiotic resistance genes in new unannotated genome sequences. This unique platform provides an informatic tool that bridges antibiotic resistance concerns in health care, agriculture, and the environment. PMID:23650175

  4. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.

    PubMed

    Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen

    2016-02-02

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite.

  5. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

    PubMed Central

    Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  6. A Molecular Framework for Understanding DCIS

    DTIC Science & Technology

    2016-10-01

    frozen patient biopsies, these have been annotated by our pathologist and prepared to be taken on for sequencing. The tissue includes DCIS, IDC...stroma adjacent to DCIS/IDC and normal tissue . We have initiated the RNA sequencing from these samples and also the DNA sequencing 15. SUBJECT TERMS DCIS...before they reach 55. Utilizing a unique bank of frozen mammary biopsies, containing samples with DCIS alone, and a combination of DCIS and IDC, we aim

  7. A new endonuclease recognizing the deoxynucleotide sequence CCNNGG from the cyanobacterium Synechocystis 6701.

    PubMed Central

    Calléja, F; Tandeau de Marsac, N; Coursin, T; van Ormondt, H; de Waard, A

    1985-01-01

    A new sequence-specific endonuclease from the cyanobacterium Synechocystis species PCC 6701 has been purified and characterized. This enzyme, SecI, is unique in recognizing the nucleotide sequence 5'-CCNNGG-3' / 3'-GGNNCC-5' and cleaves it at the position indicated by the symbol. Two other restriction endonucleases, SecII and SecIII, found in this organism are isoschizomers of MspI and MstII, respectively. PMID:2997722
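    As an illustration of what a degenerate recognition sequence such as CCNNGG means in practice, the following is a minimal sketch that scans one DNA strand for every occurrence of the pattern. The function name `find_seci_sites` is hypothetical, N is assumed to stand for any of A, C, G, or T, and the cleavage position is deliberately not modeled because the record above does not preserve it. Because the reverse complement of CCNNGG again has the form CCNNGG, scanning a single strand should be enough to locate sites on both.

```python
import re

# Recognition sequence quoted in the abstract: CCNNGG (N assumed to be any base).
# A lookahead is used so that overlapping sites are also reported.
SECI_SITE = re.compile(r"(?=(CC[ACGT]{2}GG))")


def find_seci_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched site) for each CCNNGG occurrence."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in SECI_SITE.finditer(dna)]


if __name__ == "__main__":
    for start, site in find_seci_sites("aaccTAggttCCCCGGGG"):
        print(start, site)
```

    The lookahead form is a design choice: a plain `CC[ACGT]{2}GG` pattern would skip sites that overlap an earlier match.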

  8. Ohmic resistance in a multi-anode MxCs

    EPA Pesticide Factsheets

    A-3txf_sequence summary.xksx: Abundance of contigs or unique sequences for each biofilm sample from anodes in the MEC reactor. Hodon Waterloo final_fasta_working.docx: Raw sequences with their identification numbers. RNA S1_MEC.docx: Representative sequences with their ID number and taxonomy. This dataset is associated with the following publication: Santodomingo, J., H. Ryu, B. Dhar, and H. Lee. Ohmic resistance affects microbial community and electrochemical kinetics in a multi-anode microbial electrochemical cell. JOURNAL OF POWER SOURCES. Elsevier Science Ltd, New York, NY, USA, 331: 315-321, (2016).

  9. Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

    PubMed Central

    Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

    1991-01-01

    Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M.incognita, M.hapla and M.arenaria, respectively. Nucleotide sequences of the M.incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M.javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M.javanica and the different host races of M.incognita, M.hapla and M.arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
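    The segment sizes quoted above can be checked against the stated 7 kb total with simple arithmetic. The sketch below only sums the figures given in the abstract; the variable names are illustrative, the copy number of the 102 ntp repeat is taken as exactly 36 although the abstract says "approximately", and the two bounding tRNA genes are left out because their lengths are not given.

```python
# Component lengths (in nucleotide pairs) as quoted in the abstract.
unique_segment_1 = 1528          # unique segment with two hairpin-forming sequences
repeat_102_set = 36 * 102        # ~36 copies of the 102 ntp repeat (assumed exactly 36)
repeat_8_set = 5 * 8             # 5 copies of the 8 ntp repeat
unique_segment_2 = 1068          # second unique segment
repeat_63_set = 11 * 63          # 11 copies of the 63 ntp repeat

segment_total = (
    unique_segment_1
    + repeat_102_set
    + repeat_8_set
    + unique_segment_2
    + repeat_63_set
)

if __name__ == "__main__":
    print(f"Repeat-containing segment (excluding tRNA genes): {segment_total} ntp")
```

    Under these assumptions the components sum to 7,001 ntp, which is consistent with the "7 kb" figure in the abstract.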

  10. Mitochondrial Genome Sequence of the Legume Vicia faba

    PubMed Central

    Negruk, Valentine

    2013-01-01

    The number of plant mitochondrial genomes sequenced exceeds two dozen. However, for a detailed comparative study of different phylogenetic branches more plant mitochondrial genomes should be sequenced. This article presents sequencing data and comparative analysis of mitochondrial DNA (mtDNA) of the legume Vicia faba. The size of the V. faba circular mitochondrial master chromosome of cultivar Broad Windsor was estimated as 588,000 bp with a genome complexity of 387,745 bp and 52 conservative mitochondrial genes; 32 of them encoding proteins, 3 rRNA, and 17 tRNA genes. Six tRNA genes were highly homologous to chloroplast genome sequences. In addition to the 52 conservative genes, 114 unique open reading frames (ORFs) were found, 36 without significant homology to any known proteins and 29 with homology to the Medicago truncatula nuclear genome and to other plant mitochondrial ORFs, 49 ORFs were not homologous to M. truncatula but possessed sequences with significant homology to other plant mitochondrial or nuclear ORFs. In general, the unique ORFs revealed very low homology to known closely related legumes, but several sequence homologies were found between V. faba, Beta vulgaris, Nicotiana tabacum, Vitis vinifera, and even the monocots Oryza sativa and Zea mays. Most likely these ORFs arose independently during angiosperm evolution (Kubo and Mikami, 2007; Kubo and Newton, 2008). Computational analysis revealed in total about 45% of V. faba mtDNA sequence being homologous to the Medicago truncatula nuclear genome (more than to any sequenced plant mitochondrial genome), and 35% of this homology ranging from a few dozen to 12,806 bp are located on chromosome 1. Apparently, mitochondrial rrn5, rrn18, rps10, ATP synthase subunit alpha, cox2, and tRNA sequences are part of transcribed nuclear mosaic ORFs. PMID:23675376

  11. Genome sequencing and analysis of a type A Clostridium perfringens isolate from a case of bovine clostridial abomasitis.

    PubMed

    Nowell, Victoria J; Kropinski, Andrew M; Songer, J Glenn; MacInnes, Janet I; Parreira, Valeria R; Prescott, John F

    2012-01-01

    Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified.

  12. Genome Sequencing and Analysis of a Type A Clostridium perfringens Isolate from a Case of Bovine Clostridial Abomasitis

    PubMed Central

    Nowell, Victoria J.; Kropinski, Andrew M.; Songer, J. Glenn; MacInnes, Janet I.; Parreira, Valeria R.; Prescott, John F.

    2012-01-01

    Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified. PMID:22412860

  13. Variations in Nuclear Localization Strategies Among Pol X Family Enzymes.

    PubMed

    Kirby, Thomas W; Pedersen, Lars C; Gabel, Scott A; Gassman, Natalie R; London, Robert E

    2018-06-22

    Despite the essential roles of pol X family enzymes in DNA repair, information about the structural basis of their nuclear import is limited. Recent studies revealed the unexpected presence of a functional NLS in DNA polymerase β, indicating the importance of active nuclear targeting, even for enzymes likely to leak into and out of the nucleus. The current studies further explore the active nuclear transport of these enzymes by identifying and structurally characterizing the functional NLS sequences in the three remaining human pol X enzymes: terminal deoxynucleotidyl transferase (TdT), DNA polymerase μ (pol μ), and DNA polymerase λ (pol λ). NLS identifications are based on Importin α (Impα) binding affinity determined by fluorescence polarization of fluorescein-labeled NLS peptides, X-ray crystallographic analysis of the Impα∆IBB•NLS complexes, and fluorescence-based subcellular localization studies. All three polymerases use NLS sequences located near their N-terminus; TdT and pol μ utilize monopartite NLS sequences, while pol λ utilizes a bipartite sequence, unique among the pol X family members. The pol μ NLS has relatively weak measured affinity for Impα, due in part to its proximity to the N-terminus that limits non-specific interactions of flanking residues preceding the NLS. However, this effect is partially mitigated by an N-terminal sequence unsupportive of Met1 removal by methionine aminopeptidase, leading to a 3-fold increase in affinity when the N-terminal methionine is present. Nuclear targeting is unique to each pol X family enzyme with variations dependent on the structure and unique functional role of each polymerase. This article is protected by copyright. All rights reserved.

  14. Analysis and functional annotation of expressed sequence tags from in vitro cell lines of elasmobranchs: spiny dogfish shark (Squalus acanthias) and little skate (Leucoraja erinacea)

    PubMed Central

    Parton, Angela; Bayne, Christopher J.; Barnes, David W.

    2010-01-01

    Elasmobranchs are the most commonly used experimental models among the jawed, cartilaginous fish (Chondrichthyes). Previously we developed cell lines from embryos of two elasmobranchs, Squalus acanthias the spiny dogfish shark (SAE line), and Leucoraja erinacea the little skate (LEE-1 line). From these lines cDNA libraries were derived and expressed sequence tags (ESTs) generated. From the SAE cell line 4303 unique transcripts were identified, with 1848 of these representing unknown sequences (showing no BLASTX identification). From the LEE-1 cell line, 3660 unique transcripts were identified, and unknown, unique sequences totaled 1333. Gene Ontology (GO) annotation showed that GO assignments for the two cell lines were in general similar. These results suggest that the procedures used to derive the cell lines led to isolation of cell types of the same general embryonic origin from both species. The LEE-1 transcripts included GO categories “envelope” and “oxidoreductase activity” but the SAE transcripts did not. GO analysis of SAE transcripts identified the category “anatomical structure formation” that was not present in LEE-1 cells. Increased organelle compartments may exist within LEE-1 cells compared to SAE cells, and the higher oxidoreductase activity in LEE-1 cells may indicate a role for these cells in responses associated with innate immunity or in steroidogenesis. These EST libraries from elasmobranch cell lines provide information for assembly of genomic sequences and are useful in revealing gene diversity, new genes and molecular markers, as well as in providing means for elucidation of full-length cDNAs and probes for gene array analyses. This is the first study of this type with members of the Chondrichthyes. PMID:20471924

  15. Analysis and functional annotation of expressed sequence tags from in vitro cell lines of elasmobranchs: Spiny dogfish shark (Squalus acanthias) and little skate (Leucoraja erinacea).

    PubMed

    Parton, Angela; Bayne, Christopher J; Barnes, David W

    2010-09-01

    Elasmobranchs are the most commonly used experimental models among the jawed, cartilaginous fish (Chondrichthyes). Previously we developed cell lines from embryos of two elasmobranchs, Squalus acanthias the spiny dogfish shark (SAE line), and Leucoraja erinacea the little skate (LEE-1 line). From these lines cDNA libraries were derived and expressed sequence tags (ESTs) generated. From the SAE cell line 4303 unique transcripts were identified, with 1848 of these representing unknown sequences (showing no BLASTX identification). From the LEE-1 cell line, 3660 unique transcripts were identified, and unknown, unique sequences totaled 1333. Gene Ontology (GO) annotation showed that GO assignments for the two cell lines were in general similar. These results suggest that the procedures used to derive the cell lines led to isolation of cell types of the same general embryonic origin from both species. The LEE-1 transcripts included GO categories "envelope" and "oxidoreductase activity" but the SAE transcripts did not. GO analysis of SAE transcripts identified the category "anatomical structure formation" that was not present in LEE-1 cells. Increased organelle compartments may exist within LEE-1 cells compared to SAE cells, and the higher oxidoreductase activity in LEE-1 cells may indicate a role for these cells in responses associated with innate immunity or in steroidogenesis. These EST libraries from elasmobranch cell lines provide information for assembly of genomic sequences and are useful in revealing gene diversity, new genes and molecular markers, as well as in providing means for elucidation of full-length cDNAs and probes for gene array analyses. This is the first study of this type with members of the Chondrichthyes. Copyright 2010 Elsevier Inc. All rights reserved.

  16. Measures of Working Memory, Sequence Learning, and Speech Recognition in the Elderly.

    ERIC Educational Resources Information Center

    Humes, Larry E.; Floyd, Shari S.

    2005-01-01

    This study describes the measurement of 2 cognitive functions, working-memory capacity and sequence learning, in 2 groups of listeners: young adults with normal hearing and elderly adults with impaired hearing. The measurement of these 2 cognitive abilities with a unique, nonverbal technique capable of auditory, visual, and auditory-visual…

  17. An Investigation of the Effects of CRA Instruction and Students with Autism Spectrum Disorder

    ERIC Educational Resources Information Center

    Stroizer, Shaunita; Hinton, Vanessa; Flores, Margaret; Terry, LaTonya

    2015-01-01

    Students with Autism Spectrum Disorders (ASD) have unique educational needs. The concrete representational abstract (CRA) instructional sequence has been shown effective in teaching students with mathematical difficulties. The purpose of this study was to examine the effects of the CRA sequence in teaching students with ASD. A multiple baseline…

  18. Nucleotide cleaving agents and method

    DOEpatents

    Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.

    2000-01-01

    The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.

  19. Programming and Reprogramming Sequence Timing Following High and Low Contextual Interference Practice

    ERIC Educational Resources Information Center

    Wright, David L.; Magnuson, Curt E.; Black, Charles B.

    2005-01-01

    Individuals practiced two unique discrete sequence production tasks that differed in their relative time profile in either a blocked or random practice schedule. Each participant was subsequently administered a "precuing" protocol to examine the cost of initially compiling or modifying the plan for an upcoming movement's relative timing. The…

  20. Differentially expressed genes of Coptotermes formosanus (Isoptera: Rhinotermitidae) challenged by chemical insecticides.

    PubMed

    Zhang, Yi; Zhao, Yuanyuan; Qiu, Xuehong; Han, Richou

    2013-08-01

    Coptotermes formosanus Shiraki (Isoptera: Rhinotermitidae) termites are harmful social insects to wood constructions. The current control methods heavily depend on the chemical insecticides with increasing resistance. Analysis of the differentially expressed genes mediated by chemical insecticides will contribute to the understanding of the termite resistance to chemicals and to the establishment of alternative control measures. In the present article, a full-length cDNA library was constructed from the termites induced by a mixture of commonly used insecticides (0.01% sulfluramid and 0.01% triflumuron) for 24 h, by using the RNA ligase-mediated Rapid Amplification cDNA End method. Fifty-eight differentially expressed clones were obtained by polymerase chain reaction and confirmed by dot-blot hybridization. Forty-six known sequences were obtained, which clustered into 33 unique sequences grouped in 6 contigs and 27 singlets. Sixty-seven percent (22) of the sequences had counterpart genes from other organisms, whereas 33% (11) were undescribed. A Gene Ontology analysis classified 33 unique sequences into different functional categories. In general, most of the differential expression genes were involved in binding and catalytic activity.

  1. Predicting protein crystallization propensity from protein sequence

    PubMed Central

    2011-01-01

    The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein’s propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties were computed for over 1,300 proteins that expressed well but were insoluble, and for ~720 unique proteins that resulted in X-ray structures. The correlation of the protein’s iso-electric point and grand average hydropathy (GRAVY) with crystallizability was analyzed for full length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses we have identified a set of the attributes correlating with a protein’s propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor. PMID:20177794
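    One of the sequence-derived properties named above, the grand average of hydropathy (GRAVY), is straightforward to compute: it is the mean of the per-residue Kyte-Doolittle hydropathy values over the sequence. The following is a minimal sketch of that calculation only; the function name `gravy` is hypothetical, and it is not the authors' pipeline, which also uses the isoelectric point, further attributes, and an SVM classifier that are not reproduced here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein: str) -> float:
    """Return the grand average of hydropathy for a protein sequence.

    Raises ValueError for an empty sequence or a non-standard residue,
    rather than silently skipping it.
    """
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    try:
        total = sum(KYTE_DOOLITTLE[residue] for residue in protein)
    except KeyError as exc:
        raise ValueError(f"non-standard residue: {exc.args[0]}") from None
    return total / len(protein)


if __name__ == "__main__":
    # Positive values indicate a more hydrophobic sequence overall.
    print(round(gravy("MKTAYIAKQR"), 3))
```

    Rejecting non-standard residues is a choice made for this sketch; a production tool might instead map ambiguous codes or ignore them.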

  2. Genome Sequence of the Bacterium Streptomyces davawensis JCM 4913 and Heterologous Production of the Unique Antibiotic Roseoflavin

    PubMed Central

    Jankowitsch, Frank; Schwarz, Julia; Rückert, Christian; Gust, Bertolt; Szczepanowski, Rafael; Blom, Jochen; Pelzer, Stefan; Kalinowski, Jörn

    2012-01-01

    Streptomyces davawensis JCM 4913 synthesizes the antibiotic roseoflavin, a structural riboflavin (vitamin B2) analog. Here, we report the 9,466,619-bp linear chromosome of S. davawensis JCM 4913 and a 89,331-bp linear plasmid. The sequence has an average G+C content of 70.58% and contains six rRNA operons (16S-23S-5S) and 69 tRNA genes. The 8,616 predicted protein-coding sequences include 32 clusters coding for secondary metabolites, several of which are unique to S. davawensis. The chromosome contains long terminal inverted repeats of 33,255 bp each and atypical telomeres. Sequence analysis with regard to riboflavin biosynthesis revealed three different patterns of gene organization in Streptomyces species. Heterologous expression of a set of genes present on a subgenomic fragment of S. davawensis resulted in the production of roseoflavin by the host Streptomyces coelicolor M1152. Phylogenetic analysis revealed that S. davawensis is a close relative of Streptomyces cinnabarinus, and much to our surprise, we found that the latter bacterium is a roseoflavin producer as well. PMID:23043000

  3. Onco-Regulon: an integrated database and software suite for site specific targeting of transcription factors of cancer genes

    PubMed Central

    Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.

    2016-01-01

    Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes clubbed with Unique Sequence-Predictor (USP) a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif, in 5′→3′, 3′ →5′ or both directions by adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome by Frequency Counter programme. This step is iterated till the frequency of the extended motif becomes unity in the genome. Thus, for each given motif, we get three possible unique sequences. Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-Protein structural information further makes Onco-Regulon a highly informative repository for gene specific drug development. We believe that Onco-Regulon will help researchers to design drugs which will bind to an exclusive site in the genome with no off-target effects, theoretically. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
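    The extension procedure described for USP (grow a motif one nucleotide at a time until its frequency in the genome is one) can be sketched as follows. This is an illustrative reading of the abstract, not the Onco-Regulon code: the function names are hypothetical, the genome is assumed to be a plain string, only the given strand is searched, and "both directions" is assumed to mean adding one nucleotide on each side per step.

```python
from typing import Optional


def count_occurrences(genome: str, seq: str) -> int:
    """Count occurrences of seq in genome, including overlapping ones."""
    count = 0
    pos = genome.find(seq)
    while pos != -1:
        count += 1
        pos = genome.find(seq, pos + 1)
    return count


def extend_until_unique(
    genome: str, start: int, end: int, direction: str = "both"
) -> Optional[tuple[int, int, str]]:
    """Extend the motif occupying genome[start:end] until it occurs once.

    direction is "left" (toward the 5' end), "right" (toward the 3' end),
    or "both". Returns (start, end, sequence) for the unique extended
    motif, or None if the genome edge is reached before uniqueness.
    """
    if direction not in ("left", "right", "both"):
        raise ValueError("direction must be 'left', 'right', or 'both'")
    while count_occurrences(genome, genome[start:end]) > 1:
        grew = False
        if direction in ("left", "both") and start > 0:
            start -= 1
            grew = True
        if direction in ("right", "both") and end < len(genome):
            end += 1
            grew = True
        if not grew:
            return None  # cannot extend further in the requested direction
    return start, end, genome[start:end]


if __name__ == "__main__":
    toy_genome = "TTACGAAACGCC"  # "ACG" occurs at positions 2 and 7
    for d in ("left", "right", "both"):
        print(d, extend_until_unique(toy_genome, 2, 5, d))
```

    Running the three directions on one motif occurrence yields three candidate unique sequences, which matches the abstract's statement that each motif gives three possible unique sequences. Recounting the whole genome at every step is simple but slow; an indexed search would be needed for real genome sizes.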

  4. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences

    PubMed Central

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-01-01

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. PMID:27289096

  5. Molecular characterization of an ependymin precursor from goldfish brain.

    PubMed

    Königstorfer, A; Sterrer, S; Eckerskorn, C; Lottspeich, F; Schmidt, R; Hoffmann, W

    1989-01-01

    Ependymins are thought to be implicated in fundamental processes involved in plasticity of the goldfish CNS. Gas-phase sequencing of purified ependymins beta and gamma revealed that they share the same N-terminal sequence. Each sequence displays microheterogeneities at several positions. Based on the protein sequences obtained, we constructed synthetic oligonucleotides and used them as hybridization probes for screening cDNA libraries of goldfish brain. In this article we describe the full-length sequence of a mRNA encoding a precursor of ependymins. A cleavable signal sequence characteristic of secretory proteins is located at the N-terminal end, followed directly by the ependymin sequence. Also, two potential N-glycosylation sites were detected. A computer search revealed that ependymins form a novel family of unique proteins.

  6. Program Fair Evaluation--Summative Appraisal of Instructional Sequences with Dissimilar Objectives.

    ERIC Educational Resources Information Center

    Popham, W. James

    A comparative evaluation involving two instructional programs is given, although the approach can easily serve to compare more than two programs. The steps involved in conducting a program fair evaluation of two instructional programs are: (1) Identify objectives (a) common to both programs, (b) unique to one program, and (c) unique to the other…

  7. Phylogenomic analyses and molecular signatures for the class Halobacteria and its two major clades: a proposal for division of the class Halobacteria into an emended order Halobacteriales and two new orders, Haloferacales ord. nov. and Natrialbales ord. nov., containing the novel families Haloferacaceae fam. nov. and Natrialbaceae fam. nov.

    PubMed

    Gupta, Radhey S; Naushad, Sohail; Baker, Sheridan

    2015-03-01

    The Halobacteria constitute one of the largest groups within the Archaea. The hierarchical relationship among members of this large class, which comprises a single order and a single family, has proven difficult to determine based upon 16S rRNA gene trees and morphological and physiological characteristics. This work reports detailed phylogenetic and comparative genomic studies on >100 halobacterial (haloarchaeal) genomes containing representatives from 30 genera to investigate their evolutionary relationships. In phylogenetic trees reconstructed on the basis of 32 conserved proteins, using both neighbour-joining and maximum-likelihood methods, two major clades (clades A and B) encompassing nearly two-thirds of the sequenced haloarchaeal species were strongly supported. Clades grouping the same species/genera were also supported by the 16S rRNA gene trees and trees for several individual highly conserved proteins (RpoC, EF-Tu, UvrD, GyrA, EF-2/EF-G). In parallel, our comparative analyses of protein sequences from haloarchaeal genomes have identified numerous discrete molecular markers in the form of conserved signature indels (CSI) in protein sequences and conserved signature proteins (CSPs) that are found uniquely in specific groups of haloarchaea. Thirteen CSIs in proteins involved in diverse functions and 68 CSPs that are uniquely present in all or most genome-sequenced haloarchaea provide novel molecular means for distinguishing members of the class Halobacteria from all other prokaryotes. The members of clade A are distinguished from all other haloarchaea by the unique shared presence of two CSIs in the ribose operon protein and small GTP-binding protein and eight CSPs that are found specifically in members of this clade. Likewise, four CSIs in different proteins and five other CSPs are present uniquely in members of clade B and distinguish them from all other haloarchaea. 
Based upon their specific clustering in phylogenetic trees for different gene/protein sequences and the unique shared presence of large numbers of molecular signatures, members of clades A and B are indicated to be distinct from all other haloarchaea because of their uniquely shared evolutionary histories. Based upon these results, it is proposed that clades A and B be recognized as two new orders, Natrialbales ord. nov. and Haloferacales ord. nov., within the class Halobacteria, containing the novel families Natrialbaceae fam. nov. and Haloferacaceae fam. nov. Other members of the class Halobacteria that are not members of these two orders will remain part of the emended order Halobacteriales in an emended family Halobacteriaceae. © 2015 IUMS.

  8. A disruptive sequencer meets disruptive publishing.

    PubMed

    Loman, Nick; Goodwin, Sarah; Jansen, Hans; Loose, Matt

    2015-01-01

    Nanopore sequencing was recently made available to users in the form of the Oxford Nanopore MinION. Released to users through an early access programme, the MinION is made unique by its tiny form factor and ability to generate very long sequences from single DNA molecules. The platform is undergoing rapid evolution with three distinct nanopore types and five updates to library preparation chemistry in the last 18 months. To keep pace with the rapid evolution of this sequencing platform, and to provide a space where new analysis methods can be openly discussed, we present a new F1000Research channel devoted to updates to and analysis of nanopore sequence data.

  9. Sequence and structural implications of a bovine corneal keratan sulfate proteoglycan core protein. Protein 37B represents bovine lumican and proteins 37A and 25 are unique

    NASA Technical Reports Server (NTRS)

    Funderburgh, J. L.; Funderburgh, M. L.; Brown, S. J.; Vergnes, J. P.; Hassell, J. R.; Mann, M. M.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)

    1993-01-01

    Amino acid sequence from tryptic peptides of three different bovine corneal keratan sulfate proteoglycan (KSPG) core proteins (designated 37A, 37B, and 25) showed similarities to the sequence of a chicken KSPG core protein lumican. Bovine lumican cDNA was isolated from a bovine corneal expression library by screening with chicken lumican cDNA. The bovine cDNA codes for a 342-amino acid protein, M(r) 38,712, containing amino acid sequences identified in the 37B KSPG core protein. The bovine lumican is 68% identical to chicken lumican, with an 83% identity excluding the N-terminal 40 amino acids. The locations of 6 cysteine and 4 consensus N-glycosylation sites in the bovine sequence were identical to those in chicken lumican. Bovine lumican had about 50% identity to bovine fibromodulin and 20% identity to bovine decorin and biglycan. About two-thirds of the lumican protein consists of a series of 10 amino acid leucine-rich repeats that occur in regions of calculated high beta-hydrophobic moment, suggesting that the leucine-rich repeats contribute to beta-sheet formation in these proteins. Sequences obtained from 37A and 25 core proteins were absent in bovine lumican, thus predicting a unique primary structure and separate mRNA for each of the three bovine KSPG core proteins.

  10. Improved serial analysis of V1 ribosomal sequence tags (SARST-V1) provides a rapid, comprehensive, sequence-based characterization of bacterial diversity and community composition.

    PubMed

    Yu, Zhongtang; Yu, Marie; Morrison, Mark

    2006-04-01

    Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no less than 236 unique phylotypes (based on ≥95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.

  11. "Sequence space soup" of proteins and copolymers

    NASA Astrophysics Data System (ADS)

    Chan, Hue Sun; Dill, Ken A.

    1991-09-01

    To study the protein folding problem, we use exhaustive computer enumeration to explore "sequence space soup," an imaginary solution containing the "native" conformations (i.e., of lowest free energy) under folding conditions, of every possible copolymer sequence. The model is of short self-avoiding chains of hydrophobic (H) and polar (P) monomers configured on the two-dimensional square lattice. By exhaustive enumeration, we identify all native structures for every possible sequence. We find that random sequences of H/P copolymers will bear striking resemblance to known proteins: Most sequences under folding conditions will be approximately as compact as known proteins, will have considerable amounts of secondary structure, and it is most probable that an arbitrary sequence will fold to a number of lowest free energy conformations that is of order one. In these respects, this simple model shows that proteinlike behavior should arise simply in copolymers in which one monomer type is highly solvent averse. It suggests that the structures and uniquenesses of native proteins are not consequences of having 20 different monomer types, or of unique properties of amino acid monomers with regard to special packing or interactions, and thus that simple copolymers might be designable to collapse to proteinlike structures and properties. A good strategy for designing a sequence to have a minimum possible number of native states is to strategically insert many P monomers. Thus known proteins may be marginally stable due to a balance: More H residues stabilize the desired native state, but more P residues prevent simultaneous stabilization of undesired native states.
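
The exhaustive enumeration described in this abstract can be illustrated with a small sketch. It is not the authors' program. It assumes the usual HP-model scoring, in which the native conformations are those that maximize the number of H-H lattice contacts between monomers that are not chain neighbours; the abstract itself does not spell out the energy function. The function names are hypothetical.

```python
def conformations(n):
    """Yield self-avoiding walks of n monomers on the 2D square lattice.

    Rotations and reflections are removed by forcing the first step to be
    +x and the first step that leaves the x-axis to be +y.
    """
    def extend(path, occupied, turned):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            if len(path) == 1 and (dx, dy) != (1, 0):
                continue  # fix the direction of the first bond
            if not turned and (dx, dy) == (0, -1):
                continue  # first turn must go "up" (removes mirror images)
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # self-avoidance
            path.append(nxt)
            occupied.add(nxt)
            yield from extend(path, occupied, turned or dy != 0)
            occupied.remove(nxt)
            path.pop()

    yield from extend([(0, 0)], {(0, 0)}, False)


def hh_contacts(seq, path):
    """Count H-H pairs that are lattice neighbours but not chain neighbours."""
    index_at = {site: i for i, site in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Looking only in +x and +y counts each neighbouring pair once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                count += 1
    return count


def native_state_count(seq):
    """Return (maximum H-H contacts, number of conformations reaching it)."""
    best, degeneracy = -1, 0
    for path in conformations(len(seq)):
        c = hh_contacts(seq, path)
        if c > best:
            best, degeneracy = c, 1
        elif c == best:
            degeneracy += 1
    return best, degeneracy
```

Calling `native_state_count` on every H/P string of a given length gives the distribution of native-state degeneracies for that chain length under the assumed scoring. Run time grows exponentially with chain length, so the sketch is only practical for short chains.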

  12. Restricted transfer of learning between unimanual and bimanual finger sequences.

    PubMed

    Yokoi, Atsushi; Bai, Wenjun; Diedrichsen, Jörn

    2017-03-01

    When training bimanual skills, such as playing piano, people sometimes practice each hand separately and at a later stage combine the movements of the two hands. This poses the critical question of whether motor skills can be acquired by separately practicing each subcomponent or should be trained as a whole. In the present study, we addressed this question by training human subjects for 4 days in a unimanual or bimanual version of the discrete sequence production task. Both groups were then tested on trained and untrained sequences on both unimanual and bimanual versions of the task. Surprisingly, we found no evidence of transfer from trained unimanual to bimanual or from trained bimanual to unimanual sequences. In half the participants, we also investigated whether cuing the sequences on the left and right hand with unique letters would change transfer. With these cues, untrained sequences that shared some components with the trained sequences were performed more quickly than sequences that did not. However, the amount of this transfer was limited to ∼10% of the overall sequence-specific learning gains. These results suggest that unimanual and bimanual sequences are learned in separate representations. Making participants aware of the interrelationship between sequences can induce some transferrable component, although the main component of the skill remains unique to unimanual or bimanual execution. NEW & NOTEWORTHY Studies in reaching movement demonstrated that approximately half of motor learning can transfer across unimanual and bimanual contexts, suggesting that neural representations for unimanual and bimanual movements are fairly overlapping at the level of elementary movement. In this study, we show that little or no transfer occurred across unimanual and bimanual sequential finger movements. This result suggests that bimanual sequences are represented at a level of the motor hierarchy that integrates movements of both hands. 
Copyright © 2017 the American Physiological Society.

  13. De Novo Transcriptome Sequencing Reveals Important Molecular Networks and Metabolic Pathways of the Plant, Chlorophytum borivilianum

    PubMed Central

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
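
Two of the summary statistics named in this abstract, GC content and RPKM, follow standard definitions that can be sketched briefly. The functions below are illustrative only; the function names and the numbers in the usage note are hypothetical and are not taken from the study.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 10^9 / (transcript length in bp * total mapped reads),
    which normalizes a raw count by transcript length (per kb) and by
    sequencing depth (per million mapped reads).
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
```

For instance, a hypothetical 1,000 bp transcript with 100 mapped reads in a library of 1,000,000 mapped reads has an RPKM of 100.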

  14. Optimization of sequence alignment for simple sequence repeat regions.

    PubMed

    Jighly, Abdulqader; Hamwieh, Aladdin; Ogbonnaya, Francis C

    2011-07-20

    Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and are used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. They are also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than in other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the positions of the first and last SSR nucleotides. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic phylogenetic relationship.
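
The first step mentioned in this abstract, locating each repeated unit together with the start and end of its SSR run, can be sketched with a regular expression. This is an illustrative Python sketch, not the authors' PERL program; the function name and the default thresholds (units of up to six bases, at least three tandem copies) are assumptions.

```python
import re


def find_ssrs(seq, min_repeats=3, max_unit=6):
    """Locate tandem repeats of short units in a DNA sequence.

    Returns a list of (start, end, unit, copies) tuples, with end exclusive.
    The lazy quantifier makes the regex try the shortest unit first, so
    "ACACACAC" is reported as 4 x "AC" rather than 2 x "ACAC".
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits
```

Once the unit and the run boundaries are known for each sequence, an aligner could shift gaps in whole-unit steps, which is the general idea the abstract describes; that alignment step is not shown here.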

  15. The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.

    PubMed

    Anderson, Olin D; Huo, Naxin; Gu, Yong Q

    2013-06-01

    The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.

  16. De Novo transcriptome sequencing reveals important molecular networks and metabolic pathways of the plant, Chlorophytum borivilianum.

    PubMed

    Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir

    2013-01-01

    Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum.

  17. Surface display of a massively variable lipoprotein by a Legionella diversity-generating retroelement.

    PubMed

    Arambula, Diego; Wong, Wenge; Medhekar, Bob A; Guo, Huatao; Gingery, Mari; Czornyj, Elizabeth; Liu, Minghsun; Dey, Sanghamitra; Ghosh, Partho; Miller, Jeff F

    2013-05-14

    Diversity-generating retroelements (DGRs) are a unique family of retroelements that confer selective advantages to their hosts by facilitating localized DNA sequence evolution through a specialized error-prone reverse transcription process. We characterized a DGR in Legionella pneumophila, an opportunistic human pathogen that causes Legionnaires disease. The L. pneumophila DGR is found within a horizontally acquired genomic island, and it can theoretically generate 10^26 unique nucleotide sequences in its target gene, legionella determinant target A (ldtA), creating a repertoire of 10^19 distinct proteins. Expression of the L. pneumophila DGR resulted in transfer of DNA sequence information from a template repeat to a variable repeat (VR) accompanied by adenine-specific mutagenesis of progeny VRs at the 3' end of ldtA. ldtA encodes a twin-arginine translocated lipoprotein that is anchored in the outer leaflet of the outer membrane, with its C-terminal variable region surface exposed. Related DGRs were identified in L. pneumophila clinical isolates that encode unique target proteins with homologous VRs, demonstrating the adaptability of DGR components. This work characterizes a DGR that diversifies a bacterial protein and confirms the hypothesis that DGR-mediated mutagenic homing occurs through a conserved mechanism. Comparative bioinformatics predicts that surface display of massively variable proteins is a defining feature of a subset of bacterial DGRs.

  18. Methods for chromosome-specific staining

    DOEpatents

    Gray, Joe W.; Pinkel, Daniel

    1995-01-01

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogeneous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.

  19. Genetic Diversity of Bacterial Communities and Gene Transfer Agents in Northern South China Sea

    PubMed Central

    Sun, Fu-Lin; Wang, You-Shao; Wu, Mei-Lin; Jiang, Zhao-Yu; Sun, Cui-Ci; Cheng, Hao

    2014-01-01

    Pyrosequencing of the 16S ribosomal RNA gene (rDNA) amplicons was performed to investigate the unique distribution of bacterial communities in northern South China Sea (nSCS) and evaluate community structure and spatial differences of bacterial diversity. Cyanobacteria, Proteobacteria, Actinobacteria, and Bacteroidetes constitute the majority of bacteria. The taxonomic description of bacterial communities revealed that more Chroococcales, SAR11 clade, Acidimicrobiales, Rhodobacterales, and Flavobacteriales are present in the nSCS waters than other bacterial groups. Rhodobacterales were less abundant in tropical water (nSCS) than in temperate and cold waters. Furthermore, the diversity of Rhodobacterales based on the gene transfer agent (GTA) major capsid gene (g5) was investigated. Four g5 gene clone libraries were constructed from samples representing different regions and yielded diverse sequences. Fourteen g5 clusters could be identified among 197 nSCS clones. These clusters were also related to known g5 sequences derived from genome-sequenced Rhodobacterales. The composition of g5 sequences in surface water varied with the g5 sequences in the sampling sites; this result indicated that the Rhodobacterales population could be highly diverse in nSCS. Phylogenetic tree analysis result indicated distinguishable diversity patterns among tropical (nSCS), temperate, and cold waters, thereby supporting the niche adaptation of specific Rhodobacterales members in unique environments. PMID:25364820

  20. A novel approach for monitoring genetically engineered microorganisms by using artificial, stable RNAs

    NASA Technical Reports Server (NTRS)

    Pitulle, C.; Hedenstierna, K. O.; Fox, G. E.

    1995-01-01

    Further improvements in technology for efficient monitoring of genetically engineered microorganisms (GEMs) in the environment are needed. Technology for monitoring rRNA is well established but has not generally been applicable to GEMs because of the lack of unique rRNA target sequences. In the work described herein, it is demonstrated that a deletion mutant of a plasmid-borne Vibrio proteolyticus 5S rRNA gene continues to accumulate to high levels in Escherichia coli although it is no longer incorporated into 70S ribosomes. This deletion construct was subsequently modified by mutagenesis to create a unique recognition site for the restriction endonuclease BstEII, into which new sequences could be readily inserted. Finally, a novel 17-nucleotide identifier sequence from Pennisetum purpureum was embedded into the construct to create an RNA identification cassette. The artificial identifier RNA, expressed from this cassette in vivo, accumulated in E. coli to levels comparable to those of wild-type 5S rRNA without being seriously detrimental to cell survival in laboratory experiments and without entering the ribosomes. These results demonstrate that artificial, stable RNAs containing sequence segments remarkably different from those present in any known rRNA can be designed and that neither the deleted sequence segment nor ribosome incorporation is essential for accumulation of an RNA product.

  1. Bat Biology, Genomes, and the Bat1K Project: To Generate Chromosome-Level Genomes for All Living Bat Species.

    PubMed

    Teeling, Emma C; Vernes, Sonja C; Dávalos, Liliana M; Ray, David A; Gilbert, M Thomas P; Myers, Eugene

    2018-02-15

    Bats are unique among mammals, possessing some of the rarest mammalian adaptations, including true self-powered flight, laryngeal echolocation, exceptional longevity, unique immunity, contracted genomes, and vocal learning. They provide key ecosystem services, pollinating tropical plants, dispersing seeds, and controlling insect pest populations, thus driving healthy ecosystems. They account for more than 20% of all living mammalian diversity, and their crown-group evolutionary history dates back to the Eocene. Despite their great numbers and diversity, many species are threatened and endangered. Here we announce Bat1K, an initiative to sequence the genomes of all living bat species (n∼1,300) to chromosome-level assembly. The Bat1K genome consortium unites bat biologists (>148 members as of writing), computational scientists, conservation organizations, genome technologists, and any interested individuals committed to a better understanding of the genetic and evolutionary mechanisms that underlie the unique adaptations of bats. Our aim is to catalog the unique genetic diversity present in all living bats to better understand the molecular basis of their unique adaptations; uncover their evolutionary history; link genotype with phenotype; and ultimately better understand, promote, and conserve bats. Here we review the unique adaptations of bats and highlight how chromosome-level genome assemblies can uncover the molecular basis of these traits. We present a novel sequencing and assembly strategy and review the striking societal and scientific benefits that will result from the Bat1K initiative.

  2. Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis.

    PubMed

    Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F

    2016-01-01

    The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis, should improve our understanding of the role of type A Clostridium perfringens-associated disease in these animals. The current study presents the complete genome sequences of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using PacBio Single Molecule, Real-Time (SMRT) technology and Illumina HiSeq 2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains.
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.

  3. Molecular identification of Armillaria gallica from the Niobrara Valley Preserve in Nebraska

    Treesearch

    Mee-Sook Kim; Ned B. Klopfenstein

    2011-01-01

    Armillaria isolates were collected from a unique forest ecosystem in the Niobrara Valley Preserve in Nebraska, USA, which comprises a glacial and early postglacial refugium in the central plains of North America. The isolates were collected from diverse forest trees representing a unique mixture of forest types. Combined methods of rDNA sequencing and flow cytometric...

  4. Y and W Chromosome Assemblies: Approaches and Discoveries.

    PubMed

    Tomaszkiewicz, Marta; Medvedev, Paul; Makova, Kateryna D

    2017-04-01

    Hundreds of vertebrate genomes have been sequenced and assembled to date. However, most sequencing projects have ignored the sex chromosomes unique to the heterogametic sex - Y and W - that are known as sex-limited chromosomes (SLCs). Indeed, haploid and repetitive Y chromosomes in species with male heterogamety (XY), and W chromosomes in species with female heterogamety (ZW), are difficult to sequence and assemble. Nevertheless, obtaining their sequences is important for understanding the intricacies of vertebrate genome function and evolution. Recent progress has been made towards the adaptation of next-generation sequencing (NGS) techniques to deciphering SLC sequences. We review here currently available methodology and results with regard to SLC sequencing and assembly. We focus on vertebrates, but bring in some examples from other taxa. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. GENESUS: a two-step sequence design program for DNA nanostructure self-assembly.

    PubMed

    Tsutsumi, Takanobu; Asakawa, Takeshi; Kanegami, Akemi; Okada, Takao; Tahira, Tomoko; Hayashi, Kenshi

    2014-01-01

    DNA has been recognized as an ideal material for bottom-up construction of nanometer scale structures by self-assembly. The generation of sequences optimized for unique self-assembly (GENESUS) program reported here is a straightforward method for generating sets of strand sequences optimized for self-assembly of arbitrarily designed DNA nanostructures by a generate-candidates-and-choose-the-best strategy. A scalable procedure to prepare single-stranded DNA having arbitrary sequences is also presented. Strands for the assembly of various structures were designed and successfully constructed, validating both the program and the procedure.

  6. Genetic architecture of the Delis-Kaplan Executive Function System Trail Making Test: evidence for distinct genetic influences on executive function.

    PubMed

    Vasilopoulos, Terrie; Franz, Carol E; Panizzon, Matthew S; Xian, Hong; Grant, Michael D; Lyons, Michael J; Toomey, Rosemary; Jacobson, Kristen C; Kremen, William S

    2012-03-01

    To examine how genes and environments contribute to relationships among Trail Making Test (TMT) conditions and the extent to which these conditions have unique genetic and environmental influences. Participants included 1,237 middle-aged male twins from the Vietnam Era Twin Study of Aging. The Delis-Kaplan Executive Function System TMT included visual searching, number and letter sequencing, and set-shifting components. Phenotypic correlations among TMT conditions ranged from 0.29 to 0.60, and genes accounted for the majority (58-84%) of each correlation. Overall heritability ranged from 0.34 to 0.62 across conditions. Phenotypic factor analysis suggested a single factor. In contrast, genetic models revealed a single common genetic factor but also unique genetic influences separate from the common factor. Genetic variance (i.e., heritability) of number and letter sequencing was completely explained by the common genetic factor while unique genetic influences separate from the common factor accounted for 57% and 21% of the heritabilities of visual search and set shifting, respectively. After accounting for general cognitive ability, unique genetic influences accounted for 64% and 31% of those heritabilities. A common genetic factor, most likely representing a combination of speed and sequencing, accounted for most of the correlation among TMT 1-4. Distinct genetic factors, however, accounted for a portion of variance in visual scanning and set shifting. Thus, although traditional phenotypic shared variance analysis techniques suggest only one general factor underlying different neuropsychological functions in nonpatient populations, examining the genetic underpinnings of cognitive processes with twin analysis can uncover more complex etiological processes.

  7. The complete mitochondrial genome sequence of the maned wolf (Chrysocyon brachyurus).

    PubMed

    Zhao, Chao; Yang, Xiufeng; Zhang, Honghai; Zhang, Jin; Chen, Lei; Sha, Weilai; Liu, Guangshuai

    2016-01-01

    In this study, the complete mitochondrial genome of the maned wolf (Chrysocyon brachyurus), the only species in the genus Chrysocyon, was sequenced and reported for the first time using blood samples obtained from a female individual in Shanghai Zoo, China. Sequence analysis showed that the genome structure was in accordance with that of other Canidae species and that it contained a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region.

  8. Targeting of Repeated Sequences Unique to a Gene Results in Significant Increases in Antisense Oligonucleotide Potency

    PubMed Central

    Vickers, Timothy A.; Freier, Susan M.; Bui, Huynh-Hoa; Watt, Andrew; Crooke, Stanley T.

    2014-01-01

    A new strategy for identifying potent RNase H-dependent antisense oligonucleotides (ASOs) is presented. Our analysis of the human transcriptome revealed that a significant proportion of genes contain unique repeated sequences of 16 or more nucleotides in length. Activities of ASOs targeting these repeated sites in several representative genes were compared to those of ASOs targeting unique single sites in the same transcript. Antisense activity at repeated sites was also evaluated in a highly controlled minigene system. Targeting both native and minigene repeat sites resulted in significant increases in potency as compared to targeting of non-repeated sites. The increased potency at these sites is a result of increased frequency of ASO/RNA interactions which, in turn, increases the probability of a productive interaction between the ASO/RNA heteroduplex and human RNase H1 in the cell. These results suggest a new, highly efficient strategy for rapid identification of highly potent ASOs. PMID:25334092

  9. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

    PubMed

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information such as the total number of nucleotide bases and ATGC base contents with their respective percentages, and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available at http://www.cemb.edu.pk/sw.html. Abbreviations: RDNAnalyzer, Random DNA Analyser; GUI, graphical user interface; XAML, Extensible Application Markup Language.
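
The abstract above names the Nussinov dynamic programming algorithm as the basis of the structure prediction. As a rough illustration only, the textbook form of that recurrence (maximising the number of nested base pairs) can be sketched in Python as follows. This is not RDNAnalyzer's code, which is written in C# and extends the algorithm in ways the abstract does not describe; the function name, the Watson-Crick-only pairing rule and the `min_loop` parameter are assumptions made for this sketch.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in a DNA sequence.

    Textbook Nussinov recurrence. Assumptions for this sketch: only
    Watson-Crick pairs (A-T, G-C) count, and at least `min_loop`
    unpaired bases must separate the two partners of a pair.
    """
    pairs = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = max(dp[i + 1][j], dp[i][j - 1])  # leave i or j unpaired
            if (seq[i], seq[j]) in pairs:
                best = max(best, dp[i + 1][j - 1] + 1)  # pair i with j
            for k in range(i + 1, j):
                best = max(best, dp[i][k] + dp[k + 1][j])  # split in two
            dp[i][j] = best
    return dp[0][n - 1]
```

For example, `nussinov_max_pairs("GGGAAATCCC")` counts the three G-C pairs of a simple hairpin; recovering the pairing itself would need a traceback over the `dp` table, which is omitted here.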

  10. Lesion bypass activity of DNA polymerase θ (POLQ) is an intrinsic property of the pol domain and depends on unique sequence inserts.

    PubMed

    Hogg, Matthew; Seki, Mineaki; Wood, Richard D; Doublié, Sylvie; Wallace, Susan S

    2011-01-21

    DNA polymerase θ (POLQ, polθ) is a large, multidomain DNA polymerase encoded in higher eukaryotic genomes. It is important for maintaining genetic stability in cells and helping protect cells from DNA damage caused by ionizing radiation. POLQ contains an N-terminal helicase-like domain, a large central domain of indeterminate function, and a C-terminal polymerase domain with sequence similarity to the A-family of DNA polymerases. The enzyme has several unique properties, including low fidelity and the ability to insert and extend past abasic sites and thymine glycol lesions. It is not known whether the abasic site bypass activity is an intrinsic property of the polymerase domain or whether helicase activity is also required. Three "insertion" sequence elements present in POLQ are not found in any other A-family DNA polymerase, and it has been proposed that they may lend some unique properties to POLQ. Here, we analyzed the activity of the DNA polymerase in the absence of each sequence insertion. We found that the pol domain is capable of highly efficient bypass of abasic sites in the absence of the helicase-like or central domains. Insertion 1 increases the processivity of the polymerase but has little, if any, bearing on the translesion synthesis properties of the enzyme. However, removal of insertions 2 and 3 reduces activity on undamaged DNA and completely abrogates the ability of the enzyme to bypass abasic sites or thymine glycol lesions. Copyright © 2010 Elsevier Ltd. All rights reserved.

  11. Emergence and Evolution of Hominidae-Specific Coding and Noncoding Genomic Sequences.

    PubMed

    Saber, Morteza Mahmoudi; Adeyemi Babarinde, Isaac; Hettiarachchi, Nilmini; Saitou, Naruya

    2016-07-12

    Family Hominidae, which includes humans and great apes, is recognized for unique complex social behavior and intellectual abilities. Despite the increasing genome data, however, the genomic origin of its phenotypic uniqueness has remained elusive. Clade-specific genes and highly conserved noncoding sequences (HCNSs) are among the high-potential evolutionary candidates involved in driving clade-specific characters and phenotypes. On this premise, we analyzed whole genome sequences along with gene orthology data retrieved from major DNA databases to find Hominidae-specific (HS) genes and HCNSs. We discovered that Down syndrome critical region 4 (DSCR4) is the only experimentally verified gene uniquely present in Hominidae. DSCR4 has no structural homology to any known protein and was inferred to have emerged in several steps through LTR/ERV1, LTR/ERVL retrotransposition, and transversion. Using the genomic distance as neutral evolution threshold, we identified 1,658 HS HCNSs. Polymorphism coverage and derived allele frequency analysis of HS HCNSs showed that these HCNSs are under purifying selection, indicating that they may harbor important functions. They are overrepresented in promoters/untranslated regions, in close proximity of genes involved in sensory perception of sound and developmental process, and also showed a significantly lower nucleosome occupancy probability. Interestingly, many ancestral sequences of the HS HCNSs showed very high evolutionary rates. This suggests that new functions emerged through some kind of positive selection, and then purifying selection started to operate to keep these functions. © The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  12. The tmRNA website

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hudson, Corey M.; Williams, Kelly P.

    We report that the transfer-messenger RNA (tmRNA) and its partner protein SmpB act together in resolving problems arising when translating bacterial ribosomes reach the end of mRNA with no stop codon. Their genes have been found in nearly all bacterial genomes and in some organelles. The tmRNA Website serves tmRNA sequences, alignments and feature annotations, and has recently moved to http://bioinformatics.sandia.gov/tmrna/. New features include software used to find the sequences, an update raising the number of unique tmRNA sequences from 492 to 1716, and a database of SmpB sequences which are served along with the tmRNA sequence from the same organism.

  13. PCR detection of uncultured rumen bacteria.

    PubMed

    Rosero, Jaime A; Strosová, Lenka; Mrázek, Jakub; Fliegerová, Kateřina; Kopečný, Jan

    2012-07-01

    16S rRNA sequences of ruminal uncultured bacterial clones from public databases were phylogenetically examined. The sequences were found to form two unique clusters not affiliated with any known bacterial species: cluster of unidentified sequences of free floating rumen fluid uncultured bacteria (FUB) and cluster of unidentified sequences of bacteria associated with rumen epithelium (AUB). A set of PCR primers targeting 16S rRNA of ruminal free uncultured bacteria and rumen epithelium adhering uncultured bacteria was designed based on these sequences. FUB primers were used for relative quantification of uncultured bacteria in ovine rumen samples. The effort to increase the population size of FUB group has been successful in sulfate reducing broth and culture media supplied with cellulose.

  14. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE PAGES

    Williams, Kelly Porter; Lau, Britney Yan

    2016-10-28

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  15. RNAcentral: A comprehensive database of non-coding RNA sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Williams, Kelly Porter; Lau, Britney Yan

    RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.

  16. The tmRNA website

    DOE PAGES

    Hudson, Corey M.; Williams, Kelly P.

    2014-11-05

    We report that the transfer-messenger RNA (tmRNA) and its partner protein SmpB act together in resolving problems arising when translating bacterial ribosomes reach the end of mRNA with no stop codon. Their genes have been found in nearly all bacterial genomes and in some organelles. The tmRNA Website serves tmRNA sequences, alignments and feature annotations, and has recently moved to http://bioinformatics.sandia.gov/tmrna/. New features include software used to find the sequences, an update raising the number of unique tmRNA sequences from 492 to 1716, and a database of SmpB sequences which are served along with the tmRNA sequence from the same organism.

  17. Almost periodic solutions to difference equations

    NASA Technical Reports Server (NTRS)

    Bayliss, A.

    1975-01-01

    The theory of Massera and Schaeffer relating the existence of unique almost periodic solutions of an inhomogeneous linear equation to an exponential dichotomy for the homogeneous equation was completely extended to discretizations by a strongly stable difference scheme. In addition, it is shown that the almost periodic sequence solution will converge to the differential equation solution. The preceding theory was applied to a class of exponentially stable partial differential equations to which one can apply the Hille-Yosida theorem. It is possible to prove the existence of unique almost periodic solutions of the inhomogeneous equation (which can be approximated by almost periodic sequences) which are the solutions to appropriate discretizations. Two methods of discretization are discussed: the strongly stable scheme and the Lax-Wendroff scheme.

  18. Classification of Particle Numbers with Unique Heitmann-Radin Minimizer

    NASA Astrophysics Data System (ADS)

    De Luca, Lucia; Friesecke, Gero

    2017-06-01

    We show that minimizers of the Heitmann-Radin energy (Heitmann and Radin in J Stat Phys 22(3):281-287, 1980) are unique if and only if the particle number N belongs to an infinite sequence whose first thirty-five elements are 1, 2, 3, 4, 5, 7, 8, 10, 12, 14, 16, 19, 21, 24, 27, 30, 33, 37, 40, 44, 48, 52, 56, 61, 65, 70, 75, 80, 85, 91, 96, 102, 108, 114, 120 (see the paper for a closed-form description of this sequence). The proof relies on the discrete differential geometry techniques introduced in De Luca and Friesecke (Crystallization in two dimensions and a discrete Gauss-Bonnet Theorem, 2016).

  19. Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome

    PubMed Central

    Durham, Bryndan P.; Grote, Jana; Whittaker, Kerry A.; Bender, Sara J.; Luo, Haiwei; Grim, Sharon L.; Brown, Julia M.; Casey, John R.; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M.; Mine, Aric H.; Nigro, Olivia D.; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K.; Weber, Thomas S.; Wilson, Jesse M.; Church, Matthew J.; DeLong, Edward F.; Karl, David M.; Steward, Grieg F.; Eppley, John M.; Kyrpides, Nikos C.; Schuster, Stephan; Rappé, Michael S.

    2014-01-01

    Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters. PMID:25197450

  20. Draft genome sequence of marine alphaproteobacterial strain HIMB11, the first cultivated representative of a unique lineage within the Roseobacter clade possessing an unusually small genome.

    PubMed

    Durham, Bryndan P; Grote, Jana; Whittaker, Kerry A; Bender, Sara J; Luo, Haiwei; Grim, Sharon L; Brown, Julia M; Casey, John R; Dron, Antony; Florez-Leiva, Lennin; Krupke, Andreas; Luria, Catherine M; Mine, Aric H; Nigro, Olivia D; Pather, Santhiska; Talarmin, Agathe; Wear, Emma K; Weber, Thomas S; Wilson, Jesse M; Church, Matthew J; DeLong, Edward F; Karl, David M; Steward, Grieg F; Eppley, John M; Kyrpides, Nikos C; Schuster, Stephan; Rappé, Michael S

    2014-06-15

    Strain HIMB11 is a planktonic marine bacterium isolated from coastal seawater in Kaneohe Bay, Oahu, Hawaii belonging to the ubiquitous and versatile Roseobacter clade of the alphaproteobacterial family Rhodobacteraceae. Here we describe the preliminary characteristics of strain HIMB11, including annotation of the draft genome sequence and comparative genomic analysis with other members of the Roseobacter lineage. The 3,098,747 bp draft genome is arranged in 34 contigs and contains 3,183 protein-coding genes and 54 RNA genes. Phylogenomic and 16S rRNA gene analyses indicate that HIMB11 represents a unique sublineage within the Roseobacter clade. Comparison with other publicly available genome sequences from members of the Roseobacter lineage reveals that strain HIMB11 has the genomic potential to utilize a wide variety of energy sources (e.g. organic matter, reduced inorganic sulfur, light, carbon monoxide), while possessing a reduced number of substrate transporters.

  1. Ub-ISAP: a streamlined UNIX pipeline for mining unique viral vector integration sites from next generation sequencing data.

    PubMed

    Kamboj, Atul; Hallwirth, Claus V; Alexander, Ian E; McCowage, Geoffrey B; Kramer, Belinda

    2017-06-17

    The analysis of viral vector genomic integration sites is an important component in assessing the safety and efficiency of patient treatment using gene therapy. Alongside this clinical application, integration site identification is a key step in the genetic mapping of viral elements in mutagenesis screens that aim to elucidate gene function. We have developed a UNIX-based vector integration site analysis pipeline (Ub-ISAP) that automates integration site identification and annotation for both single and paired-end sequencing reads. Reads that contain viral sequences of interest are selected and aligned to the host genome, and unique integration sites are then classified as transcription start site-proximal, intragenic or intergenic. Ub-ISAP provides a reliable and efficient pipeline to generate large datasets for assessing the safety and efficiency of integrating vectors in clinical settings, with broader applications in cancer research. Ub-ISAP is available as an open source software package at https://sourceforge.net/projects/ub-isap/.

  2. Recurrence of 49-base decamers, nonomers, and octamers within mouse C mu gene of Ig heavy chain and its primordial building block.

    PubMed Central

    Yazaki, A; Ohno, S

    1983-01-01

    Within the published 2,168-base-long mouse C mu gene of Ig heavy chain consisting of four coding and four noncoding segments, 2 base decamers, 8 nonomers, and 39 octamers recurred. Recurring base heptamers (about 100) and hexamers (about 350) were simply too numerous to merit individual identification. In spite of extensive overlaps between these recurring base decamers to hexamers, they occupied nearly the entire length of mouse Ig C mu gene. As with other genes of the beta-sheet-forming beta 2-microglobulin family, the Ig C mu gene (flanking and intervening noncoding sequences included) is not a unique sequence but rather it is degenerate repeats of the 45-base-long primordial building-block sequence uniquely its own. This primordial building block must originally have specified the 15-amino-acid-residue-long primordial arm of beta-sheet-forming loops, the characteristics of the beta 2-microglobulin family of polypeptides. PMID:6403948

  3. Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413

    PubMed Central

    Vizcaíno, Juan Antonio; González, Francisco Javier; Suárez, M Belén; Redondo, José; Heinrich, Julian; Delgado-Jarana, Jesús; Hermosa, Rosa; Gutiérrez, Santiago; Monte, Enrique; Llobell, Antonio; Rey, Manuel

    2006-01-01

    Background: The filamentous fungus Trichoderma harzianum is used as a biological control agent of several plant-pathogenic fungi. In order to study the genome of this fungus, a functional genomics project called "TrichoEST" was developed to give insights into genes involved in biological control activities using an approach based on the generation of expressed sequence tags (ESTs). Results: Eight different cDNA libraries from T. harzianum strain CECT 2413 were constructed. Different growth conditions involving mainly different nutrient conditions and/or stresses were used. We here present the analysis of the 8,710 ESTs generated. A total of 3,478 unique sequences were identified, of which 81.4% had sequence similarity with GenBank entries, using the BLASTX algorithm. Using the Gene Ontology hierarchy, we performed the annotation of 51.1% of the unique sequences and compared its distribution among the gene libraries. Additionally, the InterProScan algorithm was used in order to further characterize the sequences. The identification of the putatively secreted proteins was also carried out. Later, based on the EST abundance, we examined the highly expressed genes and a hydrophobin was identified as the gene expressed at the highest level. We compared our collection of ESTs with the previous collections obtained from Trichoderma species and we also compared our sequence set with different complete eukaryotic genomes from several animals, plants and fungi. Accordingly, the presence of similar sequences in different kingdoms was also studied. Conclusion: This EST collection and its annotation provide a significant resource for basic and applied research on T. harzianum, a fungus of high biotechnological interest. PMID:16872539

  4. Evolutionary Dynamics on Protein Bi-stability Landscapes can Potentially Resolve Adaptive Conflicts

    PubMed Central

    Sikosek, Tobias; Bornberg-Bauer, Erich; Chan, Hue Sun

    2012-01-01

    Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed. PMID:23028272

  5. A compact, in vivo screen of all 6-mers reveals drivers of tissue-specific expression and guides synthetic regulatory element design.

    PubMed

    Smith, Robin P; Riesenfeld, Samantha J; Holloway, Alisha K; Li, Qiang; Murphy, Karl K; Feliciano, Natalie M; Orecchia, Lorenzo; Oksenberg, Nir; Pollard, Katherine S; Ahituv, Nadav

    2013-07-18

    Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
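
The abstract above refers to "all 4,096 unique 6-mers" and to a "reverse-complement aware" library design. As a small illustration of that vocabulary only, the following Python sketch enumerates every 6-mer (4^6 = 4,096) and groups each one with its reverse complement, since a double-stranded oligomer presents both. It does not reproduce the authors' de Bruijn graph decomposition; the function names and the choice of the lexicographically smaller member as the class representative are assumptions made for this sketch.

```python
from itertools import product

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def all_kmers(k):
    """Return every DNA k-mer over the alphabet A, C, G, T."""
    return ["".join(bases) for bases in product("ACGT", repeat=k)]


def canonical_kmers(k):
    """Return one representative per reverse-complement class.

    The representative is the lexicographically smaller of a k-mer
    and its reverse complement (a convention chosen for this sketch).
    """
    return {min(kmer, reverse_complement(kmer)) for kmer in all_kmers(k)}


if __name__ == "__main__":
    print(len(all_kmers(6)))        # 4096 six-mers in total
    print(len(canonical_kmers(6)))  # 2080 reverse-complement classes
```

The 2,080 classes arise because 64 of the 6-mers are their own reverse complement, leaving (4,096 - 64) / 2 = 2,016 two-member classes.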

  6. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

    PubMed Central

    2014-01-01

    Background: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied. PMID:24647006
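
The abstract above reports an "N50 scaffold size" without defining it. Under the usual definition, N50 is the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the total assembly length. A minimal Python sketch of that definition follows; the function name and the toy lengths are made up for illustration and are not taken from the loblolly pine assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the
    length of the scaffold at which the cumulative sum first reaches
    at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example: total length 54, half is 27, and 10 + 9 + 8 = 27.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```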

  7. Engineered proteins with PUF scaffold to manipulate RNA metabolism

    PubMed Central

    Wang, Yang; Wang, Zefeng; Tanaka Hall, Traci M.

    2013-01-01

    Pumilio/fem-3 mRNA binding factor (FBF) proteins are characterized by a sequence-specific RNA-binding domain. This unique single-stranded RNA recognition module, whose sequence specificity can be reprogrammed, has been fused with functional modules to engineer protein factors with various functions. Here we summarize the advancement in developing RNA regulatory tools and opportunities for the future. PMID:23731364

  8. Methods for chromosome-specific staining

    DOEpatents

    Gray, J.W.; Pinkel, D.

    1995-09-05

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogeneous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include ways for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes. 3 figs.

  9. Methods and compositions for chromosome-specific staining

    DOEpatents

    Gray, Joe W.; Pinkel, Daniel

    2003-07-22

    Methods and compositions for chromosome-specific staining are provided. Compositions comprise heterogenous mixtures of labeled nucleic acid fragments having substantially complementary base sequences to unique sequence regions of the chromosomal DNA for which their associated staining reagent is specific. Methods include methods for making the chromosome-specific staining compositions of the invention, and methods for applying the staining compositions to chromosomes.

  10. Evaluation of DNA Binding Drugs as Inhibitors of ESX, and ETS Domain Transcription Factor Associated With Breast Cancer: Effects of ESX/DNA Complex Disruption

    DTIC Science & Technology

    2000-08-01

    4). Sequence recognition of all four DNA bases is achieved by positioning an N-methylimidazole opposite guanine or N-methylpyrrole opposite...unique sequences of DNA based upon selective binding motifs to all four DNA bases, although relatively little is known about the ability of these agents to

  11. Metagenomic Evaluation of Bacterial and Archaeal Diversity in the Geothermal Hot Springs of Manikaran, India

    PubMed Central

    Pathak, Ashish; Green, Stefan J.; Joshi, Amit; Chauhan, Ashvini

    2015-01-01

    Bacterial and archaeal diversity in geothermal spring water were investigated using 16S rRNA gene amplicon metagenomic sequencing. This revealed the dominance of Firmicutes, Aquificae, and the Deinococcus-Thermus group in this thermophilic environment. A number of sequences remained taxonomically unresolved, indicating the presence of potentially novel microbes in this unique habitat. PMID:25700403

  12. Draft Genome Sequence of Corynebacterium kefirresidentii SB, Isolated from Kefir.

    PubMed

    Blasche, Sonja; Kim, Yongkyu; Patil, Kiran R

    2017-09-14

    The genus Corynebacterium includes Gram-positive species with a high G+C content. We report here a novel species, Corynebacterium kefirresidentii SB, isolated from kefir grains collected in Germany. Its draft genome sequence was remarkably dissimilar (average nucleotide identity, 76.54%) to those of other Corynebacterium spp., confirming that this is a unique novel species. Copyright © 2017 Blasche et al.

  13. Wheat EST resources for functional genomics of abiotic stress

    PubMed Central

    Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey

    2006-01-01

    Background: Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results: We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion: We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. PMID:16772040

  14. Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation

    PubMed Central

    Wegrzyn, Jill L.; Liechty, John D.; Stevens, Kristian A.; Wu, Le-Shin; Loopstra, Carol A.; Vasquez-Gross, Hans A.; Dougherty, William M.; Lin, Brian Y.; Zieve, Jacob J.; Martínez-García, Pedro J.; Holt, Carson; Yandell, Mark; Zimin, Aleksey V.; Yorke, James A.; Crepeau, Marc W.; Puiu, Daniela; Salzberg, Steven L.; de Jong, Pieter J.; Mockaitis, Keithanne; Main, Doreen; Langley, Charles H.; Neale, David B.

    2014-01-01

    The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%. PMID:24653211

  15. TCRmodel: high resolution modeling of T cell receptors from sequence.

    PubMed

    Gowthaman, Ragul; Pierce, Brian G

    2018-05-22

    T cell receptors (TCRs), along with antibodies, are responsible for specific antigen recognition in the adaptive immune response, and millions of unique TCRs are estimated to be present in each individual. Understanding the structural basis of TCR targeting has implications in vaccine design, autoimmunity, as well as T cell therapies for cancer. Given advances in deep sequencing leading to immune repertoire-level TCR sequence data, fast and accurate modeling methods are needed to elucidate shared and unique 3D structural features of these molecules which lead to their antigen targeting and cross-reactivity. We developed a new algorithm in the program Rosetta to model TCRs from sequence, and implemented this functionality in a web server, TCRmodel. This web server provides an easy to use interface, and models are generated quickly that users can investigate in the browser and download. Benchmarking of this method using a set of nonredundant recently released TCR crystal structures shows that models are accurate and compare favorably to models from another available modeling method. This server enables the community to obtain insights into TCRs of interest, and can be combined with methods to model and design TCR recognition of antigens. The TCRmodel server is available at: http://tcrmodel.ibbr.umd.edu/.

  16. Comparative sequence analyses on the 16S rRNA (rDNA) of Bacillus acidocaldarius, Bacillus acidoterrestris, and Bacillus cycloheptanicus and proposal for creation of a new genus, Alicyclobacillus gen. nov

    NASA Technical Reports Server (NTRS)

    Wisotzkey, J. D.; Jurtshuk, P. Jr; Fox, G. E.; Deinhard, G.; Poralla, K.

    1992-01-01

    Comparative 16S rRNA (rDNA) sequence analyses performed on the thermophilic Bacillus species Bacillus acidocaldarius, Bacillus acidoterrestris, and Bacillus cycloheptanicus revealed that these organisms are sufficiently different from the traditional Bacillus species to warrant reclassification in a new genus, Alicyclobacillus gen. nov. An analysis of 16S rRNA sequences established that these three thermoacidophiles cluster in a group that differs markedly from both the obligately thermophilic organisms Bacillus stearothermophilus and the facultatively thermophilic organism Bacillus coagulans, as well as many other common mesophilic and thermophilic Bacillus species. The thermoacidophilic Bacillus species B. acidocaldarius, B. acidoterrestris, and B. cycloheptanicus also are unique in that they possess omega-alicyclic fatty acid as the major natural membranous lipid component, which is a rare phenotype that has not been found in any other Bacillus species characterized to date. This phenotype, along with the 16S rRNA sequence data, suggests that these thermoacidophiles are biochemically and genetically unique and supports the proposal that they should be reclassified in the new genus Alicyclobacillus.

  17. Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination.

    PubMed

    Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo

    2015-01-01

    Although rare, adverse events may be associated with anti-poliovirus vaccination, possibly hampering global polio eradication. The aim of this study was to design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination.
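
    The pentapeptide-level screen described in this record can be illustrated with a minimal sketch. It assumes that "zero percent identity" means a viral 5-mer has no exact match anywhere in the host sequences; the function names and the toy sequences are hypothetical and are not taken from the study, which also applied immunologic and conservation criteria not modelled here.

```python
def kmers(seq, k=5):
    """Return the set of all overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_pentapeptides(viral_seq, host_seqs, k=5):
    """List viral k-mers, in sequence order, that occur in no host sequence."""
    host = set()
    for s in host_seqs:
        host |= kmers(s, k)
    seen, result = set(), []
    for i in range(len(viral_seq) - k + 1):
        pep = viral_seq[i:i + k]
        if pep not in host and pep not in seen:
            seen.add(pep)
            result.append(pep)
    return result


# Toy sequences invented for illustration only (not real proteome data).
viral = "MGAQVSSQKV"
host = ["AAMGAQVTT", "QKVLL"]
print(unique_pentapeptides(viral, host))
```

    On real data the host set would be built once from the whole proteome and reused for every viral protein.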

  18. In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library

    PubMed Central

    Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul

    2005-01-01

    The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm against the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high-homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system.
Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642

  19. Advanced Applications of Next-Generation Sequencing Technologies to Orchid Biology.

    PubMed

    Yeh, Chuan-Ming; Liu, Zhong-Jian; Tsai, Wen-Chieh

    2018-01-01

    Next-generation sequencing technologies are revolutionizing biology by permitting transcriptome sequencing, whole-genome sequencing and resequencing, and genome-wide single nucleotide polymorphism profiling. Orchid research has benefited from this breakthrough, and a few orchid genomes are now available; new biological questions can be approached and new breeding strategies can be designed. The first part of this review describes the unique features of orchid biology. The second part provides an overview of the current next-generation sequencing platforms, many of which are already used in plant laboratories. The third part summarizes the state of orchid transcriptome and genome sequencing and illustrates current achievements. The genetic sequences currently obtained will not only provide a broad scope for the study of orchid biology, but also serve as a starting point for uncovering the mystery of orchid evolution.

  20. Multiplex analysis of DNA

    DOEpatents

    Church, George M.; Kieffer-Higgins, Stephen

    1992-01-01

    This invention features vectors and a method for sequencing DNA. The method includes the steps of: a) ligating the DNA into a vector comprising a tag sequence, the tag sequence includes at least 15 bases, wherein the tag sequence will not hybridize to the DNA under stringent hybridization conditions and is unique in the vector, to form a hybrid vector, b) treating the hybrid vector in a plurality of vessels to produce fragments comprising the tag sequence, wherein the fragments differ in length and terminate at a fixed known base or bases, wherein the fixed known base or bases differs in each vessel, c) separating the fragments from each vessel according to their size, d) hybridizing the fragments with an oligonucleotide able to hybridize specifically with the tag sequence, and e) detecting the pattern of hybridization of the tag sequence, wherein the pattern reflects the nucleotide sequence of the DNA.

  1. Molecular characterization of Taenia multiceps isolates from Gansu Province, China by sequencing of mitochondrial cytochrome C oxidase subunit 1.

    PubMed

    Li, Wen Hui; Jia, Wan Zhong; Qu, Zi Gang; Xie, Zhi Zhou; Luo, Jian Xun; Yin, Hong; Sun, Xiao Lin; Blaga, Radu; Fu, Bao Quan

    2013-04-01

    A total of 16 Taenia multiceps isolates collected from naturally infected sheep or goats in Gansu Province, China were characterized by sequences of mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The complete cox1 gene was amplified for individual T. multiceps isolates by PCR, ligated to pMD18T vector, and sequenced. Sequence analysis indicated that out of 16 T. multiceps isolates 10 unique cox1 gene sequences of 1,623 bp were obtained with sequence variation of 0.12-0.68%. The results showed that the cox1 gene sequences were highly conserved among the examined T. multiceps isolates. However, they were quite different from those of the other Taenia species. Phylogenetic analysis based on complete cox1 gene sequences revealed that T. multiceps isolates were composed of 3 genotypes and distinguished from the other Taenia species.

  2. Molecular Characterization of Taenia multiceps Isolates from Gansu Province, China by Sequencing of Mitochondrial Cytochrome C Oxidase Subunit 1

    PubMed Central

    Li, Wen Hui; Jia, Wan Zhong; Qu, Zi Gang; Xie, Zhi Zhou; Luo, Jian Xun; Yin, Hong; Sun, Xiao Lin; Blaga, Radu

    2013-01-01

    A total of 16 Taenia multiceps isolates collected from naturally infected sheep or goats in Gansu Province, China were characterized by sequences of mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The complete cox1 gene was amplified for individual T. multiceps isolates by PCR, ligated to pMD18T vector, and sequenced. Sequence analysis indicated that out of 16 T. multiceps isolates 10 unique cox1 gene sequences of 1,623 bp were obtained with sequence variation of 0.12-0.68%. The results showed that the cox1 gene sequences were highly conserved among the examined T. multiceps isolates. However, they were quite different from those of the other Taenia species. Phylogenetic analysis based on complete cox1 gene sequences revealed that T. multiceps isolates were composed of 3 genotypes and distinguished from the other Taenia species. PMID:23710087

  3. Genome Sequence of Candidatus Nitrososphaera evergladensis from Group I.1b Enriched from Everglades Soil Reveals Novel Genomic Features of the Ammonia-Oxidizing Archaea

    PubMed Central

    Zhalnina, Kateryna V.; Dias, Raquel; Leonard, Michael T.; Dorr de Quadros, Patricia; Camargo, Flavio A. O.; Drew, Jennifer C.; Farmerie, William G.; Daroub, Samira H.; Triplett, Eric W.

    2014-01-01

    The activity of ammonia-oxidizing archaea (AOA) leads to the loss of nitrogen from soil, pollution of water sources and elevated emissions of greenhouse gas. To date, eight AOA genomes are available in the public databases, seven are from the group I.1a of the Thaumarchaeota and only one is from the group I.1b, isolated from hot springs. Many soils are dominated by AOA from the group I.1b, but the genomes of soil representatives of this group have not been sequenced and functionally characterized. The lack of knowledge of metabolic pathways of soil AOA presents a critical gap in understanding their role in biogeochemical cycles. Here, we describe the first complete genome of soil archaeon Candidatus Nitrososphaera evergladensis, which has been reconstructed from metagenomic sequencing of a highly enriched culture obtained from an agricultural soil. The AOA enrichment was sequenced with the high throughput next generation sequencing platforms from Pacific Biosciences and Ion Torrent. The de novo assembly of sequences resulted in one 2.95 Mb contig. Annotation of the reconstructed genome revealed many similarities of the basic metabolism with the rest of sequenced AOA. Ca. N. evergladensis belongs to the group I.1b and shares only 40% of whole-genome homology with the closest sequenced relative Ca. N. gargensis. Detailed analysis of the genome revealed coding sequences that were completely absent from the group I.1a. These unique sequences code for proteins involved in control of DNA integrity, transporters, two-component systems and versatile CRISPR defense system. Notably, genomes from the group I.1b have more gene duplications compared to the genomes from the group I.1a. We suggest that the presence of these unique genes and gene duplications may be associated with the environmental versatility of this group. PMID:24999826

  4. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform

    PubMed Central

    Van Nostrand, Joy D.; Ning, Daliang; Sun, Bo; Xue, Kai; Liu, Feifei; Deng, Ye; Liang, Yuting; Zhou, Jizhong

    2017-01-01

    Illumina’s MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1–3%, p<0.001). Increasing the sequencing depth to 160,000 reads by deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined. 
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates, removing spurious sequences and unrepresentative OTUs, using a clustering method with a high stringency for OTU generation, estimating treatment effects at higher taxonomic levels, and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low abundance rare species all can increase reproducibility. PMID:28453559
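
    The two overlap measures reported in this record can be sketched as follows. The abstract does not give the exact formulas, so this sketch assumes that unweighted overlap is the number of shared OTUs divided by the number of OTUs seen in either replicate, and that abundance-weighted overlap is the fraction of all sequences that fall in shared OTUs. The function names and toy counts are hypothetical, and singletons are treated here simply as OTUs with a single sequence in a replicate.

```python
def present(counts):
    """OTUs observed at least once in a replicate."""
    return {otu for otu, n in counts.items() if n > 0}


def otu_overlap(rep_a, rep_b):
    """Return (unweighted, abundance-weighted) overlap of two non-empty replicates."""
    a, b = present(rep_a), present(rep_b)
    shared, union = a & b, a | b
    unweighted = len(shared) / len(union)
    total = sum(rep_a.values()) + sum(rep_b.values())
    in_shared = sum(rep_a[o] + rep_b[o] for o in shared)
    return unweighted, in_shared / total


def drop_singletons(counts):
    """Remove OTUs represented by a single sequence (assumed definition)."""
    return {otu: n for otu, n in counts.items() if n > 1}


# Toy OTU tables (OTU id -> sequence count), invented for illustration.
rep_a = {"otu1": 50, "otu2": 30, "otu3": 1}
rep_b = {"otu1": 45, "otu2": 35, "otu4": 1}

print(otu_overlap(rep_a, rep_b))
print(otu_overlap(drop_singletons(rep_a), drop_singletons(rep_b)))
```

    In the toy tables the rare OTUs are the ones that fail to replicate, which is why the weighted measure is much higher than the unweighted one and why removing singletons raises the overlap.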

  5. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wen, Chongqing; Wu, Liyou; Qin, Yujia

    Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, p<0.001). Increasing the sequencing depth to 160,000 reads by deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined.
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates, removing spurious sequences and unrepresentative OTUs, using a clustering method with a high stringency for OTU generation, estimating treatment effects at higher taxonomic levels, and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low abundance rare species all can increase reproducibility.

  6. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform

    DOE PAGES

    Wen, Chongqing; Wu, Liyou; Qin, Yujia; ...

    2017-04-28

    Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, p<0.001). Increasing the sequencing depth to 160,000 reads by deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined.
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates, removing spurious sequences and unrepresentative OTUs, using a clustering method with a high stringency for OTU generation, estimating treatment effects at higher taxonomic levels, and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low abundance rare species all can increase reproducibility.

  7. Evaluation of the reproducibility of amplicon sequencing with Illumina MiSeq platform.

    PubMed

    Wen, Chongqing; Wu, Liyou; Qin, Yujia; Van Nostrand, Joy D; Ning, Daliang; Sun, Bo; Xue, Kai; Liu, Feifei; Deng, Ye; Liang, Yuting; Zhou, Jizhong

    2017-01-01

    Illumina's MiSeq has become the dominant platform for gene amplicon sequencing in microbial ecology studies; however, various technical concerns, such as reproducibility, still exist. To assess reproducibility, 16S rRNA gene amplicons from 18 soil samples of a reciprocal transplantation experiment were sequenced on an Illumina MiSeq. The V4 region of 16S rRNA gene from each sample was sequenced in triplicate with each replicate having a unique barcode. The average OTU overlap, without considering sequence abundance, at a rarefaction level of 10,323 sequences was 33.4±2.1% and 20.2±1.7% between two and among three technical replicates, respectively. When OTU sequence abundance was considered, the average sequence abundance weighted OTU overlap was 85.6±1.6% and 81.2±2.1% for two and three replicates, respectively. Removing singletons significantly increased the overlap for both (~1-3%, p<0.001). Increasing the sequencing depth to 160,000 reads by deep sequencing increased OTU overlap both when sequence abundance was considered (95%) and when not (44%). However, if singletons were not removed the overlap between two technical replicates (not considering sequence abundance) plateaus at 39% with 30,000 sequences. Diversity measures were not affected by the low overlap as α-diversities were similar among technical replicates while β-diversities (Bray-Curtis) were much smaller among technical replicates than among treatment replicates (e.g., 0.269 vs. 0.374). Higher diversity coverage, but lower OTU overlap, was observed when replicates were sequenced in separate runs. Detrended correspondence analysis indicated that while there was considerable variation among technical replicates, the reproducibility was sufficient for detecting treatment effects for the samples examined. 
These results suggest that although there is variation among technical replicates, amplicon sequencing on MiSeq is useful for analyzing microbial community structure if used appropriately and with caution. For example, including technical replicates, removing spurious sequences and unrepresentative OTUs, using a clustering method with a high stringency for OTU generation, estimating treatment effects at higher taxonomic levels, and adapting the unique molecular identifier (UMI) and other newly developed methods to lower PCR and sequencing error and to identify true low abundance rare species all can increase reproducibility.

  8. Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.

    PubMed

    Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A

    2011-04-08

    To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.

  9. On the joint spectral density of bivariate random sequences. Thesis Technical Report No. 21

    NASA Technical Reports Server (NTRS)

    Aalfs, David D.

    1995-01-01

    For univariate random sequences, the power spectral density acts like a probability density function of the frequencies present in the sequence. This dissertation extends that concept to bivariate random sequences. For this purpose, a function called the joint spectral density is defined that represents a joint probability weighing of the frequency content of pairs of random sequences. Given a pair of random sequences, the joint spectral density is not uniquely determined in the absence of any constraints. Two approaches to constraining the sequences are suggested: (1) assume the sequences are the margins of some stationary random field, (2) assume the sequences conform to a particular model that is linked to the joint spectral density. For both approaches, the properties of the resulting sequences are investigated in some detail, and simulation is used to corroborate theoretical results. It is concluded that under either of these two constraints, the joint spectral density can be computed from the non-stationary cross-correlation.

  10. Applications of Single-Cell Sequencing for Multiomics.

    PubMed

    Xu, Yungang; Zhou, Xiaobo

    2018-01-01

    Single-cell sequencing interrogates the sequence or chromatin information from individual cells with advanced next-generation sequencing technologies. It provides a higher resolution of cellular differences and a better understanding of the underlying genetic and epigenetic mechanisms of an individual cell in the context of its survival and adaptation to microenvironment. However, it is more challenging to perform single-cell sequencing and downstream data analysis, owing to the minimal amount of starting materials, sample loss, and contamination. In addition, due to the picogram level of the amount of nucleic acids used, heavy amplification is often needed during sample preparation of single-cell sequencing, resulting in the uneven coverage, noise, and inaccurate quantification of sequencing data. All these unique properties raise challenges in and thus high demands for computational methods that specifically fit single-cell sequencing data. We here comprehensively survey the current strategies and challenges for multiple single-cell sequencing, including single-cell transcriptome, genome, and epigenome, beginning with a brief introduction to multiple sequencing techniques for single cells.

  11. Molecular structures of centromeric heterochromatin and karyotypic evolution in the Siamese crocodile (Crocodylus siamensis) (Crocodylidae, Crocodylia).

    PubMed

    Kawagoshi, Taiki; Nishida, Chizuko; Ota, Hidetoshi; Kumazawa, Yoshinori; Endo, Hideki; Matsuda, Yoichi

    2008-01-01

    Crocodilians have several unique karyotypic features, such as small diploid chromosome numbers (30-42) and the absence of dot-shaped microchromosomes. Of the extant crocodilian species, the Siamese crocodile (Crocodylus siamensis) has no more than 2n = 30, comprising mostly bi-armed chromosomes with large centromeric heterochromatin blocks. To investigate the molecular structures of C-heterochromatin and genomic compartmentalization in the karyotype, characterized by the disappearance of tiny microchromosomes and reduced chromosome number, we performed molecular cloning of centromeric repetitive sequences and chromosome mapping of the 18S-28S rDNA and telomeric (TTAGGG)n sequences. The centromeric heterochromatin was composed mainly of two repetitive sequence families whose characteristics were quite different. Two types of GC-rich CSI-HindIII family sequences, the 305 bp CSI-HindIII-S (G+C content, 61.3%) and 424 bp CSI-HindIII-M (63.1%), were localized to the intensely PI-stained centric regions of all chromosomes, except for chromosome 2 with PI-negative heterochromatin. The 94 bp CSI-DraI (G+C content, 48.9%) was tandem-arrayed satellite DNA and localized to chromosome 2 and four pairs of small-sized chromosomes. The chromosomal size-dependent genomic compartmentalization that is supposedly unique to the Archosauromorpha was probably lost in the crocodilian lineage with the disappearance of microchromosomes followed by the homogenization of centromeric repetitive sequences between chromosomes, except for chromosome 2.

  12. Analysis of secreted proteins from Aspergillus flavus.

    PubMed

    Medina, Martha L; Haynes, Paul A; Breci, Linda; Francisco, Wilson A

    2005-08-01

    MS/MS techniques in proteomics make possible the identification of proteins from organisms with little or no genome sequence information available. Peptide sequences are obtained from tandem mass spectra by matching peptide mass and fragmentation information to protein sequence information from related organisms, including unannotated genome sequence data. This peptide identification data can then be grouped and reconstructed into protein data. In this study, we have used this approach to study protein secretion by Aspergillus flavus, a filamentous fungus for which very little genome sequence information is available. A. flavus is capable of degrading the flavonoid rutin (quercetin 3-O-glycoside), as the only source of carbon via an extracellular enzyme system. In this continuing study, a proteomic analysis was used to identify secreted proteins from A. flavus when grown on rutin. The growth media glucose and potato dextrose were used to identify differentially expressed secreted proteins. The secreted proteins were analyzed by 1- and 2-DE and MS/MS. A total of 51 unique A. flavus secreted proteins were identified from the three growth conditions. Ten proteins were unique to rutin-, five to glucose- and one to potato dextrose-grown A. flavus. Sixteen secreted proteins were common to all three media. Fourteen identifications were of hypothetical proteins or proteins of unknown functions. To our knowledge, this is the first extensive proteomic study conducted to identify the secreted proteins from a filamentous fungus.

  13. Comparison of methanogen diversity of yak (Bos grunniens) and cattle (Bos taurus) from the Qinghai-Tibetan plateau, China.

    PubMed

    Huang, Xiao Dan; Tan, Hui Yin; Long, Ruijun; Liang, Juan Boo; Wright, André-Denis G

    2012-10-19

    Methane emissions by methanogens from livestock ruminants have contributed significantly to the agricultural greenhouse gas effect. It is worthwhile to compare methanogens from an "energy-saving" animal (yak) and a normal animal (cattle) in order to investigate the link between methanogen structure and low methane production. Diversity of methanogens from the yak and cattle rumen was investigated by analysis of 16S rRNA gene sequences from rumen digesta samples from four yaks (209 clones) and four cattle (205 clones) from the Qinghai-Tibetan Plateau area (QTP). Overall, a total of 414 clones (i.e. sequences) were examined and assigned to 95 operational taxonomic units (OTUs) using MOTHUR, based upon a 98% species-level identity criterion. Forty-six OTUs were unique to the yak clone library and 34 OTUs were unique to the cattle clone library, while 15 OTUs were found in both libraries. Of the 95 OTUs, 93 putative new species were identified. Sequences belonging to the Thermoplasmatales-affiliated Lineage C (TALC) were found to dominate in both libraries, accounting for 80.9% and 62.9% of the sequences from the yak and cattle clone libraries, respectively. Sequences belonging to the Methanobacteriales represented the second largest clade in both libraries. However, Methanobrevibacter wolinii (QTPC 110) was only found in the cattle library. The number of clones from the order Methanomicrobiales was greater in the cattle than in the yak clone library. Although the Shannon index value indicated similar diversity between the two libraries, the Libshuff analysis indicated that the methanogen community structure of the yak was significantly different from that of cattle. This study revealed for the first time the molecular diversity of the methanogen community in yaks and cattle in the Qinghai-Tibetan Plateau area in China. From the analysis, we conclude that yaks have a unique rumen microbial ecosystem that is significantly different from that of cattle; this may also help to explain why yaks produce less methane than cattle.
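
    The OTU bookkeeping and the Shannon index mentioned in this record can be illustrated with a short Python sketch. The OTU counts (46, 34, 15) are taken from the abstract; the clone abundances passed to `shannon_index` are made-up illustrative values, not data from the study, and the function name is hypothetical.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# OTU counts reported in the abstract.
yak_only, cattle_only, shared = 46, 34, 15
total_otus = yak_only + cattle_only + shared   # all OTUs across both libraries
yak_otus = yak_only + shared                   # OTUs seen in the yak library
cattle_otus = cattle_only + shared             # OTUs seen in the cattle library

# Hypothetical clone abundances (NOT from the study): an even community
# and a community dominated by a single OTU.
even_library = [10, 10, 10, 10]
skewed_library = [37, 1, 1, 1]

print(total_otus, yak_otus, cattle_otus)
print(shannon_index(even_library), shannon_index(skewed_library))
```

    For a perfectly even community of four OTUs the index equals ln 4, and it falls as one OTU comes to dominate. Two libraries can still have similar index values while differing in which OTUs they contain, which is the distinction the abstract draws between the Shannon index and the Libshuff comparison.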

  14. Comparison of methanogen diversity of yak (Bos grunniens) and cattle (Bos taurus) from the Qinghai-Tibetan plateau, China

    PubMed Central

    2012-01-01

    Background Methane emissions by methanogens from livestock ruminants have contributed significantly to the agricultural greenhouse gas effect. It is worthwhile to compare methanogens from an “energy-saving” animal (yak) and a normal animal (cattle) in order to investigate the link between methanogen structure and low methane production. Results Diversity of methanogens from the yak and cattle rumen was investigated by analysis of 16S rRNA gene sequences from rumen digesta samples from four yaks (209 clones) and four cattle (205 clones) from the Qinghai-Tibetan Plateau area (QTP). Overall, a total of 414 clones (i.e. sequences) were examined and assigned to 95 operational taxonomic units (OTUs) using MOTHUR, based upon a 98% species-level identity criterion. Forty-six OTUs were unique to the yak clone library and 34 OTUs were unique to the cattle clone library, while 15 OTUs were found in both libraries. Of the 95 OTUs, 93 putative new species were identified. Sequences belonging to the Thermoplasmatales-affiliated Lineage C (TALC) were found to dominate in both libraries, accounting for 80.9% and 62.9% of the sequences from the yak and cattle clone libraries, respectively. Sequences belonging to the Methanobacteriales represented the second largest clade in both libraries. However, Methanobrevibacter wolinii (QTPC 110) was only found in the cattle library. The number of clones from the order Methanomicrobiales was greater in the cattle than in the yak clone library. Although the Shannon index value indicated similar diversity between the two libraries, the Libshuff analysis indicated that the methanogen community structure of the yak was significantly different from that of cattle. Conclusion This study revealed for the first time the molecular diversity of the methanogen community in yaks and cattle in the Qinghai-Tibetan Plateau area in China. From the analysis, we conclude that yaks have a unique rumen microbial ecosystem that is significantly different from that of cattle; this may also help to explain why yaks produce less methane than cattle. PMID:23078429

  15. Genome of Horsepox Virus

    PubMed Central

    Tulman, E. R.; Delhon, G.; Afonso, C. L.; Lu, Z.; Zsak, L.; Sandybaev, N. T.; Kerembekova, U. Z.; Zaitsev, V. L.; Kutish, G. F.; Rock, D. L.

    2006-01-01

    Here we present the genomic sequence of horsepox virus (HSPV) isolate MNR-76, an orthopoxvirus (OPV) isolated in 1976 from diseased Mongolian horses. The 212-kbp genome contained 7.5-kbp inverted terminal repeats and lacked extensive terminal tandem repetition. HSPV contained 236 open reading frames (ORFs) with similarity to those in other OPVs, with those in the central 100-kbp region most conserved relative to other OPVs. Phylogenetic analysis of the conserved region indicated that HSPV is closely related to sequenced isolates of vaccinia virus (VACV) and rabbitpox virus, clearly grouping together these VACV-like viruses. Fifty-four HSPV ORFs likely represented fragments of 25 orthologous OPV genes, including in the central region the only known fragmented form of an OPV ribonucleotide reductase large subunit gene. In terminal genomic regions, HSPV lacked full-length homologues of genes variably fragmented in other VACV-like viruses but was unique in fragmentation of the homologue of VACV strain Copenhagen B6R, a gene intact in other known VACV-like viruses. Notably, HSPV contained in terminal genomic regions 17 kbp of OPV-like sequence absent in known VACV-like viruses, including fragments of genes intact in other OPVs and approximately 1.4 kb of sequence present only in cowpox virus (CPXV). HSPV also contained seven full-length genes fragmented or missing in other VACV-like viruses, including intact homologues of the CPXV strain GRI-90 D2L/I4R CrmB and D13L CD30-like tumor necrosis factor receptors, D3L/I3R and C1L ankyrin repeat proteins, B19R kelch-like protein, D7L BTB/POZ domain protein, and B22R variola virus B22R-like protein. These results indicated that HSPV contains unique genomic features likely contributing to a unique virulence/host range phenotype. They also indicated that while closely related to known VACV-like viruses, HSPV contains additional, potentially ancestral sequences absent in other VACV-like viruses. PMID:16940536

  16. Genetic Architecture of the Delis-Kaplan Executive Function System Trail Making Test: Evidence for Distinct Genetic Influences on Executive Function

    PubMed Central

    Vasilopoulos, Terrie; Franz, Carol E.; Panizzon, Matthew S.; Xian, Hong; Grant, Michael D.; Lyons, Michael J; Toomey, Rosemary; Jacobson, Kristen C.; Kremen, William S.

    2012-01-01

    Objective To examine how genes and environments contribute to relationships among Trail Making test conditions and the extent to which these conditions have unique genetic and environmental influences. Method Participants included 1237 middle-aged male twins from the Vietnam-Era Twin Study of Aging (VETSA). The Delis-Kaplan Executive Function System Trail Making test included visual searching, number and letter sequencing, and set-shifting components. Results Phenotypic correlations among Trails conditions ranged from 0.29 to 0.60, and genes accounted for the majority (58–84%) of each correlation. Overall heritability ranged from 0.34 to 0.62 across conditions. Phenotypic factor analysis suggested a single factor. In contrast, genetic models revealed a single common genetic factor but also unique genetic influences separate from the common factor. Genetic variance (i.e., heritability) of number and letter sequencing was completely explained by the common genetic factor, while unique genetic influences separate from the common factor accounted for 57% and 21% of the heritabilities of visual search and set-shifting, respectively. After accounting for general cognitive ability, unique genetic influences accounted for 64% and 31% of those heritabilities. Conclusions A common genetic factor, most likely representing a combination of speed and sequencing, accounted for most of the correlation among Trails 1–4. Distinct genetic factors, however, accounted for a portion of variance in visual scanning and set-shifting. Thus, although traditional phenotypic shared variance analysis techniques suggest only one general factor underlying different neuropsychological functions in non-patient populations, examining the genetic underpinnings of cognitive processes with twin analysis can uncover more complex etiological processes. PMID:22201299

  17. A New Way to Introduce Microarray Technology in a Lecture/Laboratory Setting by Studying the Evolution of This Modern Technology

    ERIC Educational Resources Information Center

    Rowland-Goldsmith, Melissa

    2009-01-01

    DNA microarray is an ordered grid containing known sequences of DNA, which represent many of the genes in a particular organism. Each DNA sequence is unique to a specific gene. This technology enables the researcher to screen many genes from cells or tissue grown in different conditions. We developed an undergraduate lecture and laboratory…

  18. Draft Genome Sequence of Mycobacterium chimaera Type Strain Fl-0169.

    PubMed

    Pfaller, Stacy; Tokarev, Vasily; Kessler, Collin; McLimans, Christopher; Gomez-Alvarez, Vicente; Wright, Justin; King, Dawn; Lamendella, Regina

    2017-02-23

    We report here the draft genome sequence of the type strain Mycobacterium chimaera Fl-0169, a member of the Mycobacterium avium complex (MAC). M. chimaera Fl-0169T was isolated from a patient in Italy and is highly similar to strains of M. chimaera isolated in Ireland, although Fl-0169T possesses unique virulence genes. Copyright © 2017 Pfaller et al.

  19. Clustered regularly interspaced short palindromic repeats (CRISPRs) for the genotyping of bacterial pathogens.

    PubMed

    Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

    2009-01-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) are DNA sequences composed of a succession of repeats (23- to 47-bp long) separated by unique sequences called spacers. Polymorphism can be observed in different strains of a species and may be used for genotyping. We describe protocols and bioinformatics tools that allow the identification of CRISPRs from sequenced genomes, their comparison, and their component determination (the direct repeats and the spacers). A schematic representation of the spacer organization can be produced, allowing an easy comparison between strains.
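
    The repeat/spacer structure described in this record can be sketched in a few lines of Python. This is a toy illustration only, not the protocol or tools described by the authors: it assumes the direct repeat is already known and matches exactly, and the 23-bp repeat, the spacers, and the function name `extract_spacers` are all invented for the example.

```python
def extract_spacers(sequence, repeat):
    """Return the spacers lying between consecutive exact copies of `repeat`."""
    positions = []
    start = sequence.find(repeat)
    while start != -1:
        positions.append(start)
        # Continue searching after the current repeat (non-overlapping matches).
        start = sequence.find(repeat, start + len(repeat))
    # Each spacer runs from the end of one repeat to the start of the next.
    return [sequence[left + len(repeat):right]
            for left, right in zip(positions, positions[1:])]

# Hypothetical 23-bp direct repeat and toy CRISPR arrays for two strains.
REPEAT = "ACGTACGTAAGGCCTTACGTACG"
strain_a = REPEAT + "TTTGGGAAACCC" + REPEAT + "CATCATCATGGA" + REPEAT
strain_b = REPEAT + "TTTGGGAAACCC" + REPEAT + "GGATCCGGATCC" + REPEAT

spacers_a = extract_spacers(strain_a, REPEAT)
spacers_b = extract_spacers(strain_b, REPEAT)
shared_spacers = set(spacers_a) & set(spacers_b)

print(spacers_a)
print(spacers_b)
print(shared_spacers)
```

    Comparing the ordered spacer lists of two strains is the basis of the genotyping idea in the abstract: strains sharing older spacers but differing in newer ones are likely related. Real arrays contain degenerate repeats, so exact string matching as used here would need to be relaxed in practice.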

  20. Nucleotide Sequence Analysis of RNA Synthesized from Rabbit Globin Complementary DNA

    PubMed Central

    Poon, Raymond; Paddock, Gary V.; Heindell, Howard; Whitcome, Philip; Salser, Winston; Kacian, Dan; Bank, Arthur; Gambino, Roberto; Ramirez, Francesco

    1974-01-01

    Rabbit globin complementary DNA made with RNA-dependent DNA polymerase (reverse transcriptase) was used as template for in vitro synthesis of 32P-labeled RNA. The sequences of the nucleotides in most of the fragments resulting from combined ribonuclease T1 and alkaline phosphatase digestion have been determined. Several fragments were long enough to fit uniquely with the α or β globin amino-acid sequences. These data demonstrate that the cDNA was copied from globin mRNA and contained no detectable contaminants. PMID:4139714

  1. Amelogenin Evolution and Tetrapod Enamel Structure

    PubMed Central

    Diekwisch, Thomas G.H.; Jin, Tianquan; Wang, Xinping; Ito, Yoshihiro; Schmidt, Marcella; Druzinsky, Robert; Yamane, Akira; Luan, Xianghong

    2009-01-01

    Amelogenins are the major proteins involved in tooth enamel formation. In the present study we have cloned and sequenced four novel amelogenins from three amphibian species in order to analyze similarities and differences between mammalian and non-mammalian amelogenins. The newly sequenced amphibian amelogenin sequences were from a Red-eyed tree frog (Litoria chloris) and a Mexican axolotl (Ambystoma mexicanum). We identified two amelogenin isoforms in the Eastern Red-backed Salamander (Plethodon cinereus). Sequence comparisons confirmed that non-mammalian amelogenins are overall shorter than their mammalian counterparts, contain less proline and less glutamine, and feature shorter polyproline tripeptide repeat stretches than mammalian amelogenins. We propose that unique sequence parameters of mammalian amelogenins might be a pre-requisite for complex mammalian enamel prism architecture. PMID:19828974

  2. A transcriptome resource for the koala (Phascolarctos cinereus): insights into koala retrovirus transcription and sequence diversity.

    PubMed

    Hobbs, Matthew; Pavasovic, Ana; King, Andrew G; Prentis, Peter J; Eldridge, Mark D B; Chen, Zhiliang; Colgan, Donald J; Polkinghorne, Adam; Wilkins, Marc R; Flanagan, Cheyne; Gillett, Amber; Hanger, Jon; Johnson, Rebecca N; Timms, Peter

    2014-09-11

    The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population. This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org.

  3. Bacterial CRISPR Regions: General Features and their Potential for Epidemiological Molecular Typing Studies.

    PubMed

    Karimi, Zahra; Ahmadi, Ali; Najafi, Ali; Ranjbar, Reza

    2018-01-01

    CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, as novel and applicable regions in prokaryotic genomes, have gained great attention in the post-genomics era. These unique regions are diverse in number and sequence composition in different pathogenic bacteria and can therefore be suitable candidates for molecular epidemiology and genotyping studies. Furthermore, the arrayed structure of CRISPR loci (several unique repeats spaced with variable sequences) and the associated cas genes act as an active prokaryotic immune system against viral replication and conjugative elements. This property can be used as a tool for RNA editing in bioengineering studies. The aim of this review was to survey some details about the history, nature, and potential applications of CRISPR arrays in both genetic engineering and bacterial genotyping studies.

  4. Transcriptome profile and unique genetic evolution of positively selected genes in yak lungs.

    PubMed

    Lan, DaoLiang; Xiong, XianRong; Ji, WenHui; Li, Jian; Mipam, Tserang-Donko; Ai, Yi; Chai, ZhiXin

    2018-04-01

    The yak (Bos grunniens), which is a unique bovine breed that is distributed mainly in the Qinghai-Tibetan Plateau, is considered a good model for studying plateau adaptability in mammals. The lungs are important functional organs that enable animals to adapt to their external environment. However, the genetic mechanism underlying the adaptability of yak lungs to harsh plateau environments remains unknown. To explore the unique evolutionary process and genetic mechanism of yak adaptation to plateau environments, we performed transcriptome sequencing of yak and cattle (Bos taurus) lungs using RNA-Seq technology and a subsequent comparison analysis to identify the positively selected genes in the yak. After deep sequencing, a normal transcriptome profile of yak lung containing a total of 16,815 expressed genes was obtained, and the characteristics of the yak lung transcriptome were described by functional analysis. Furthermore, Ka/Ks comparison statistics showed that 39 strongly positively selected genes were identified in yak lungs. Further GO and KEGG analysis was conducted for the functional annotation of these genes. The results of this study provide valuable data for further explorations of the unique evolutionary process of high-altitude hypoxia adaptation in yaks in the Tibetan Plateau and the genetic mechanism at the molecular level.
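
    The Ka/Ks statistic referred to in this record is the ratio of the nonsynonymous substitution rate (Ka) to the synonymous substitution rate (Ks). A minimal Python sketch of the conventional interpretation follows; the abstract does not state the threshold or estimation method the authors used, so the cut-off of 1, the function name, and the example values are assumptions for illustration only.

```python
def classify_selection(ka, ks):
    """Return (Ka/Ks, label) using the conventional cut-off of 1 (an assumption here)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if omega > 1:
        label = "positive"    # excess of amino-acid-changing substitutions
    elif omega < 1:
        label = "purifying"   # amino-acid changes are mostly removed
    else:
        label = "neutral"
    return omega, label

# Hypothetical (Ka, Ks) pairs, not values from the study.
examples = {"geneA": (0.30, 0.10), "geneB": (0.05, 0.25), "geneC": (0.20, 0.20)}
for name, (ka, ks) in examples.items():
    omega, label = classify_selection(ka, ks)
    print(name, round(omega, 2), label)
```

    In practice Ka and Ks are estimated from codon alignments of orthologous genes (here, yak versus cattle), and genes with very small Ks are usually filtered out because the ratio becomes unstable.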

  5. Mapping the Geometric Evolution of Protein Folding Motor.

    PubMed

    Jerath, Gaurav; Hazam, Prakash Kishore; Shekhar, Shashi; Ramakrishnan, Vibin

    2016-01-01

    A polypeptide chain has an invariant main chain and a variant side-chain sequence. How the side-chain sequence determines the fold in terms of its chemical constitution has been scrutinized extensively and verified periodically. However, a focussed investigation of the directive effect of side-chain geometry may provide important insights supplementing existing algorithms in mapping the geometrical evolution of protein chains and their structural preferences. Geometrically, folding of a protein structure may be envisaged as the evolution of its geometric variables: the ϕ and ψ dihedral angles of the polypeptide main chain, directed by χ1 and χ2 of the side chain. In this work, the protein molecule is metaphorically modelled as a machine with 4 rotors, ϕ, ψ, χ1 and χ2, whose evolution to the functional fold is directed by combinations of its rotor directions. We observe that differential rotor motions lead to different secondary structure formations and that the combinatorial pattern is unique and consistent for a particular secondary structure type. Further, we found that the combination of rotor geometries of each amino acid is unique, which partly explains how different amino acid sequence combinations have unique structural evolution and functional adaptation. Quantification of these amino acid rotor preferences resulted in the generation of 3 substitution matrices, which were later plugged into the BLAST tool to evaluate their efficiency in aligning sequences. We have employed BLOSUM62 and PAM30 as standards for primary evaluation. Generation of substitution matrices is a logical extension of the conceptual framework we attempted to build during the development of this work. Optimization of the matrices following conventional routines and possible application to biologically relevant data sets are beyond the scope of this manuscript, though they are part of the larger project design.

  6. Androgen Receptor and its Splice Variant, AR-V7, Differentially Regulate FOXA1 Sensitive Genes in LNCaP Prostate Cancer Cells

    PubMed Central

    Krause, William C.; Shafi, Ayesha A.; Nakka, Manjula; Weigel, Nancy L.

    2014-01-01

    Prostate cancer (PCa) is an androgen-dependent disease, and tumors that are resistant to androgen ablation therapy often remain androgen receptor (AR) dependent. Among the contributors to castration-resistant PCa are AR splice variants that lack the ligand-binding domain (LBD). Instead, they have small amounts of unique sequence derived from cryptic exons or from out of frame translation. The AR-V7 (or AR3) variant is constitutively active and is expressed under conditions consistent with CRPC. AR-V7 is reported to regulate a transcriptional program that is similar but not identical to that of AR. However, it is unknown whether these differences are due to the unique sequence in AR-V7, or simply to loss of the LBD. To examine transcriptional regulation by AR-V7, we have used lentiviruses encoding AR-V7 (amino acids 1-627 of AR with the 16 amino acids unique to the variant) to prepare a derivative of the androgen-dependent LNCaP cells with inducible expression of AR-V7. An additional cell line was generated with regulated expression of AR-NTD (amino acids 1-660 of AR); this mutant lacks the LBD but does not have the AR-V7 specific sequence. We find that AR and AR-V7 have distinct activities on target genes that are co-regulated by FOXA1. Transcripts regulated by AR-V7 were similarly regulated by AR-NTD, indicating that loss of the LBD is sufficient for the observed differences. Differential regulation of target genes correlates with preferential recruitment of AR or AR-V7 to specific cis-regulatory DNA sequences providing an explanation for some of the observed differences in target gene regulation. PMID:25008967

  7. Comparative Genomics of Carp Herpesviruses

    PubMed Central

    Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

    2013-01-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  8. Identical mitochondrial somatic mutations unique to chronic periodontitis and coronary artery disease

    PubMed Central

    Pallavi, Tokala; Chandra, Rampalli Viswa; Reddy, Aileni Amarender; Reddy, Bavigadda Harish; Naveen, Anumala

    2016-01-01

    Context: The inflammatory processes involved in chronic periodontitis and coronary artery diseases (CADs) are similar and produce reactive oxygen species that may result in similar somatic mutations in mitochondrial deoxyribonucleic acid (mtDNA). Aims: The aims of the present study were to identify somatic mtDNA mutations in periodontal and cardiac tissues from subjects undergoing coronary artery bypass surgery and determine what fraction was identical and unique to these tissues. Settings and Design: The study population consisted of 30 chronic periodontitis subjects who underwent coronary artery surgery after an angiogram had indicated CAD. Materials and Methods: Gingival tissue samples were taken from the site with deepest probing depth; coronary artery tissue samples were taken during the coronary artery bypass grafting procedures, and blood samples were drawn during this surgical procedure. These samples were stored under aseptic conditions and later transported for mtDNA analysis. Statistical Analysis Used: Complete mtDNA sequences were obtained and aligned with the revised Cambridge reference sequence (NC_012920) using sequence analysis and auto assembler tools. Results: Among the complete mtDNA sequences, a total of 162 variations were spread across the whole mitochondrial genome and present only in the coronary artery and the gingival tissue samples but not in the blood samples. Among the 162 variations, 12 were novel and four of the 12 novel variations were found in mitochondrial NADH dehydrogenase subunit 5 complex I gene (33.3%). Conclusions: Analysis of mtDNA mutations indicated 162 variants unique to periodontitis and CAD. Of these, 12 were novel and may have resulted from destructive oxidative forces common to these two diseases. PMID:27041832

  9. Comparative fecal metagenomics unveils unique functional capacity of the swine gut

    PubMed Central

    2011-01-01

    Background Uncovering the taxonomic composition and functional capacity within the swine gut microbial consortia is of great importance to animal physiology and health as well as to food and water safety due to the presence of human pathogens in pig feces. Nonetheless, limited information on the functional diversity of the swine gut microbiome is available. Results Analysis of 637,722 pyrosequencing reads (130 megabases) generated from Yorkshire pig fecal DNA extracts was performed to help better understand the microbial diversity and largely unknown functional capacity of the swine gut microbiome. Swine fecal metagenomic sequences were annotated using both MG-RAST and JGI IMG/M-ER pipelines. Taxonomic analysis of metagenomic reads indicated that swine fecal microbiomes were dominated by Firmicutes and Bacteroidetes phyla. At a finer phylogenetic resolution, Prevotella spp. dominated the swine fecal metagenome, while some genes associated with Treponema and Anaerovibrio species were found exclusively within the pig fecal metagenomic sequences analyzed. Functional analysis revealed that carbohydrate metabolism was the most abundant SEED subsystem, representing 13% of the swine metagenome. Genes associated with stress, virulence, cell wall and cell capsule were also abundant. Virulence factors associated with antibiotic resistance genes with highest sequence homology to genes in Bacteroidetes, Clostridia, and Methanosarcina were numerous within the gene families unique to the swine fecal metagenomes. Other abundant proteins unique to the distal swine gut shared high sequence homology to putative carbohydrate membrane transporters. Conclusions The results from this metagenomic survey demonstrated the presence of genes associated with resistance to antibiotics and carbohydrate metabolism, suggesting that the swine gut microbiome may be shaped by husbandry practices. PMID:21575148

  10. On the role of the SMA in the discrete sequence production task: a TMS study. Transcranial Magnetic Stimulation.

    PubMed

    Verwey, Willem B; Lammens, Robin; van Honk, Jack

    2002-01-01

    Participants practiced two discrete six-key sequences for a total of 420 trials. The 1 x 6 sequence had a unique order of key presses while the 2 x 3 sequence involved repetition of a three-key segment. Both sequences showed a long interkey interval halfway through the sequence, indicating hierarchical sequence control in that not only the 2 x 3 but also the 1 x 6 sequence was executed as two successive motor chunks. In addition, the second part of both sequences was executed faster than the first part. This supports the earlier notion of a motor processor executing the elements of familiar motor chunks and a cognitive processor triggering either these motor chunks or individual sequence elements. Low-frequency, off-line transcranial magnetic stimulation (TMS) of the supplementary motor area (SMA) counteracted normal improvement with practice of key presses at all sequence positions. Together, these results are in line with the notion that with moderate practice, the SMA executes short sequence fragments that are concatenated by other brain structures.

  11. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics.

    PubMed

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species, P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/.
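
    The count of 43 16S rRNA sequences follows from the copy numbers given in this record, and collapsing them to unique sequences is a simple dereplication step. The Python sketch below reproduces the arithmetic and shows dereplication on toy data; the short sequences and the function name `count_unique` are invented for illustration and are not the study's data or method.

```python
def count_unique(sequences):
    """Number of distinct sequences, ignoring letter case."""
    return len({seq.upper() for seq in sequences})

# Copy numbers stated in the abstract: four 16S copies in each of the eight
# complete genomes and one copy in each of the 11 draft genomes.
complete_genomes, draft_genomes = 8, 11
total_16s = complete_genomes * 4 + draft_genomes * 1

# Toy stand-ins for 16S copies (NOT real P. gingivalis sequences).
toy_copies = ["ACGTAC", "ACGTAC", "acgtac", "ACGTTC", "ACGAAC", "ACGTTC"]

print(total_16s)
print(count_unique(toy_copies))
```

    Applied to the real data, the same kind of dereplication is what reduces the 43 copies to the 24 unique sequences reported in the abstract.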

  12. Transcriptome sequencing and whole genome expression profiling of chrysanthemum under dehydration stress

    PubMed Central

    2013-01-01

    Background Chrysanthemum is one of the most important ornamental crops in the world and drought stress seriously limits its production and distribution. In order to generate a functional genomics resource and obtain a deeper understanding of the molecular mechanisms regarding chrysanthemum responses to dehydration stress, we performed large-scale transcriptome sequencing of chrysanthemum plants under dehydration stress using the Illumina sequencing technology. Results Two cDNA libraries constructed from mRNAs of control and dehydration-treated seedlings were sequenced by Illumina technology. A total of more than 100 million reads were generated and de novo assembled into 98,180 unique transcripts which were further extensively annotated by comparing their sequences to different protein databases. Biochemical pathways were predicted from these transcript sequences. Furthermore, we performed gene expression profiling analysis upon dehydration treatment in chrysanthemum and identified 8,558 dehydration-responsive unique transcripts, including 307 transcription factors and 229 protein kinases and many well-known stress responsive genes. Gene ontology (GO) term enrichment and biochemical pathway analyses showed that dehydration stress caused changes in hormone response, secondary and amino acid metabolism, and light and photoperiod response. These findings suggest that drought tolerance of chrysanthemum plants may be related to the regulation of hormone biosynthesis and signaling, reduction of oxidative damage, stabilization of cell proteins and structures, and maintenance of energy and carbon supply. Conclusions Our transcriptome sequences can provide a valuable resource for chrysanthemum breeding and research and novel insights into chrysanthemum responses to dehydration stress and offer candidate genes or markers that can be used to guide future studies attempting to breed drought tolerant chrysanthemum cultivars. PMID:24074255

  13. In silico Comparison of 19 Porphyromonas gingivalis Strains in Genomics, Phylogenetics, Phylogenomics and Functional Genomics

    PubMed Central

    Chen, Tsute; Siddiqui, Huma; Olsen, Ingar

    2017-01-01

    Currently, genome sequences of a total of 19 Porphyromonas gingivalis strains are available, including eight completed genomes (strains W83, ATCC 33277, TDC60, HG66, A7436, AJW4, 381, and A7A1-28) and 11 high-coverage draft sequences (JCVI SC001, F0185, F0566, F0568, F0569, F0570, SJD2, W4087, W50, Ando, and MP4-504) that are assembled into fewer than 300 contigs. The objective was to compare these genomes at both nucleotide and protein sequence levels in order to understand their phylogenetic and functional relatedness. Four copies of 16S rRNA gene sequences were identified in each of the eight complete genomes and one in the other 11 unfinished genomes. These 43 16S rRNA sequences represent only 24 unique sequences and the derived phylogenetic tree suggests a possible evolutionary history for these strains. Phylogenomic comparison based on shared proteins and whole genome nucleotide sequences consistently showed two groups with closely related members: one consisted of ATCC 33277, 381, and HG66, another of W83, W50, and A7436. At least 1,037 core/shared proteins were identified in the 19 P. gingivalis genomes based on the most stringent detecting parameters. Comparative functional genomics based on genome-wide comparisons between NCBI and RAST annotations, as well as additional approaches, revealed functions that are unique or missing in individual P. gingivalis strains, or species-specific in all P. gingivalis strains, when compared to a neighboring species P. asaccharolytica. All the comparative results of this study are available online for download at ftp://www.homd.org/publication_data/20160425/. PMID:28261563

  14. The role of heterologous chloroplast sequence elements in transgene integration and expression.

    PubMed

    Ruhlman, Tracey; Verma, Dheeraj; Samson, Nalapalli; Daniell, Henry

    2010-04-01

    Heterologous regulatory elements and flanking sequences have been used in chloroplast transformation of several crop species, but their roles and mechanisms have not yet been investigated. Nucleotide sequence identity in the photosystem II protein D1 (psbA) upstream region is 59% across all taxa; similar variation was consistent across all genes and taxa examined. Secondary structure and predicted Gibbs free energy values of the psbA 5' untranslated region (UTR) among different families reflected this variation. Therefore, chloroplast transformation vectors were made for tobacco (Nicotiana tabacum) and lettuce (Lactuca sativa), with endogenous (Nt-Nt, Ls-Ls) or heterologous (Nt-Ls, Ls-Nt) psbA promoter, 5' UTR and 3' UTR, regulating expression of the anthrax protective antigen (PA) or human proinsulin (Pins) fused with the cholera toxin B-subunit (CTB). Unique lettuce flanking sequences were completely eliminated during homologous recombination in the transplastomic tobacco genomes but not unique tobacco sequences. Nt-Ls or Ls-Nt transplastomic lines showed reduction of 80% PA and 97% CTB-Pins expression when compared with endogenous psbA regulatory elements, which accumulated up to 29.6% total soluble protein PA and 72.0% total leaf protein CTB-Pins, 2-fold higher than Rubisco. Transgene transcripts were reduced by 84% in Ls-Nt-CTB-Pins and by 72% in Nt-Ls-PA lines. Transcripts containing endogenous 5' UTR were stabilized in nonpolysomal fractions. Stromal RNA-binding proteins were preferentially associated with endogenous psbA 5' UTR. A rapid and reproducible regeneration system was developed for lettuce commercial cultivars by optimizing plant growth regulators. These findings underscore the need for sequencing complete crop chloroplast genomes, utilization of endogenous regulatory elements and flanking sequences, as well as optimization of plant growth regulators for efficient chloroplast transformation.

  15. The Role of Heterologous Chloroplast Sequence Elements in Transgene Integration and Expression

    PubMed Central

    Ruhlman, Tracey; Verma, Dheeraj; Samson, Nalapalli; Daniell, Henry

    2010-01-01

    Heterologous regulatory elements and flanking sequences have been used in chloroplast transformation of several crop species, but their roles and mechanisms have not yet been investigated. Nucleotide sequence identity in the photosystem II protein D1 (psbA) upstream region is 59% across all taxa; similar variation was consistent across all genes and taxa examined. Secondary structure and predicted Gibbs free energy values of the psbA 5′ untranslated region (UTR) among different families reflected this variation. Therefore, chloroplast transformation vectors were made for tobacco (Nicotiana tabacum) and lettuce (Lactuca sativa), with endogenous (Nt-Nt, Ls-Ls) or heterologous (Nt-Ls, Ls-Nt) psbA promoter, 5′ UTR and 3′ UTR, regulating expression of the anthrax protective antigen (PA) or human proinsulin (Pins) fused with the cholera toxin B-subunit (CTB). Unique lettuce flanking sequences were completely eliminated during homologous recombination in the transplastomic tobacco genomes but not unique tobacco sequences. Nt-Ls or Ls-Nt transplastomic lines showed reduction of 80% PA and 97% CTB-Pins expression when compared with endogenous psbA regulatory elements, which accumulated up to 29.6% total soluble protein PA and 72.0% total leaf protein CTB-Pins, 2-fold higher than Rubisco. Transgene transcripts were reduced by 84% in Ls-Nt-CTB-Pins and by 72% in Nt-Ls-PA lines. Transcripts containing endogenous 5′ UTR were stabilized in nonpolysomal fractions. Stromal RNA-binding proteins were preferentially associated with endogenous psbA 5′ UTR. A rapid and reproducible regeneration system was developed for lettuce commercial cultivars by optimizing plant growth regulators. These findings underscore the need for sequencing complete crop chloroplast genomes, utilization of endogenous regulatory elements and flanking sequences, as well as optimization of plant growth regulators for efficient chloroplast transformation. PMID:20130101

  16. RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

    PubMed Central

    Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

    2012-01-01

    RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, ATGC base contents along with their respective percentages) and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability http://www.cemb.edu.pk/sw.html Abbreviations RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language. PMID:23055611
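
    The kinds of operation this abstract lists (reverse complement generation, base percentages, and Nussinov-style pairing) can be illustrated with a minimal Python sketch. This is not RDNAnalyzer's code, which is written in C# and extends the algorithm in ways the abstract does not describe; it is the textbook Nussinov recurrence restricted to A-T and G-C pairs, and the function names and the `min_loop` parameter are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/T/G/C only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def base_percentages(seq):
    """Percentage of each of A, T, G, C in a non-empty sequence."""
    seq = seq.upper()
    return {b: 100.0 * seq.count(b) / len(seq) for b in "ATGC"}


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested Watson-Crick pairs (textbook Nussinov).

    min_loop is the assumed minimum number of unpaired bases enclosed
    by any pair; it is an illustrative parameter, not taken from the source.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j stays unpaired
            for k in range(i, j - min_loop):
                if COMPLEMENT[seq[k]] == seq[j]:  # case: k pairs with j
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


hairpin = "GGGAAAACCC"
print(reverse_complement("ATGC"), base_percentages("AATG"), nussinov_max_pairs(hairpin))
```

    The recurrence only maximises the number of pairs; recovering the actual pairing would need a traceback step, which is omitted here.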

  17. The diploid genome sequence of an Asian individual

    PubMed Central

    Wang, Jun; Wang, Wei; Li, Ruiqiang; Li, Yingrui; Tian, Geng; Goodman, Laurie; Fan, Wei; Zhang, Junqing; Li, Jun; Zhang, Juanbin; Guo, Yiran; Feng, Binxiao; Li, Heng; Lu, Yao; Fang, Xiaodong; Liang, Huiqing; Du, Zhenglin; Li, Dong; Zhao, Yiqing; Hu, Yujie; Yang, Zhenzhen; Zheng, Hancheng; Hellmann, Ines; Inouye, Michael; Pool, John; Yi, Xin; Zhao, Jing; Duan, Jinjie; Zhou, Yan; Qin, Junjie; Ma, Lijia; Li, Guoqing; Yang, Zhentao; Zhang, Guojie; Yang, Bin; Yu, Chang; Liang, Fang; Li, Wenjie; Li, Shaochuan; Li, Dawei; Ni, Peixiang; Ruan, Jue; Li, Qibin; Zhu, Hongmei; Liu, Dongyuan; Lu, Zhike; Li, Ning; Guo, Guangwu; Zhang, Jianguo; Ye, Jia; Fang, Lin; Hao, Qin; Chen, Quan; Liang, Yu; Su, Yeyang; san, A.; Ping, Cuo; Yang, Shuang; Chen, Fang; Li, Li; Zhou, Ke; Zheng, Hongkun; Ren, Yuanyuan; Yang, Ling; Gao, Yang; Yang, Guohua; Li, Zhuo; Feng, Xiaoli; Kristiansen, Karsten; Wong, Gane Ka-Shu; Nielsen, Rasmus; Durbin, Richard; Bolund, Lars; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian

    2009-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J. D. Watson and J. C. Venter), and structural variation identification. These variations were considered for their potential biological impact. Our sequence data and analyses demonstrate the potential usefulness of next-generation sequencing technologies for personal genomics. PMID:18987735

  18. Complete Chloroplast Genome Sequences of Mongolia Medicine Artemisia frigida and Phylogenetic Relationships with Other Plants

    PubMed Central

    Liu, Yue; Huo, Naxin; Dong, Lingli; Wang, Yi; Zhang, Shuixian; Young, Hugh A.; Feng, Xiaoxiao; Gu, Yong Qiang

    2013-01-01

    Background Artemisia frigida Willd. is an important Mongolian traditional medicinal plant with pharmacological functions of stanch and detumescence. However, there is little sequence and genomic information available for Artemisia frigida, which makes phylogenetic identification, evolutionary studies, and genetic improvement of its value very difficult. We report the complete chloroplast genome sequence of Artemisia frigida based on 454 pyrosequencing. Methodology/Principal Findings The complete chloroplast genome of Artemisia frigida is 151,076 bp including a large single copy (LSC) region of 82,740 bp, a small single copy (SSC) region of 18,394 bp and a pair of inverted repeats (IRs) of 24,971 bp. The genome contains 114 unique genes and 18 duplicated genes. The chloroplast genome of Artemisia frigida contains a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, a unique feature in Asteraceae. The gene order in the SSC region of Artemisia frigida is inverted compared with the other 6 Asteraceae species with the chloroplast genomes sequenced. This inversion is likely caused by an intramolecular recombination event only occurred in Artemisia frigida. The existence of rich SSR loci in the Artemisia frigida chloroplast genome provides a rare opportunity to study population genetics of this Mongolian medicinal plant. Phylogenetic analysis demonstrates a sister relationship between Artemisia frigida and four other species in Asteraceae, including Ageratina adenophora, Helianthus annuus, Guizotia abyssinica and Lactuca sativa, based on 61 protein-coding sequences. Furthermore, Artemisia frigida was placed in the tribe Anthemideae in the subfamily Asteroideae (Asteraceae) based on ndhF and trnL-F sequence comparisons. Conclusion The chloroplast genome sequence of Artemisia frigida was assembled and analyzed in this study, representing the first plastid genome sequenced in the Anthemideae tribe. 
    This complete chloroplast genome sequence will be useful for molecular ecology and molecular phylogeny studies within Artemisia species and also within the Asteraceae family. PMID:23460871
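
    The region sizes reported in this abstract can be checked against the stated genome length with a short calculation, assuming the 24,971 bp figure is the length of each of the two inverted repeat copies. The function name `plastome_length` is a hypothetical helper.

```python
def plastome_length(lsc_bp, ssc_bp, ir_bp, ir_copies=2):
    """Length of a quadripartite plastome: LSC + SSC + the inverted repeat copies."""
    return lsc_bp + ssc_bp + ir_copies * ir_bp


# Values from the abstract: LSC 82,740 bp, SSC 18,394 bp, IR 24,971 bp (each copy, assumed).
total = plastome_length(82_740, 18_394, 24_971)
print(total)
```

    Under that assumption the parts sum to the reported 151,076 bp.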

  19. 25 Years of GenBank

    MedlinePlus

    Unique DNA database has helped advance scientific discoveries worldwide. Since its origin 25 years ago, the database of nucleic acid sequences known as GenBank has ...

  20. Using DNA-labelled nano- and microparticles to track particle transport in the environment

    NASA Astrophysics Data System (ADS)

    McNew, Coy; Wang, Chaozi; Dahlke, Helen; Lyon, Steve; Walter, Todd

    2017-04-01

    By utilizing bio-molecular nanotechnology developed for nano-medicines and drug delivery, we are able to produce DNA-labelled nano- and microparticle tracers for use in a myriad of environmental systems. The use of custom sequenced DNA allows for the fabrication of an enormous number of uniquely labelled tracers with identical transport properties (approximately 1.61 x 10^60 unique sequences), each independently quantifiable, that can be applied simultaneously in any hydrologic system. By controlling the fabrication procedure to produce particles of custom size and charge, we are able to tag each size-charge combination uniquely in order to directly probe the effect of these variables on the transport properties of the particles. Here we present our methods for fabrication, extraction, and analysis of the DNA nano- and microparticle tracers, along with results from several successful applications of the tracers, including transport and retention analysis at the lab, continuum, and field scales. To date, our DNA-labelled nano- and microparticle tracers have proved useful in surface and subsurface water applications, soil retention, and even subglacial flow pathways. The range of potential applications continues to prove nearly limitless.
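
    The count of roughly 1.61 x 10^60 unique sequences quoted in this abstract matches the number of distinct DNA strings of length 100 over the four bases, 4^100. The abstract does not state the tag length, so the value of 100 nucleotides below is an assumption that happens to reproduce the figure; `n_sequences` is a hypothetical helper.

```python
def n_sequences(length, alphabet_size=4):
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


# Assumed tag length of 100 nucleotides (not stated in the abstract).
count = n_sequences(100)
print(f"{count:.3e}")
```

    Python integers are arbitrary-precision, so the count is exact; only the printed form is rounded.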

  1. The visual pigments of the West Indian manatee (Trichechus manatus).

    PubMed

    Newman, Lucy A; Robinson, Phyllis R

    2006-10-01

    Manatees are unique among the fully aquatic marine mammals in that they are herbivorous creatures, with hunting strategies restricted to grazing on sea-grasses. Since the other groups of (carnivorous) marine mammals have been found to possess various visual system adaptations to their unique visual environments, it was of interest to investigate the visual capability of the manatee. Previous work, both behavioral (Griebel & Schmid, 1996), and ultrastructural (Cohen, Tucker, & Odell, 1982; unpublished work cited by Griebel & Peichl, 2003), has suggested that manatees have the dichromatic color vision typical of diurnal mammals. This study uses molecular techniques to investigate the cone visual pigments of the manatee. The aim was to clone and sequence cone opsins from the retina, and, if possible, express and reconstitute functional visual pigments to perform spectral analysis. Both LWS and SWS cone opsins were cloned and sequenced from manatee retinae, which, upon expression and spectral analysis, had lambda(max) values of 555 and 410 nm, respectively. The expression of both the LWS and SWS cone opsin in the manatee retina is unique as both pinnipeds and cetaceans only express a cone LWS opsin.

  2. Insights from Human/Mouse genome comparisons

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pennacchio, Len A.

    2003-03-30

    Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.

  3. Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis

    PubMed Central

    Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.

    2016-01-01

    The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
    These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667

  4. Draft Genome Sequence of a Tetrabromobisphenol A–Degrading Strain, Ochrobactrum sp. T, Isolated from an Electronic Waste Recycling Site

    PubMed Central

    Liang, Zhishu; Li, Guiying; Zhang, Guoxia; Das, Ranjit

    2016-01-01

    Ochrobactrum sp. T was previously isolated from a sludge sample collected from an electronic waste recycling site and characterized as a unique tetrabromobisphenol A (TBBPA)–degrading bacterium. Here, the draft genome sequence (3.9 Mb) of Ochrobactrum sp. T is reported to provide insights into its diversity and its TBBPA biodegradation mechanism in polluted environments. PMID:27445374

  5. Whole-Genome Sequence of Escherichia coli Serotype O157:H7 Strain B6914-ARS.

    PubMed

    Uhlich, Gaylen A; Reichenberger, Erin R; Cottrell, Bryan J; Fratamico, Pina; Andreozzi, Elisa

    2017-11-02

    Escherichia coli serotype O157:H7 strain B6914-MS1 is an isolate from the Centers for Disease Control and Prevention that is missing both Shiga toxin genes and has been used extensively in applied research studies. Here we report the genome sequence of strain B6914-ARS, a B6914-MS1 clone that has unique biofilm properties.

  6. Genome Sequence of the Novel Marine Member of the Gammaproteobacteria Strain HTCC5015

    PubMed Central

    Thrash, J. Cameron; Stingl, Ulrich; Cho, Jang-Cheon; Ferriera, Steve; Johnson, Justin; Vergin, Kevin L.; Giovannoni, Stephen J.

    2010-01-01

    HTCC5015 is a novel, highly divergent marine member of the Gammaproteobacteria, currently without a cultured representative with greater than 89% 16S rRNA gene identity to itself. The organism was isolated from water collected from Hydrostation S south of Bermuda using high-throughput dilution-to-extinction culturing techniques. Here we present the genome sequence of the unique Gammaproteobacterium strain HTCC5015. PMID:20472792

  7. The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. sequence evaluation and plastome evolution.

    PubMed

    Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G

    2008-04-01

    The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome-genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I-III in one clade, while plastome IV appears to be closest to the common ancestor.

  8. Molecular Identification of Sex in Phoenix dactylifera Using Inter Simple Sequence Repeat Markers.

    PubMed

    Al-Ameri, Abdulhafed A; Al-Qurainy, Fahad; Gaafar, Abdel-Rhman Z; Khan, Salim; Nadeem, M

    2016-01-01

    Early sex identification of Date Palm (Phoenix dactylifera L.) at the seedling stage is an economically desirable objective, which will significantly increase the profits of seed-based cultivation. The utilization of molecular markers at this stage for early and rapid identification of sex is important due to the lack of morphological markers. In this study, a total of two hundred Inter Simple Sequence Repeat (ISSR) primers were screened among male and female Date Palm plants to identify putative sex-specific markers, of which only two primers (IS_A02 and IS_A71) were found to be associated with sex. The primer IS_A02 produced a unique band of 390 bp that was clearly present in all female plants and absent in all male plants. Conversely, the primer IS_A71 produced a unique band of 380 bp that was clearly present in all male plants and absent in all female plants. Subsequently, these specific fragments were excised, purified, and sequenced for the future development of sequence-specific markers for sex determination in dioecious Date Palm. These markers are efficient, highly reliable, and reproducible for sex identification at the early seedling stage.
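
    The decision rule implied by the two markers in this abstract can be written down as a small sketch: a 390 bp band with primer IS_A02 indicates a female plant, and a 380 bp band with primer IS_A71 indicates a male plant. The function name `call_sex` and the "inconclusive" outcome for contradictory or missing bands are assumptions added for illustration; the abstract reports no such cases.

```python
def call_sex(has_is_a02_390bp, has_is_a71_380bp):
    """Call plant sex from presence/absence of the two ISSR bands.

    has_is_a02_390bp: True if the 390 bp band is seen with primer IS_A02.
    has_is_a71_380bp: True if the 380 bp band is seen with primer IS_A71.
    """
    if has_is_a02_390bp and not has_is_a71_380bp:
        return "female"
    if has_is_a71_380bp and not has_is_a02_390bp:
        return "male"
    # Both bands or neither band: not covered by the abstract (assumed handling).
    return "inconclusive"


print(call_sex(True, False), call_sex(False, True), call_sex(True, True))
```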

  9. Revised systematics of Holospora-like bacteria and characterization of "Candidatus Gortzia infectiva", a novel macronuclear symbiont of Paramecium jenningsi.

    PubMed

    Boscaro, Vittorio; Fokin, Sergei I; Schrallhammer, Martina; Schweikert, Michael; Petroni, Giulio

    2013-01-01

    The genus Holospora (Rickettsiales) includes highly infectious nuclear symbionts of the ciliate Paramecium with unique morphology and life cycle. To date, nine species have been described, but a molecular characterization is lacking for most of them. In this study, we have characterized a novel Holospora-like bacterium (HLB) living in the macronuclei of a Paramecium jenningsi population. This bacterium was morphologically and ultrastructurally investigated in detail, and its life cycle and infection capabilities were described. We also obtained its 16S rRNA gene sequence and developed a specific probe for fluorescence in situ hybridization experiments. A new taxon, "Candidatus Gortzia infectiva", was established for this HLB according to its unique characteristics and the relatively low DNA sequence similarities shared with other bacteria. The phylogeny of the order Rickettsiales based on 16S rRNA gene sequences has been inferred, adding to the available data the sequence of the novel bacterium and those of two Holospora species (Holospora obtusa and Holospora undulata) characterized for the purpose. Our phylogenetic analysis provided molecular support for the monophyly of HLBs and showed a possible pattern of evolution for some of their features. We suggested to classify inside the family Holosporaceae only HLBs, excluding other more distantly related and phenotypically different Paramecium endosymbionts.

  10. Analyses of Evolutionary Characteristics of the Hemagglutinin-Esterase Gene of Influenza C Virus during a Period of 68 Years Reveals Evolutionary Patterns Different from Influenza A and B Viruses.

    PubMed

    Furuse, Yuki; Matsuzaki, Yoko; Nishimura, Hidekazu; Oshitani, Hitoshi

    2016-11-26

    Infections with the influenza C virus causing respiratory symptoms are common, particularly among children. Since isolation and detection of the virus are rarely performed, compared with influenza A and B viruses, the small number of available sequences of the virus makes it difficult to analyze its evolutionary dynamics. Recently, we reported the full genome sequence of 102 strains of the virus. Here, we exploited the data to elucidate the evolutionary characteristics and phylodynamics of the virus compared with influenza A and B viruses. Along with our data, we obtained public sequence data of the hemagglutinin-esterase gene of the virus; the dataset consists of 218 unique sequences of the virus collected from 14 countries between 1947 and 2014. Informatics analyses revealed that (1) multiple lineages have been circulating globally; (2) there have been weak and infrequent selective bottlenecks; (3) the evolutionary rate is low because of weak positive selection and a low capability to induce mutations; and (4) there is no significant positive selection although a few mutations affecting its antigenicity have been induced. The unique evolutionary dynamics of the influenza C virus must be shaped by multiple factors, including virological, immunological, and epidemiological characteristics.

  11. The complete nucleotide sequences of the five genetically distinct plastid genomes of Oenothera, subsection Oenothera: I. Sequence evaluation and plastome evolution

    PubMed Central

    Greiner, Stephan; Wang, Xi; Rauwolf, Uwe; Silber, Martina V.; Mayer, Klaus; Meurer, Jörg; Haberer, Georg; Herrmann, Reinhold G.

    2008-01-01

    The flowering plant genus Oenothera is uniquely suited for studying molecular mechanisms of speciation. It assembles an intriguing combination of genetic features, including permanent translocation heterozygosity, biparental transmission of plastids, and a general interfertility of well-defined species. This allows an exchange of plastids and nuclei between species often resulting in plastome–genome incompatibility. For evaluation of its molecular determinants we present the complete nucleotide sequences of the five basic, genetically distinguishable plastid chromosomes of subsection Oenothera (=Euoenothera) of the genus, which are associated in distinct combinations with six basic genomes. Sizes of the chromosomes range from 163 365 bp (plastome IV) to 165 728 bp (plastome I), display between 96.3% and 98.6% sequence similarity and encode a total of 113 unique genes. Plastome diversification is caused by an abundance of nucleotide substitutions, small insertions, deletions and repetitions. The five plastomes deviate from the general ancestral design of plastid chromosomes of vascular plants by a subsection-specific 56 kb inversion within the large single-copy segment. This inversion disrupted operon structures and predates the divergence of the subsection presumably 1 My ago. Phylogenetic relationships suggest plastomes I–III in one clade, while plastome IV appears to be closest to the common ancestor. PMID:18299283

  12. Analyses of Evolutionary Characteristics of the Hemagglutinin-Esterase Gene of Influenza C Virus during a Period of 68 Years Reveals Evolutionary Patterns Different from Influenza A and B Viruses

    PubMed Central

    Furuse, Yuki; Matsuzaki, Yoko; Nishimura, Hidekazu; Oshitani, Hitoshi

    2016-01-01

    Infections with the influenza C virus causing respiratory symptoms are common, particularly among children. Since isolation and detection of the virus are rarely performed, compared with influenza A and B viruses, the small number of available sequences of the virus makes it difficult to analyze its evolutionary dynamics. Recently, we reported the full genome sequence of 102 strains of the virus. Here, we exploited the data to elucidate the evolutionary characteristics and phylodynamics of the virus compared with influenza A and B viruses. Along with our data, we obtained public sequence data of the hemagglutinin-esterase gene of the virus; the dataset consists of 218 unique sequences of the virus collected from 14 countries between 1947 and 2014. Informatics analyses revealed that (1) multiple lineages have been circulating globally; (2) there have been weak and infrequent selective bottlenecks; (3) the evolutionary rate is low because of weak positive selection and a low capability to induce mutations; and (4) there is no significant positive selection although a few mutations affecting its antigenicity have been induced. The unique evolutionary dynamics of the influenza C virus must be shaped by multiple factors, including virological, immunological, and epidemiological characteristics. PMID:27898037

  13. Molecular epidemiology demonstrated three emerging clusters of human immunodeficiency virus type 1 subtype B infection in Hong Kong.

    PubMed

    Leung, Tommy W C; Mak, Darwin; Wong, K H; Wang, Y; Song, Y H; Tsang, D N C; Wong, C; Shao, Y M; Lim, W L

    2008-07-01

    We conducted a molecular epidemiological study on newly diagnosed human immunodeficiency virus type 1 (HIV-1)-infected patients in Hong Kong to identify the epidemiological linkage of HIV-1 infection in the locality. Reverse transcription polymerase chain reaction (RT-PCR) for HIV-1 was performed on newly diagnosed HIV-1-positive sera collected from January 2002 to December 2006. PCR products corresponding to the env C2V3V4 region and the gag p17/p24 junction of the HIV-1 genome were sequenced. Phylogenetic analyses performed on the acquired nucleotide sequences revealed that CRF01_AE and subtype B were the two dominant HIV-1 subtypes. Analyses also demonstrated the presence of three emerging HIV-1 clusters among the subtype B sequences in Hong Kong. Each cluster possesses a unique cluster-specific amino acid signature for identification. Data show that one of the clusters (Cluster I) is rapidly expanding. In addition to the unique cluster-specific amino acid signature, the majority of sequences in Cluster I harbor a 6-amino acid insertion at the gag p17/p24 junction in a region that is thought to be closely associated with HIV-1 infectivity.

  14. Complete chloroplast genome sequence of a major allogamous forage species, perennial ryegrass (Lolium perenne L.).

    PubMed

    Diekmann, Kerstin; Hodkinson, Trevor R; Wolfe, Kenneth H; van den Bekerom, Rob; Dix, Philip J; Barth, Susanne

    2009-06-01

    Lolium perenne L. (perennial ryegrass) is globally one of the most important forage and grassland crops. We sequenced the chloroplast (cp) genome of Lolium perenne cultivar Cashel. The L. perenne cp genome is 135 282 bp with a typical quadripartite structure. It contains genes for 76 unique proteins, 30 tRNAs and four rRNAs. As in other grasses, the genes accD, ycf1 and ycf2 are absent. The genome is of average size within its subfamily Pooideae and of medium size within the Poaceae. Genome size differences are mainly due to length variations in non-coding regions. However, considerable length differences of 1-27 codons in comparison of L. perenne to other Poaceae and 1-68 codons among all Poaceae were also detected. Within the cp genome of this outcrossing cultivar, 10 insertion/deletion polymorphisms and 40 single nucleotide polymorphisms were detected. Two of the polymorphisms involve tiny inversions within hairpin structures. By comparing the genome sequence with RT-PCR products of transcripts for 33 genes, 31 mRNA editing sites were identified, five of them unique to Lolium. The cp genome sequence of L. perenne is available under Accession number AM777385 at the European Molecular Biology Laboratory, National Center for Biotechnology Information and DNA DataBank of Japan.

  15. Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination

    PubMed Central

    Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo

    2015-01-01

    Background. Although rare, adverse events may be associated with anti-poliovirus vaccination, thus possibly hampering global polio eradication. Objective. To design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Methods. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Results. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Conclusion. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination. PMID:26568962
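
    The pentapeptide-level comparison described in the Methods can be illustrated with a minimal sketch. It covers only criterion (1), absence from the host proteome, and is not the authors' pipeline. The function names and the toy sequences are hypothetical; a real analysis would load the poliovirus proteins and the full human proteome from sequence databases.

```python
def kmers(sequence, k=5):
    """Return the set of all overlapping peptides of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def unique_pentapeptides(viral_protein, host_proteins, k=5):
    """List viral k-mers, in order of first occurrence, absent from every host protein."""
    host_kmers = set()
    for protein in host_proteins:
        host_kmers |= kmers(protein, k)

    seen = set()
    unique = []
    for i in range(len(viral_protein) - k + 1):
        peptide = viral_protein[i:i + k]
        if peptide not in host_kmers and peptide not in seen:
            seen.add(peptide)
            unique.append(peptide)
    return unique


# Toy sequences invented for illustration (not real poliovirus or human proteins).
toy_viral = "MGAQVSSQK"
toy_host = ["AQVSSTT", "KKMGAQV"]
print(unique_pentapeptides(toy_viral, toy_host))
```

    Storing the host pentapeptides in a set keeps each lookup constant-time, which matters when the host side is an entire proteome. Criteria (2) and (3) would need separate epitope and conservation data.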

  16. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    PubMed

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  17. Insights on genome size evolution from a miniature inverted repeat transposon driving a satellite DNA.

    PubMed

    Scalvenzi, Thibault; Pollet, Nicolas

    2014-12-01

    The genome size in eukaryotes does not correlate well with the number of genes they contain. We can observe this so-called C-value paradox in amphibian species. By analyzing an amphibian genome we asked how repetitive DNA can impact genome size and architecture. We describe here our discovery of a Tc1/mariner miniature inverted-repeat transposon family present in Xenopus frogs. These transposons named miDNA4 are unique since they contain a satellite DNA motif. We found that miDNA4 measured 331 bp, contained 25 bp long inverted terminal repeat sequences and a sequence motif of 119 bp present as a unique copy or as an array of 2-47 copies. We characterized the structure, dynamics, impact and evolution of the miDNA4 family and its satellite DNA in Xenopus frog genomes. This led us to propose a model for the evolution of these two repeated sequences and how they can synergize to increase genome size. Copyright © 2014 Elsevier Inc. All rights reserved.

  18. Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

    PubMed Central

    Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

    1996-01-01

    The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146

  19. The Hydrologic Implications Of Unique Urban Soil Horizon Sequencing On The Functions Of Passive Green Infrastructure

    NASA Astrophysics Data System (ADS)

    Shuster, W.; Schifman, L. A.; Herrmann, D.

    2017-12-01

    Green infrastructure represents a broad set of site- to landscape-scale practices that can be flexibly implemented to increase sewershed retention capacity, and can thereby improve on the management of water quantity and quality. Although much green infrastructure presents as formal engineered designs, urbanized landscapes with highly-interspersed pervious surfaces (e.g., right-of-way, parks, lawns, vacant land) may offer ecosystem services as passive, infiltrative green infrastructure. Yet, infiltration and drainage processes are regulated by soil surface conditions, and then the layering of subsoil horizons, respectively. Drawing on a unique urban soil taxonomic and hydrologic dataset collected in 12 cities (each city representing a major soil order), we determined how urbanization processes altered the sequence of soil horizons (compared to pre-urbanized reference soil pedons) and modeled the hydrologic implications of these shifts in layering with an unsaturated zone code (HYDRUS2D). We found that the different layering sequences in urbanized soils render different types and extents of supporting (plant-available soil water), provisioning (productive vegetation), and regulating (runoff mitigation) ecosystem services.

  20. Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy

    PubMed Central

    Cartault, François; Munier, Patrick; Benko, Edgar; Desguerre, Isabelle; Hanein, Sylvain; Boddaert, Nathalie; Bandiera, Simonetta; Vellayoudom, Jeanine; Krejbich-Trotot, Pascale; Bintner, Marc; Hoarau, Jean-Jacques; Girard, Muriel; Génin, Emmanuelle; de Lonlay, Pascale; Fourmaintraux, Alain; Naville, Magali; Rodriguez, Diana; Feingold, Josué; Renouil, Michel; Munnich, Arnold; Westhof, Eric; Fähling, Michael; Lyonnet, Stanislas; Henrion-Caude, Alexandra

    2012-01-01

    The human genome is densely populated with transposons and transposon-like repetitive elements. Although the impact of these transposons and elements on human genome evolution is recognized, the significance of subtle variations in their sequence remains mostly unexplored. Here we report homozygosity mapping of an infantile neurodegenerative disease locus in a genetic isolate. Complete DNA sequencing of the 400-kb linkage locus revealed a point mutation in a primate-specific retrotransposon that was transcribed as part of a unique noncoding RNA, which was expressed in the brain. In vitro knockdown of this RNA increased neuronal apoptosis, consistent with the inappropriate dosage of this RNA in vivo and with the phenotype. Moreover, structural analysis of the sequence revealed a small RNA-like hairpin that was consistent with the putative gain of a functional site when mutated. We show here that a mutation in a unique transposable element-containing RNA is associated with lethal encephalopathy, and we suggest that RNAs that harbor evolutionarily recent repetitive elements may play important roles in human brain development. PMID:22411793

  1. Diversity amongst trigeminal neurons revealed by high throughput single cell sequencing

    PubMed Central

    Nguyen, Minh Q.; Wu, Youmei; Bonilla, Lauren S.; von Buchholtz, Lars J.

    2017-01-01

    The trigeminal ganglion contains somatosensory neurons that detect a range of thermal, mechanical and chemical cues and innervate unique sensory compartments in the head and neck including the eyes, nose, mouth, meninges and vibrissae. We used single-cell sequencing and in situ hybridization to examine the cellular diversity of the trigeminal ganglion in mice, defining thirteen clusters of neurons. We show that clusters are well conserved in dorsal root ganglia suggesting they represent distinct functional classes of somatosensory neurons and not specialization associated with their sensory targets. Notably, functionally important genes (e.g. the mechanosensory channel Piezo2 and the capsaicin gated ion channel Trpv1) segregate into multiple clusters and often are expressed in subsets of cells within a cluster. Therefore, the 13 genetically-defined classes are likely to be physiologically heterogeneous rather than highly parallel (i.e., redundant) lines of sensory input. Our analysis harnesses the power of single-cell sequencing to provide a unique platform for in silico expression profiling that complements other approaches linking gene-expression with function and exposes unexpected diversity in the somatosensory system. PMID:28957441

  2. Health Education a Conceptual Approach. Growing and Developing, Interacting, Decision Making. Concept 2: Growing and Developing Follows a Predictable Sequence, Yet is Unique for Each Individual. Teacher-Student Resources.

    ERIC Educational Resources Information Center

    Creswell, William H., Jr.; And Others

    The following resource guide is one in a series which presents extensive bibliographic material oriented around a specific concept, in this guide, the predictability and uniqueness of growing and developing. A section is devoted to selected materials related to the concept; grade levels for which each resource might be useful are indicated beside…

  3. Existence and global attractivity of unique positive periodic solution for a model of hematopoiesis

    NASA Astrophysics Data System (ADS)

    Liu, Guirong; Yan, Jurang; Zhang, Fengqin

    2007-10-01

    In this paper, we consider the generalized model of hematopoiesis. By using a fixed point theorem, some criteria are established for the existence of the unique positive ω-periodic solution of the above equation. In particular, we not only give the conclusion of convergence of xk to , where {xk} is a successive sequence, but also show that is a global attractor of all other positive solutions.

  4. alpha-Amylase gene of Streptomyces limosus: nucleotide sequence, expression motifs, and amino acid sequence homology to mammalian and invertebrate alpha-amylases.

    PubMed Central

    Long, C M; Virolle, M J; Chang, S Y; Chang, S; Bibb, M J

    1987-01-01

    The nucleotide sequence of the coding and regulatory regions of the alpha-amylase gene (aml) of Streptomyces limosus was determined. High-resolution S1 mapping was used to locate the 5' end of the transcript and demonstrated that the gene is transcribed from a unique promoter. The predicted amino acid sequence has considerable identity to mammalian and invertebrate alpha-amylases, but not to those of plant, fungal, or eubacterial origin. Consistent with this is the susceptibility of the enzyme to an inhibitor of mammalian alpha-amylases. The amino-terminal sequence of the extracellular enzyme was determined, revealing the presence of a typical signal peptide preceding the mature form of the alpha-amylase. PMID:3500166

  5. Isolation and characterization of major histocompatibility complex class II B genes in cranes.

    PubMed

    Kohyama, Tetsuo I; Akiyama, Takuya; Nishida, Chizuko; Takami, Kazutoshi; Onuma, Manabu; Momose, Kunikazu; Masuda, Ryuichi

    2015-11-01

    In this study, we isolated and characterized the major histocompatibility complex (MHC) class II B genes in cranes. Genomic sequences spanning exons 1 to 4 were amplified and determined in 13 crane species and three other species closely related to cranes. In all, 55 unique sequences were identified, and at least two polymorphic MHC class II B loci were found in most species. An analysis of sequence polymorphisms showed the signature of positive selection and recombination. A phylogenetic reconstruction based on exon 2 sequences indicated that trans-species polymorphism has persisted for at least 10 million years, whereas phylogenetic analyses of the sequences flanking exon 2 revealed a pattern of concerted evolution. These results suggest that both balancing selection and recombination play important roles in the crane MHC evolution.

  6. Genomics dataset of unidentified disclosed isolates.

    PubMed

    Rekadwad, Bhagwan N

    2016-09-01

    Analysis of DNA sequences is necessary for the higher-level hierarchical classification of organisms. It gives clues about the characteristics of organisms and their taxonomic position. This dataset was chosen to explore complexities in the unidentified DNA sequences disclosed in patents. A total of 17 unidentified DNA sequences were thoroughly analyzed. Quick response (QR) codes were generated, and the AT/GC content of the sequences was analyzed. The QR codes are helpful for quick identification of isolates, and the AT/GC content is helpful for studying their stability at different temperatures. Additionally, a dataset of cleavage codes and enzyme codes from a restriction digestion study, which is helpful for studies using short DNA sequences, is reported. The dataset disclosed here provides new data for the exploration of unique DNA sequences for evaluation, identification, comparison and analysis.
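
    The AT/GC content analysis mentioned above reduces to counting bases. The following is a minimal sketch, not the tool used for the dataset. The function name is hypothetical, and ignoring ambiguous bases such as N is an assumption of this example.

```python
def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


# Toy sequence invented for illustration.
print(gc_content("ATGCGC"))
```

    The AT fraction is simply one minus the returned value, since ambiguous bases are excluded from the denominator.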

  7. Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions

    PubMed Central

    Mandal, Bijoy Kumar; Kim, Tai-hoon

    2013-01-01

    We design an algorithm for a bioengine. The program enables searching for optimal alignments between two sequences, the host sequence (normal plant) and the query sequence (virus). Searching for homologues has become a routine operation on biological sequences, performed here in 4 × 4 combination with different subsequences (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. Its main aim is to detect overlapping reading frames. The program also makes it possible to identify highly infected clones by selecting the highest-matching regions with minimal gap or mismatch zones, and to identify unique virus clone matches. It is a small, portable, interactive, front-end program intended for finding the regions of matching between the host sequence and query subsequences. All operations are carried out in a fraction of a second, depending on the required task and on the sequence length. PMID:24000321

  8. Enabling large-scale next-generation sequence assembly with Blacklight

    PubMed Central

    Couger, M. Brian; Pipes, Lenore; Squina, Fabio; Prade, Rolf; Siepel, Adam; Palermo, Robert; Katze, Michael G.; Mason, Christopher E.; Blood, Philip D.

    2014-01-01

    Summary: A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data. A new parallel command execution program was developed on the Blacklight resource to handle some of these analyses. These results, initially reported previously at XSEDE13 and expanded here, represent significant advances for their respective scientific communities. The breadth and depth of the results achieved demonstrate the ease of use, versatility, and unique capabilities of the Blacklight XSEDE resource for scientific analysis of genomic and transcriptomic sequence data, and the power of these resources, together with XSEDE support, in meeting the most challenging scientific problems. PMID:25294974

  9. BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

    PubMed

    Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

    2018-06-15

    It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though single-molecule real-time sequencing technology is very promising for overcoming the uncertainty issue, its relatively high cost and error rate add a burden on budget or computation. Many long-read assemblers adopt the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a closely related species or (ii) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement in genome size and continuity. In addition, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Supplementary data are available at Bioinformatics online.

  10. Tissue-Specific Transcriptomics in the Field Cricket Teleogryllus oceanicus

    PubMed Central

    Bailey, Nathan W.; Veltsos, Paris; Tan, Yew-Foon; Millar, A. Harvey; Ritchie, Michael G.; Simmons, Leigh W.

    2013-01-01

    Field crickets (family Gryllidae) frequently are used in studies of behavioral genetics, sexual selection, and sexual conflict, but there have been no studies of transcriptomic differences among different tissue types. We evaluated transcriptome variation among testis, accessory gland, and the remaining whole-body preparations from males of the field cricket, Teleogryllus oceanicus. Non-normalized cDNA libraries from each tissue were sequenced on the Roche 454 platform, and a master assembly was constructed using testis, accessory gland, and whole-body preparations. A total of 940,200 reads were assembled into 41,962 contigs, to which 36,856 singletons (reads not assembled into a contig) were added to provide a total of 78,818 sequences used in annotation analysis. A total of 59,072 sequences (75%) were unique to one of the three tissues. Testis tissue had the greatest proportion of tissue-specific sequences (62.6%), followed by general body (56.43%) and accessory gland tissue (44.16%). We tested the hypothesis that tissues expressing gene products expected to evolve rapidly as a result of sexual selection—testis and accessory gland—would yield a smaller proportion of BLASTx matches to homologous genes in the model organism Drosophila melanogaster compared with whole-body tissue. Uniquely expressed sequences in both testis and accessory gland showed a significantly lower rate of matching to annotated D. melanogaster genes compared with those from general body tissue. These results correspond with empirical evidence that genes expressed in testis and accessory gland tissue are rapidly evolving targets of selection. PMID:23390599
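
    The sequence totals quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the variable names are chosen for illustration.

```python
# Counts quoted in the abstract above.
contigs = 41_962
singletons = 36_856
tissue_unique = 59_072

# Contigs plus unassembled singletons give the annotation set.
total_sequences = contigs + singletons

# Share of sequences found in only one of the three tissues.
percent_unique = 100 * tissue_unique / total_sequences

print(total_sequences)            # 78818
print(round(percent_unique, 1))   # 74.9
```

    The computed share of 74.9% agrees with the rounded figure of 75% reported in the abstract.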

  11. Tissue-specific transcriptomics in the field cricket Teleogryllus oceanicus.

    PubMed

    Bailey, Nathan W; Veltsos, Paris; Tan, Yew-Foon; Millar, A Harvey; Ritchie, Michael G; Simmons, Leigh W

    2013-02-01

    Field crickets (family Gryllidae) frequently are used in studies of behavioral genetics, sexual selection, and sexual conflict, but there have been no studies of transcriptomic differences among different tissue types. We evaluated transcriptome variation among testis, accessory gland, and the remaining whole-body preparations from males of the field cricket, Teleogryllus oceanicus. Non-normalized cDNA libraries from each tissue were sequenced on the Roche 454 platform, and a master assembly was constructed using testis, accessory gland, and whole-body preparations. A total of 940,200 reads were assembled into 41,962 contigs, to which 36,856 singletons (reads not assembled into a contig) were added to provide a total of 78,818 sequences used in annotation analysis. A total of 59,072 sequences (75%) were unique to one of the three tissues. Testis tissue had the greatest proportion of tissue-specific sequences (62.6%), followed by general body (56.43%) and accessory gland tissue (44.16%). We tested the hypothesis that tissues expressing gene products expected to evolve rapidly as a result of sexual selection--testis and accessory gland--would yield a smaller proportion of BLASTx matches to homologous genes in the model organism Drosophila melanogaster compared with whole-body tissue. Uniquely expressed sequences in both testis and accessory gland showed a significantly lower rate of matching to annotated D. melanogaster genes compared with those from general body tissue. These results correspond with empirical evidence that genes expressed in testis and accessory gland tissue are rapidly evolving targets of selection.

  12. Diversity of Basidiomycetes in Michigan Agricultural Soils

    PubMed Central

    Lynch, Michael D. J.; Thorn, R. Greg

    2006-01-01

    We analyzed the communities of soil basidiomycetes in agroecosystems that differ in tillage history at the Kellogg Biological Station Long-Term Ecological Research site near Battle Creek, Michigan. The approach combined soil DNA extraction through a bead-beating method modified to increase recovery of fungal DNA, PCR amplification with basidiomycete-specific primers, cloning and restriction fragment length polymorphism screening of mixed PCR products, and sequencing of unique clones. Much greater diversity was detected than was anticipated in this habitat on the basis of culture-based methods or surveys of fruiting bodies. With “species” defined as organisms yielding PCR products with ≥99% identity in the 5′ 650 bases of the nuclear large-subunit ribosomal DNA, 241 “species” were detected among 409 unique basidiomycete sequences recovered. Almost all major clades of basidiomycetes from basidiomycetous yeasts and other heterobasidiomycetes through polypores and euagarics (gilled mushrooms and relatives) were represented, with a majority from the latter clade. Only 24 of 241 “species” had 99% or greater sequence similarity to named reference sequences in GenBank, and several clades with multiple “species” could not be identified at the genus level by phylogenetic comparisons with named sequences. The total estimated “species” richness for this 11.2-ha site was 367 “species” of basidiomycetes. Since >99% of the study area has not been sampled, the accuracy of our diversity estimate is uncertain. Replication in time and space is required to detect additional diversity and the underlying community structure. PMID:16950900
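
    The operational "species" definition above (sequences with ≥99% identity grouped together) can be illustrated with a greedy clustering sketch. The abstract does not say how the grouping was performed, so this is not the authors' procedure. The sketch assumes the sequences are already aligned to equal length, and all names and toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_cluster(sequences, threshold=0.99):
    """Assign each sequence to the first cluster whose representative it matches."""
    representatives = []
    clusters = []
    for seq in sequences:
        for idx, rep in enumerate(representatives):
            if identity(seq, rep) >= threshold:
                clusters[idx].append(seq)
                break
        else:
            # No representative is similar enough: start a new "species".
            representatives.append(seq)
            clusters.append([seq])
    return clusters


# Toy 200-base sequences invented for illustration.
base = "ACGT" * 50
one_mismatch = "T" + base[1:]        # 199/200 = 99.5% identity to base
three_mismatches = "TTT" + base[3:]  # 197/200 = 98.5% identity to base
print(len(greedy_cluster([base, one_mismatch, three_mismatches])))
```

    Greedy clustering depends on input order, because the first sequence in each cluster becomes its representative; dedicated clustering tools handle this more carefully.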

  13. Round and pointed-head grenadier fishes (Actinopterygii: Gadiformes) represent a single sister group: evidence from the complete mitochondrial genome sequences.

    PubMed

    Satoh, Takashi P; Miya, Masaki; Endo, Hiromitsu; Nishida, Mutsumi

    2006-07-01

    The gene order of mitochondrial genomes (mitogenomes) has been employed as a useful phylogenetic marker in various metazoan animals, because it may represent uniquely derived characters shared by members of monophyletic groups. During the course of molecular phylogenetic studies of the order Gadiformes (cods and their relatives) based on whole mitogenome sequences, we found that two deep-sea grenadiers (Squalogadus modificatus and Trachyrincus murrayi: family Macrouridae) shared an unusual, identical gene order (translocation of the tRNA(Leu (UUR))). Both are members of the same family, although their external morphologies differ so greatly (e.g., round vs. pointed head) that they have been placed in different subfamilies, Macrouroidinae and Trachyrincinae, respectively. Additionally, we determined the whole mitogenome sequences of two other species, Bathygadus antrodes and Ventrifossa garmani, representing a total of four subfamilies currently recognized within Macrouridae. The latter two species also exhibited gene rearrangements, resulting in a total of three different patterns of unique gene order being observed in the four subfamilies. Partitioned Bayesian analysis was conducted using available whole mitogenome sequences from five macrourids plus five outgroups. The resultant trees clearly indicated that S. modificatus and T. murrayi formed a monophyletic group, having a sister relationship to other macrourids. Thus, monophyly of the two species with disparate head morphologies was corroborated by two different lines of evidence (nucleotide sequences and gene order). The overall topology of the present tree differed from any of the previously proposed, morphology-based phylogenetic hypotheses.

  14. Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.

    PubMed

    Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong

    2015-06-09

    Differentiation of metazoan cells requires execution of different gene expression programs, but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats, consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than solely molecular noise.

  15. Bacterial CRISPR Regions: General Features and their Potential for Epidemiological Molecular Typing Studies

    PubMed Central

    Karimi, Zahra; Ahmadi, Ali; Najafi, Ali; Ranjbar, Reza

    2018-01-01

    Introduction: CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) loci, as novel and applicable regions in prokaryotic genomes, have attracted great attention in the post-genomics era. Methods: These unique regions are diverse in number and sequence composition in different pathogenic bacteria and thereby can be suitable candidates for molecular epidemiology and genotyping studies. Results: Furthermore, the arrayed structure of CRISPR loci (several unique repeats spaced with variable sequences) and the associated cas genes act as an active prokaryotic immune system against viral replication and conjugative elements. This property can be used as a tool for RNA editing in bioengineering studies. Conclusion: The aim of this review was to survey some details about the history, nature, and potential applications of CRISPR arrays in both genetic engineering and bacterial genotyping studies. PMID:29755603

  16. Further delineation of nonhomologous-based recombination and evidence for subtelomeric segmental duplications in 1p36 rearrangements.

    PubMed

    D'Angelo, Carla S; Gajecka, Marzena; Kim, Chong A; Gentles, Andrew J; Glotzbach, Caron D; Shaffer, Lisa G; Koiffmann, Célia P

    2009-06-01

    The mechanisms involved in the formation of subtelomeric rearrangements are now beginning to be elucidated. Breakpoint sequencing analysis of 1p36 rearrangements has made important contributions to this line of inquiry. Despite the unique architecture of segmental duplications inherent to human subtelomeres, no common mechanism has been identified thus far and different nonexclusive recombination-repair mechanisms seem to predominate. In order to gain further insights into the mechanisms of chromosome breakage, repair, and stabilization mediating subtelomeric rearrangements in humans, we investigated the constitutional rearrangements of 1p36. Cloning of the breakpoint junctions in a complex rearrangement and three non-reciprocal translocations revealed similarities at the junctions, such as microhomology of up to three nucleotides, along with no significant sequence identity in close proximity to the breakpoint regions. All the breakpoints appeared to be unique and their occurrence was limited to non-repetitive, unique DNA sequences. Several recombination- or cleavage-associated motifs that may promote non-homologous recombination were observed in close proximity to the junctions. We conclude that NHEJ is likely the mechanism of DNA repair that generates these rearrangements. Additionally, two apparently pure terminal deletions were also investigated, and the refinement of the breakpoint regions identified two distinct genomic intervals ~25-kb apart, each containing a series of 1p36 specific segmental duplications with 90-98% identity. Segmental duplications can serve as substrates for ectopic homologous recombination or stimulate genomic rearrangements.

  17. Isolation of an inhibitory insulin-like growth factor (IGF) binding protein from bone cell-conditioned medium: A potential local regulator of IGF action

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mohan, S.; Bautista, C.M.; Wergedal, J.

    1989-11-01

    Inhibitory insulin-like growth factor binding protein (In-IGF-BP) has been purified to homogeneity from medium conditioned by TE89 human osteosarcoma cells by two different methods using Sephadex G-100 gel filtration, FPLC Mono Q ion-exchange, HPLC C4 reverse-phase, HPLC CN reverse-phase and affinity chromatographies. In-IGF-BP thus purified appeared to be homogeneous and unique by the following criteria. (i) N-terminal sequence analysis yielded a unique sequence (Asp-Glu-Ala-Ile-His-Cys-Pro-Pro-Glu-Ser-Glu-Ala-Lys-Leu-Ala). (ii) Amino acid composition of In-IGF-BP revealed marked differences with the amino acid compositions of other known BPs. (iii) In-IGF-BP exhibited a single band with molecular mass of 25 kDa under reducing conditions on sodium dodecyl sulfate/polyacrylamide gels. IGF-I and IGF-II but not insulin displaced the binding of 125I-labeled IGF-I or 125I-labeled IGF-II to In-IGF-BP. In-IGF-BP inhibited basal, IGF-stimulated bone cell proliferation and serum-stimulated bone cell proliferation. Forskolin increases synthesis of In-IGF-BP in TE85 human osteosarcoma cells in a dose-dependent manner. Based on these findings, the authors conclude that In-IGF-BP is a protein that has a unique sequence and significant biological actions on bone cells.

  18. AK Sco: a tidally induced atmospheric dynamo in a pre-main sequence binary?

    NASA Astrophysics Data System (ADS)

    Gómez de Castro, A. I.

    2009-02-01

    AK Sco is a unique source: a 10-30 Myr old pre-main sequence spectroscopic binary composed of two nearly equal F5 stars that at periastron are separated by barely eleven stellar radii, so the stellar magnetospheres fill the Roche lobe at periastron. The orbit is not yet circularized (e = 0.47) and very strong tides are expected. This makes AK Sco an ideal laboratory to study the effect of gravitational tides on the build-up of the stellar magnetic field during pre-main sequence evolution. Evidence of this effect is reported in this contribution.

  19. The CRISPRdb database and tools to display CRISPRs and to generate dictionaries of spacers and repeats

    PubMed Central

    Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

    2007-01-01

    Background: In Archaea and Bacteria, the repeated elements called CRISPRs for "clustered regularly interspaced short palindromic repeats" are believed to participate in the defence against viruses. Short sequences called spacers are stored in-between repeated elements. In the current model, motifs comprising spacers and repeats may target an invading DNA and lead to its degradation through a proposed mechanism similar to RNA interference. Analysis of intra-species polymorphism shows that new motifs (one spacer and one repeated element) are added in a polarised fashion. Although their principal characteristics have been described, a lot remains to be discovered on the way CRISPRs are created and evolve. As new genome sequences become available it appears necessary to develop automated scanning tools to make available CRISPRs related information and to facilitate additional investigations. Description: We have produced a program, CRISPRFinder, which identifies CRISPRs and extracts the repeated and unique sequences. Using this software, a database is constructed which is automatically updated monthly from newly released genome sequences. Additional tools were created to allow the alignment of flanking sequences in search for similarities between different loci and to build dictionaries of unique sequences. To date, almost six hundred CRISPRs have been identified in 475 published genomes. Two Archaea out of thirty-seven and about half of Bacteria do not possess a CRISPR. Fine analysis of repeated sequences strongly supports the current view that new motifs are added at one end of the CRISPR adjacent to the putative promoter. Conclusion: It is hoped that availability of a public database, regularly updated and which can be queried on the web will help in further dissecting and understanding CRISPR structure and flanking sequences evolution. Subsequent analyses of the intra-species CRISPR polymorphism will be facilitated by CRISPRFinder and the dictionary creator. CRISPRdb is accessible at PMID:17521438

  20. A high-throughput venom-gland transcriptome for the Eastern Diamondback Rattlesnake (Crotalus adamanteus) and evidence for pervasive positive selection across toxin classes.

    PubMed

    Rokyta, Darin R; Wray, Kenneth P; Lemmon, Alan R; Lemmon, Emily Moriarty; Caudle, S Brian

    2011-04-01

    Despite causing considerable human mortality and morbidity, animal toxins represent a valuable source of pharmacologically active macromolecules, a unique system for studying molecular adaptation, and a powerful framework for examining structure-function relationships in proteins. Snake venoms are particularly useful in the latter regard as they consist primarily of a moderate number of proteins and peptides that have been found to belong to just a handful of protein families. As these proteins and peptides are produced in dedicated glands, transcriptome sequencing has proven to be an effective approach to identifying the expressed toxin genes. We generated a venom-gland transcriptome for the Eastern Diamondback Rattlesnake (Crotalus adamanteus) using Roche 454 sequencing technology. In the current work, we focus on transcripts encoding toxins. We identified 40 unique toxin transcripts, 30 of which have full-length coding sequences, and 10 have only partial coding sequences. These toxins account for 24% of the total sequencing reads. We found toxins from 11 previously described families of snake-venom toxins and have discovered two putative, previously undescribed toxin classes. The most diverse and highly expressed toxin classes in the C. adamanteus venom-gland transcriptome are the serine proteinases, metalloproteinases, and C-type lectins. The serine proteinases are the most abundant class, accounting for 35% of the toxin sequencing reads. Metalloproteinases are the most diverse; 11 different forms have been identified. Using our sequences and those available in public databases, we detected positive selection in seven of the eight toxin families for which sufficient sequences were available for the analysis. We find that the vast majority of the genes that contribute directly to this vertebrate trait show evidence for a role for positive selection in their evolutionary history. Copyright © 2011 Elsevier Ltd. All rights reserved.

  1. Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

    PubMed Central

    Logacheva, Maria D; Samigullin, Tahir H; Dhingra, Amit; Penin, Aleksey A

    2008-01-01

    Background: Chloroplast genome sequences are extremely informative about species interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. Results: We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome are similar to those of Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationships of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on the first two codon positions Amborella united with Nymphaeales. Conclusion: Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization as has been described for several other chloroplast genomes. However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. Phylogenetic analysis of the dataset, including this new sequence from non-core Caryophyllales, supports the sister relationship between Caryophyllales and asterids. PMID:18492277

  2. The Complete Genomic Sequence of Pepper Yellow Leaf Curl Virus (PYLCV) and Its Implications for Our Understanding of Evolution Dynamics in the Genus Polerovirus

    PubMed Central

    Dombrovsky, Aviv; Glanz, Eyal; Lachman, Oded; Sela, Noa; Doron-Faigenboim, Adi; Antignus, Yehezkel

    2013-01-01

    We determined the complete sequence and organization of the genome of a putative member of the genus Polerovirus tentatively named Pepper yellow leaf curl virus (PYLCV). PYLCV has a wider host range than Tobacco vein-distorting virus (TVDV) and has a close serological relationship with Cucurbit aphid-borne yellows virus (CABYV) (both poleroviruses). The extracted viral RNA was subjected to SOLiD next-generation sequence analysis and used as a template for reverse transcription synthesis, which was followed by PCR amplification. The ssRNA genome of PYLCV includes 6,028 nucleotides encoding six open reading frames (ORFs), which is typical of the genus Polerovirus. Comparisons of the deduced amino acid sequences of the PYLCV ORFs 2-4 and ORF5 indicate high levels of similarity to ORFs 2-4 of TVDV (84-93%) and to ORF5 of CABYV (87%). Both PYLCV and Pepper vein yellowing virus (PeVYV) contain sequences that point to a common ancestral polerovirus. The recombination breakpoint, located at CABYV ORF3, which encodes the viral coat protein (CP), may explain the CABYV-like sequences found in the genomes of the pepper-infecting viruses PYLCV and PeVYV. Two additional regions unique to PYLCV (PY1 and PY2) were identified between nucleotides 4,962 and 5,061 (ORF 5) and between positions 5,866 and 6,028 in the 3' NCR. Sequence analysis of the pepper-infecting PeVYV revealed three unique regions (Pe1-Pe3) with no similarity to other members of the genus Polerovirus. Genomic analyses of PYLCV and PeVYV suggest that the speciation of these viruses occurred through putative recombination event(s) between poleroviruses co-infecting a common host(s), resulting in the emergence of PYLCV, a novel pathogen with a wider host range. PMID:23936244

  3. The complete genomic sequence of pepper yellow leaf curl virus (PYLCV) and its implications for our understanding of evolution dynamics in the genus polerovirus.

    PubMed

    Dombrovsky, Aviv; Glanz, Eyal; Lachman, Oded; Sela, Noa; Doron-Faigenboim, Adi; Antignus, Yehezkel

    2013-01-01

    We determined the complete sequence and organization of the genome of a putative member of the genus Polerovirus tentatively named Pepper yellow leaf curl virus (PYLCV). PYLCV has a wider host range than Tobacco vein-distorting virus (TVDV) and has a close serological relationship with Cucurbit aphid-borne yellows virus (CABYV) (both poleroviruses). The extracted viral RNA was subjected to SOLiD next-generation sequence analysis and used as a template for reverse transcription synthesis, which was followed by PCR amplification. The ssRNA genome of PYLCV includes 6,028 nucleotides encoding six open reading frames (ORFs), which is typical of the genus Polerovirus. Comparisons of the deduced amino acid sequences of the PYLCV ORFs 2-4 and ORF5 indicate high levels of similarity to ORFs 2-4 of TVDV (84-93%) and to ORF5 of CABYV (87%). Both PYLCV and Pepper vein yellowing virus (PeVYV) contain sequences that point to a common ancestral polerovirus. The recombination breakpoint, located at CABYV ORF3, which encodes the viral coat protein (CP), may explain the CABYV-like sequences found in the genomes of the pepper-infecting viruses PYLCV and PeVYV. Two additional regions unique to PYLCV (PY1 and PY2) were identified between nucleotides 4,962 and 5,061 (ORF 5) and between positions 5,866 and 6,028 in the 3' NCR. Sequence analysis of the pepper-infecting PeVYV revealed three unique regions (Pe1-Pe3) with no similarity to other members of the genus Polerovirus. Genomic analyses of PYLCV and PeVYV suggest that the speciation of these viruses occurred through putative recombination event(s) between poleroviruses co-infecting a common host(s), resulting in the emergence of PYLCV, a novel pathogen with a wider host range.

  4. The Unique hmuY Gene Sequence as a Specific Marker of Porphyromonas gingivalis

    PubMed Central

    Mackiewicz, Paweł; Radwan-Oczko, Małgorzata; Kantorowicz, Małgorzata; Chomyszyn-Gajewska, Maria; Frąszczak, Magdalena; Bielecki, Marcin; Olczak, Mariusz; Olczak, Teresa

    2013-01-01

    Porphyromonas gingivalis, a major etiological agent of chronic periodontitis, acquires heme from host hemoproteins using the HmuY hemophore. The aim of this study was to develop a specific P. gingivalis marker based on a hmuY gene sequence. Subgingival samples were collected from 66 patients with chronic periodontitis and 40 healthy subjects and the entire hmuY gene was analyzed in positive samples. Phylogenetic analyses demonstrated that both the amino acid sequence of the HmuY protein and the nucleotide sequence of the hmuY gene are unique among P. gingivalis strains/isolates and show low identity to sequences found in other species (below 50 and 56%, respectively). In agreement with these findings, a set of hmuY gene-based primers and standard/real-time PCR with SYBR Green chemistry allowed us to specifically detect P. gingivalis in patients with chronic periodontitis (77.3%) and healthy subjects (20%), the latter possessing a lower number of P. gingivalis cells and total bacterial cells. Isolates from healthy subjects possess the hmuY gene-based nucleotide sequence pattern occurring in W83/W50/A7436 (n = 4), 381/ATCC 33277 (n = 3) or TDC60 (n = 1) strains, whereas those from patients typically have TDC60 (n = 21), W83/W50/A7436 (n = 17) and 381/ATCC 33277 (n = 13) strains. We observed a significant correlation between periodontal index of risk of infectiousness (PIRI) and the presence/absence of P. gingivalis (regardless of the hmuY gene-based sequence pattern of the isolate identified [r = 0.43; P = 0.0002] and considering particular isolate pattern [r = 0.38; P = 0.0012]). In conclusion, we demonstrated that the hmuY gene sequence or its fragments may be used as one of the molecular markers of P. gingivalis. PMID:23844074

  5. Pestoides F, an atypical Yersinia pestis strain from the former Soviet Union.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia, Emilio; Worsham, Patricia; Bearden, S.

    2007-01-01

    Unlike the classical Yersinia pestis strains, members of an atypical group of Y. pestis from Central Asia, denominated Y. pestis subspecies caucasica (also known as one of several pestoides types), are distinguished by a number of characteristics including their ability to ferment rhamnose and melibiose, their lack of the small plasmid encoding the plasminogen activator (pla) and pesticin, and their exceptionally large variants of the virulence plasmid pMT (encoding murine toxin and capsular antigen). We have obtained the entire genome sequence of Y. pestis Pestoides F, an isolate from the former Soviet Union, which has enabled us to carry out a comprehensive genome-wide comparison of this organism's genomic content against the six published sequences of Y. pestis and their Y. pseudotuberculosis ancestor. Based on classical glycerol fermentation (+ve) and nitrate reduction (+ve), Y. pestis Pestoides F is an isolate that belongs to the biovar antiqua. This strain is unusual in other characteristics such as the fact that it carries a non-consensus V antigen (lcrV) sequence, and that unlike other Pla(-) strains, Pestoides F retains virulence by the parenteral and aerosol routes. The chromosome of Pestoides F is 4,517,345 bp in size comprising some 3,936 predicted coding sequences, while its pCD and pMT plasmids are 71,507 bp and 137,010 bp in size respectively. Comparison of chromosome-associated genes in Pestoides F with those in the other sequenced Y. pestis strains reveals differences ranging from strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. There is a single approximately 7 kb unique region in the chromosome not found in any of the completed Y. pestis strains sequenced to date, but which is present in the Y. pseudotuberculosis ancestor. Taken together, these findings are consistent with Pestoides F being derived from the most ancient lineage of Y. pestis yet sequenced.

  6. Pestoides F, an Atypical Yersinia pestis Strain from the Former Soviet Union

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garcia, E; Worsham, P; Bearden, S

    2007-01-05

    Unlike the classical Yersinia pestis strains, members of an atypical group of Y. pestis from Central Asia, denominated Y. pestis subspecies caucasica (also known as one of several pestoides types), are distinguished by a number of characteristics including their ability to ferment rhamnose and melibiose, their lack of the small plasmid encoding the plasminogen activator (pla) and pesticin, and their exceptionally large variants of the virulence plasmid pMT (encoding murine toxin and capsular antigen). We have obtained the entire genome sequence of Y. pestis Pestoides F, an isolate from the former Soviet Union, which has enabled us to carry out a comprehensive genome-wide comparison of this organism's genomic content against the six published sequences of Y. pestis and their Y. pseudotuberculosis ancestor. Based on classical glycerol fermentation (+ve) and nitrate reduction (+ve), Y. pestis Pestoides F is an isolate that belongs to the biovar antiqua. This strain is unusual in other characteristics such as the fact that it carries a non-consensus V antigen (lcrV) sequence, and that unlike other Pla(-) strains, Pestoides F retains virulence by the parenteral and aerosol routes. The chromosome of Pestoides F is 4,517,345 bp in size comprising some 3,936 predicted coding sequences, while its pCD and pMT plasmids are 71,507 bp and 137,010 bp in size respectively. Comparison of chromosome-associated genes in Pestoides F with those in the other sequenced Y. pestis strains reveals a series of differences ranging from strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. There is a single approximately 7 kb unique region in the chromosome not found in any of the completed Y. pestis strains sequenced to date, but which is present in the Y. pseudotuberculosis ancestor. Taken together, these findings are consistent with Pestoides F being derived from the most ancient lineage of Y. pestis yet sequenced.

  7. A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1

    PubMed Central

    Reisman, Steven; Hatzopoulos, Thomas; Läufer, Konstantin; Thiruvathukal, George K.; Putonti, Catherine

    2016-01-01

    As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences, revealing that spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated despite increased globalization. The approach developed here can easily be customized for any species of interest. PMID:26819543

  8. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  9. Classifying Noisy Protein Sequence Data: A Case Study of Immunoglobulin Light Chains

    DTIC Science & Technology

    2005-01-01

    collected from patients with and without amyloidosis, and indicates that the proposed modified classifiers are more robust to sequence variability than...piled from patients with and without amyloidosis provides unique features to serve as a model system, not only for conformational disease studies but...produced by patients with amyloidosis. SVMs have been used recently in a wide variety of applications in computational biology (Noble, 2004). Most

  10. The Placenta in Twin-to-Twin Transfusion Syndrome and Twin Anemia Polycythemia Sequence.

    PubMed

    Couck, Isabel; Lewi, Liesbeth

    2016-06-01

    Twin-to-twin transfusion syndrome (TTTS) and twin anemia polycythemia sequence (TAPS) are complications unique to monochorionic twin pregnancies and their shared circulation. Both are the result of the transfusion imbalance in the intertwin circulation. TTTS is characterized by an amniotic fluid discordance, whereas in TAPS, there is a severe discordance in hemoglobin levels. The article gives an overview of the typical features of TTTS and TAPS placentas.

  11. Species specific identification of spore-producing microbes using the gene sequence of small acid-soluble spore coat proteins for amplification based diagnostics

    DOEpatents

    McKinney, Nancy

    2002-01-01

    PCR (polymerase chain reaction) primers for the detection of certain Bacillus species, such as Bacillus anthracis. The primers specifically amplify only DNA found in the target species and can distinguish closely related species. Species-specific PCR primers for Bacillus anthracis, Bacillus globigii and Clostridium perfringens are disclosed. The primers are directed to unique sequences within sasp (small acid soluble protein) genes.

  12. The sRNAome mining revealed existence of unique signature small RNAs derived from 5.8SrRNA from Piper nigrum and other plant lineages.

    PubMed

    Asha, Srinivasan; Soniya, E V

    2017-02-01

    Small RNAs derived from ribosomal RNAs (srRNAs) are rarely explored in the high-throughput data of plant systems. Here, we analyzed srRNAs from the deep-sequenced small RNA libraries of Piper nigrum, a unique magnoliid plant. The 5' end of the putative long form of 5.8S rRNA (5.8S L rRNA) was identified as the site for biogenesis of highly abundant srRNAs that are unique among the Piperaceae family of plants. A subsequent comparative analysis of the ninety-seven sRNAomes of diverse plants successfully uncovered the abundant existence and precise cleavage of unique rRF signature small RNAs upstream of a novel 5' consensus sequence of the 5.8S rRNA. The major cleavage process mapped identically among the different tissues of the same plant. The differential expression and cleavage of 5'5.8S srRNAs in Phytophthora capsici infected P. nigrum tissues indicated the critical biological functions of these srRNAs during stress response. The non-canonical short hairpin precursor structure, the association with Argonaute proteins, and the potential targets of 5'5.8S srRNAs reinforced their regulatory role in the RNAi pathway in plants. In addition, these novel lineage-specific small RNAs may have tremendous biological potential in the taxonomic profiling of plants.

  13. The sRNAome mining revealed existence of unique signature small RNAs derived from 5.8SrRNA from Piper nigrum and other plant lineages

    PubMed Central

    Asha, Srinivasan; Soniya, E. V.

    2017-01-01

    Small RNAs derived from ribosomal RNAs (srRNAs) are rarely explored in the high-throughput data of plant systems. Here, we analyzed srRNAs from the deep-sequenced small RNA libraries of Piper nigrum, a unique magnoliid plant. The 5′ end of the putative long form of 5.8S rRNA (5.8SLrRNA) was identified as the site for biogenesis of highly abundant srRNAs that are unique among the Piperaceae family of plants. A subsequent comparative analysis of the ninety-seven sRNAomes of diverse plants successfully uncovered the abundant existence and precise cleavage of unique rRF signature small RNAs upstream of a novel 5′ consensus sequence of the 5.8S rRNA. The major cleavage process mapped identically among the different tissues of the same plant. The differential expression and cleavage of 5′5.8S srRNAs in Phytophthora capsici infected P. nigrum tissues indicated the critical biological functions of these srRNAs during stress response. The non-canonical short hairpin precursor structure, the association with Argonaute proteins, and the potential targets of 5′5.8S srRNAs reinforced their regulatory role in the RNAi pathway in plants. In addition, these novel lineage-specific small RNAs may have tremendous biological potential in the taxonomic profiling of plants. PMID:28145468

  14. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata.

    PubMed

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-07

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem-loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping-pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration.

  15. Deep sequencing reveals unique small RNA repertoire that is regulated during head regeneration in Hydra magnipapillata

    PubMed Central

    Krishna, Srikar; Nair, Aparna; Cheedipudi, Sirisha; Poduval, Deepak; Dhawan, Jyotsna; Palakodeti, Dasaradhi; Ghanekar, Yashoda

    2013-01-01

    Small non-coding RNAs such as miRNAs, piRNAs and endo-siRNAs fine-tune gene expression through post-transcriptional regulation, modulating important processes in development, differentiation, homeostasis and regeneration. Using deep sequencing, we have profiled small non-coding RNAs in Hydra magnipapillata and investigated changes in small RNA expression pattern during head regeneration. Our results reveal a unique repertoire of small RNAs in hydra. We have identified 126 miRNA loci; 123 of these miRNAs are unique to hydra. Less than 50% are conserved across two different strains of Hydra vulgaris tested in this study, indicating a highly diverse nature of hydra miRNAs in contrast to bilaterian miRNAs. We also identified siRNAs derived from precursors with perfect stem–loop structure and that arise from inverted repeats. piRNAs were the most abundant small RNAs in hydra, mapping to transposable elements, the annotated transcriptome and unique non-coding regions on the genome. piRNAs that map to transposable elements and the annotated transcriptome display a ping–pong signature. Further, we have identified several miRNAs and piRNAs whose expression is regulated during hydra head regeneration. Our study defines different classes of small RNAs in this cnidarian model system, which may play a role in orchestrating gene expression essential for hydra regeneration. PMID:23166307

  16. The feoABC Locus of Yersinia pestis Likely Has Two Promoters Causing Unique Iron Regulation

    PubMed Central

    O'Connor, Lauren; Fetherston, Jacqueline D.; Perry, Robert D.

    2017-01-01

    The FeoABC ferrous transporter is a widespread bacterial system. While the feoABC locus is regulated by a number of factors in the bacteria studied, we have previously found that regulation of feoABC in Yersinia pestis appears to be unique. None of the non-iron responsive transcriptional regulators that control expression of feoABC in other bacteria do so in Y. pestis. Another unique factor is that iron and Fur regulation of the Y. pestis feoABC locus occurs during microaerobic but not aerobic growth. Here we show that this unique iron-regulation is not due to a unique aspect of the Y. pestis Fur protein but to DNA sequences that regulate transcription. We have used truncations, alterations, and deletions of the feoA::lacZ reporter to assess the mechanism behind the failure of iron to repress transcription under aerobic conditions. These studies plus EMSAs and DNA sequence analysis have led to our proposal that the feoABC locus has two promoters: an upstream P1 promoter whose expression is relatively iron-independent but repressed under microaerobic conditions and the known downstream Fur-regulated P2 promoter. In addition, we have identified two regions that bind Y. pestis protein(s), although we have not identified these protein(s) or their function. Finally, we used iron uptake assays to demonstrate that both FeoABC and YfeABCD transport ferrous iron in an energy-dependent manner and also use ferric iron as a substrate for uptake. PMID:28785546

  17. The Medicago sativa gene index 1.2: a web-accessible gene expression atlas for investigating expression differences between Medicago sativa subspecies.

    PubMed

    O'Rourke, Jamie A; Fu, Fengli; Bucciarelli, Bruna; Yang, S Sam; Samac, Deborah A; Lamb, JoAnn F S; Monteros, Maria J; Graham, Michelle A; Gronwald, John W; Krom, Nick; Li, Jun; Dai, Xinbin; Zhao, Patrick X; Vance, Carroll P

    2015-07-07

    Alfalfa (Medicago sativa L.) is the primary forage legume crop species in the United States and plays essential economic and ecological roles in agricultural systems across the country. Modern alfalfa is the result of hybridization between tetraploid M. sativa ssp. sativa and M. sativa ssp. falcata. Due to its large and complex genome, there are few genomic resources available for alfalfa improvement. A de novo transcriptome assembly from two alfalfa subspecies, M. sativa ssp. sativa (B47) and M. sativa ssp. falcata (F56), was developed using Illumina RNA-seq technology. Transcripts from roots, nitrogen-fixing root nodules, leaves, flowers, elongating stem internodes, and post-elongation stem internodes were assembled into the Medicago sativa Gene Index 1.2 (MSGI 1.2) representing 112,626 unique transcript sequences. Nodule-specific transcripts and transcripts involved in cell wall biosynthesis were identified. Statistical analyses identified 20,447 transcripts differentially expressed between the two subspecies. Pair-wise comparisons of each tissue combination identified 58,932 sequences differentially expressed in B47 and 69,143 sequences differentially expressed in F56. Comparing transcript abundance in floral tissues of B47 and F56 identified expression differences in sequences involved in anthocyanin and carotenoid synthesis, which determine flower pigmentation. Single nucleotide polymorphisms (SNPs) unique to each M. sativa subspecies (110,241) were identified. The Medicago sativa Gene Index 1.2 increases the expressed sequence data available for alfalfa by ninefold and can be expanded as additional experiments are performed. The MSGI 1.2 transcriptome sequences, annotations, expression profiles, and SNPs were assembled into the Alfalfa Gene Index and Expression Database (AGED) at http://plantgrn.noble.org/AGED/, a publicly available genomic resource for alfalfa improvement and legume research.

  18. The first complete chloroplast genome sequence of a lycophyte, Huperzia lucidula (Lycopodiaceae)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wolf, Paul G.; Karol, Kenneth G.; Mandoli, Dina F.

    2005-02-01

    We used a unique combination of techniques to sequence the first complete chloroplast genome of a lycophyte, Huperzia lucidula. This plant belongs to a significant clade hypothesized to represent the sister group to all other vascular plants. We used fluorescence-activated cell sorting (FACS) to isolate the organelles, rolling circle amplification (RCA) to amplify the genome, and shotgun sequencing to 8x depth coverage to obtain the complete chloroplast genome sequence. The genome is 154,373 bp, containing inverted repeats of 15,314 bp each, a large single-copy region of 104,088 bp, and a small single-copy region of 19,671 bp. Gene order is more similar to those of mosses, liverworts, and hornworts than to gene order for other vascular plants. For example, the Huperzia chloroplast genome possesses the bryophyte gene order for a previously characterized 30 kb inversion, thus supporting the hypothesis that lycophytes are sister to all other extant vascular plants. The lycophyte chloroplast genome data also enable a better reconstruction of the basal tracheophyte genome, which is useful for inferring relationships among bryophyte lineages. Several unique characters are observed in Huperzia, such as movement of the gene ndhF from the small single copy region into the inverted repeat. We present several analyses of evolutionary relationships among land plants by using nucleotide data, amino acid sequences, and by comparing gene arrangements from chloroplast genomes. The results, while still tentative pending the large number of chloroplast genomes from other key lineages that are soon to be sequenced, are intriguing in themselves, and contribute to a growing comparative database of genomic and morphological data across the green plants.

  19. Russell body inducing threshold depends on the variable domain sequences of individual human IgG clones and the cellular protein homeostasis.

    PubMed

    Stoops, Janelle; Byrd, Samantha; Hasegawa, Haruki

    2012-10-01

    Russell bodies are intracellular aggregates of immunoglobulins. Although the mechanism of Russell body biogenesis has been extensively studied by using truncated mutant heavy chains, the importance of the variable domain sequences in this process and in immunoglobulin biosynthesis remains largely unknown. Using a panel of structurally and functionally normal human immunoglobulin Gs, we show that individual immunoglobulin G clones possess distinctive Russell body inducing propensities that can surface differently under normal and abnormal cellular conditions. Russell body inducing predisposition unique to each immunoglobulin G clone was corroborated by the intrinsic physicochemical properties encoded in the heavy chain variable domain/light chain variable domain sequence combinations that define each immunoglobulin G clone. While the sequence based intrinsic factors predispose certain immunoglobulin G clones to be more prone to induce Russell bodies, extrinsic factors such as stressful cell culture conditions also play roles in unmasking Russell body propensity from immunoglobulin G clones that are normally refractory to developing Russell bodies. By taking advantage of heterologous expression systems, we dissected the roles of individual subunit chains in Russell body formation and examined the effect of non-cognate subunit chain pair co-expression on Russell body forming propensity. The results suggest that the properties embedded in the variable domain of individual light chain clones and their compatibility with the partnering heavy chain variable domain sequences underscore the efficiency of immunoglobulin G biosynthesis, the threshold for Russell body induction, and the level of immunoglobulin G secretion. 
    We propose that an interplay between the unique properties encoded in variable domain sequences and the state of protein homeostasis determines whether an immunoglobulin G expressing cell will develop the Russell body phenotype in a dynamic cellular setting. Copyright © 2012 Elsevier B.V. All rights reserved.

  20. Enrichment allows identification of diverse, rare elements in metagenomic resistome-virulome sequencing.

    PubMed

    Noyes, Noelle R; Weinroth, Maggie E; Parker, Jennifer K; Dean, Chris J; Lakin, Steven M; Raymond, Robert A; Rovira, Pablo; Doster, Enrique; Abdo, Zaid; Martin, Jennifer N; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina A; Belk, Keith E; Morley, Paul S

    2017-10-17

    Shotgun metagenomic sequencing is increasingly utilized as a tool to evaluate ecological-level dynamics of antimicrobial resistance and virulence, in conjunction with microbiome analysis. Interest in use of this method for environmental surveillance of antimicrobial resistance and pathogenic microorganisms is also increasing. In published metagenomic datasets, the total of all resistance- and virulence-related sequences accounts for < 1% of all sequenced DNA, leading to limitations in detection of low-abundance resistome-virulome elements. This study describes the extent and composition of the low-abundance portion of the resistome-virulome, using a bait-capture and enrichment system that incorporates unique molecular indices to count DNA molecules and correct for enrichment bias. The use of the bait-capture and enrichment system significantly increased on-target sequencing of the resistome-virulome, enabling detection of an additional 1441 gene accessions and revealing a low-abundance portion of the resistome-virulome that was more diverse and compositionally different than that detected by more traditional metagenomic assays. The low-abundance portion of the resistome-virulome also contained resistance genes with public health importance, such as extended-spectrum betalactamases, that were not detected using traditional shotgun metagenomic sequencing. In addition, the use of the bait-capture and enrichment system enabled identification of rare resistance gene haplotypes that were used to discriminate between sample origins. These results demonstrate that the rare resistome-virulome contains valuable and unique information that can be utilized for both surveillance and population genetic investigations of resistance. Access to the rare resistome-virulome using the bait-capture and enrichment system validated in this study can greatly advance our understanding of microbiome-resistome dynamics.

  1. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from the assembly of the full catfish EST dataset will greatly benefit the catfish introgression breeding program and whole genome association studies.
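
    The high-quality SNP criterion quoted in this abstract (at least four sequences, with the minor allele present in at least two of them) can be sketched as a simple filter. This is only an illustration, not the consortium's pipeline: the function name, the per-site dictionary of allele counts, and the reading of "at least four sequences" as the depth at the SNP site are all assumptions.

```python
def is_high_quality_snp(allele_counts, min_sequences=4, min_minor=2):
    """Hypothetical filter for one candidate SNP site.

    allele_counts maps each observed base to the number of ESTs in the
    contig that carry it at this site (an assumed input format).
    """
    depth = sum(allele_counts.values())
    if depth < min_sequences or len(allele_counts) < 2:
        return False
    # The minor allele is taken as the second most frequent base.
    ordered = sorted(allele_counts.values(), reverse=True)
    return ordered[1] >= min_minor


candidates = {
    "site_1": {"A": 3, "G": 2},  # depth 5, minor allele seen twice
    "site_2": {"A": 3, "G": 1},  # minor allele seen only once
    "site_3": {"C": 2, "T": 1},  # fewer than four sequences
}
kept = [name for name, counts in candidates.items() if is_high_quality_snp(counts)]
print(kept)
```

    Requiring the minor allele in two or more independent sequences is a cheap guard against calling a single sequencing error as a polymorphism.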

  2. Developing JSequitur to Study the Hierarchical Structure of Biological Sequences in a Grammatical Inference Framework of String Compression Algorithms.

    PubMed

    Galbadrakh, Bulgan; Lee, Kyung-Eun; Park, Hyun-Seok

    2012-12-01

    Grammatical inference methods are expected to find grammatical structures hidden in biological sequences. One hopes that studies of grammar serve as an appropriate tool for theory formation. Thus, we have developed JSequitur for automatically generating the grammatical structure of biological sequences in an inference framework of string compression algorithms. Our original motivation was to find any grammatical traits of several cancer genes that can be detected by string compression algorithms. Through this research, we have not yet found any meaningful unique traits of the cancer genes, but we could observe some interesting traits with regard to the relationship among gene length, similarity of sequences, the patterns of the generated grammar, and compression rate.

  3. Isolates of viral hemorrhagic septicemia virus from North America and Europe can be detected and distinguished by DNA probes

    USGS Publications Warehouse

    Batts, W.N.; Arakawa, C.K.; Bernard, J.; Winton, J.R.

    1993-01-01

    Biotinylated DNA probes were constructed to hybridize with specific sequences within the messenger RNA (mRNA) of the nucleoprotein (N) gene of viral hemorrhagic septicemia virus (VHSV) reference strains from Europe (07-71) and North America (Makah). Probes were synthesized that were complementary to (1) a 29-nucleotide sequence near the center of the N gene common to both the 07-71 and Makah reference strains of the virus, (2) a unique 28-nucleotide sequence that followed the open reading frame of the Makah N gene mRNA, most of which was absent in the 07-71 strain, and (3) a 22-nucleotide sequence within the 07-71 N gene that had 6 mismatches ...

  4. Identification of maca (Lepidium meyenii Walp.) and its adulterants by a DNA-barcoding approach based on the ITS sequence.

    PubMed

    Chen, Jin-Jin; Zhao, Qing-Sheng; Liu, Yi-Lan; Zha, Sheng-Hua; Zhao, Bing

    2015-09-01

    Maca (Lepidium meyenii) is an herbaceous plant that grows in high plateaus and has been used as both food and folk medicine for centuries because of its benefits to human health. In the present study, ITS (internal transcribed spacer) sequences of forty-three maca samples, collected from different regions or vendors, were amplified and analyzed. The ITS sequences of nineteen potential adulterants of maca were also collected and analyzed. The results indicated that the ITS sequence of maca was consistent in all samples and unique when compared with its adulterants. Therefore, this DNA-barcoding approach based on the ITS sequence can be used for the molecular identification of maca and its adulterants. Copyright © 2015 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.

  5. Helicobacter spp. from captive bottlenose dolphins (Tursiops spp.) and polar bears (Ursus maritimus).

    PubMed

    Oxley, Andrew P A; Argo, Jeffrey A; McKay, David B

    2005-11-01

    The gastric fluid of six bottlenose dolphins and the faeces of four polar bears from the same oceanarium were examined for the presence of Helicobacter. As detected by PCR, all dolphins and 8/12 samples collected from polar bears were positive for Helicobacter. Novel sequence types were identified in samples collected from these animals of which several were unique to either the dolphins or the polar bears. At least one sequence type was, however, detected in both animal taxa. In addition, a sequence type from a dolphin shared a 98.2-100% identity to sequences from other Helicobacter species from harp seals, sea otters and sea lions. This study reports on the occurrence of novel Helicobacter sequence types in polar bears and dolphins and demonstrates the broad-host range of some species within these animals.

  6. Nucleotide sequences of Japanese isolates of citrus vein enation virus.

    PubMed

    Nakazono-Nagaoka, Eiko; Fujikawa, Takashi; Iwanami, Toru

    2017-03-01

    The genomic sequences of five Japanese isolates of citrus vein enation virus (CVEV) that induce vein enation were determined and compared with that of the Spanish isolate VE-1. The nucleotide sequences of all Japanese isolates were 5,983 nt in length. The genomic RNA of Japanese isolates had five potential open reading frames (ORF 0, ORF 1, ORF 2, ORF 3, and ORF 5) in the positive-sense strand. The nucleotide sequence identity among the Japanese isolates and Spanish isolate VE-1 ranged from 98.0% to 99.8%. Comparison of the partial amino acid sequences of ten Japanese isolates and three Spanish isolates suggested that four amino acid residues, at positions 83, 104, and 113 in ORF 2 and position 41 in ORF 5, might be unique to some Japanese isolates.

  7. Mapping DNA polymerase errors by single-molecule sequencing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, David F.; Lu, Jenny; Chang, Seungwoo

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.
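
    The barcoding idea in this abstract, comparing several reads of the same tagged product so that sequencing errors can be voted out, can be sketched as a per-barcode majority consensus. This is a minimal sketch and not the authors' software: the function name, the (barcode, sequence) input format, the `min_reads` cutoff, and the assumption that reads sharing a barcode are already aligned and of equal length are all hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3):
    """Build one consensus sequence per barcode by column-wise majority vote.

    reads is an iterable of (barcode, sequence) pairs (assumed format).
    Barcodes with fewer than min_reads reads, or with reads of unequal
    length, are skipped rather than guessed at.
    """
    groups = defaultdict(list)
    for barcode, sequence in reads:
        groups[barcode].append(sequence)

    consensus = {}
    for barcode, sequences in groups.items():
        if len(sequences) < min_reads:
            continue
        if len({len(s) for s in sequences}) != 1:
            continue
        # zip(*sequences) yields one tuple of bases per position.
        consensus[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return consensus


reads = [
    ("AAT", "ACGT"),
    ("AAT", "ACGT"),
    ("AAT", "ACCT"),  # one read carries a sequencing error at position 3
    ("GGC", "TTTT"),  # only one read for this barcode, so it is dropped
]
print(consensus_by_barcode(reads))
```

    A base that differs in only one read of a barcode family is treated as a sequencing error, whereas a change present in every read of the family is a candidate polymerase error made before amplification.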

  8. A DNA sequence obtained by replacement of the dopamine RNA aptamer bases is not an aptamer.

    PubMed

    Álvarez-Martos, Isabel; Ferapontova, Elena E

    2017-08-05

    A unique specificity of the aptamer-ligand biorecognition and binding facilitates bioanalysis and biosensor development, contributing to discrimination of structurally related molecules, such as dopamine and other catecholamine neurotransmitters. The aptamer sequence capable of specific binding of dopamine is a 57-nucleotide-long RNA sequence reported in 1997 (Biochemistry, 1997, 36, 9726). Later, it was suggested that the DNA homologue of the RNA aptamer retains the specificity of dopamine binding (Biochem. Biophys. Res. Commun., 2009, 388, 732). Here, we show that the DNA sequence obtained by the replacement of the RNA aptamer bases with their DNA analogues is not capable of specific biorecognition of dopamine, in contrast to the original RNA aptamer sequence. This DNA sequence binds dopamine and structurally related catecholamine neurotransmitters non-specifically, as any DNA sequence does, and thus is not an aptamer and cannot be used for either in vivo or in situ analysis of dopamine in the presence of structurally related neurotransmitters. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Mapping DNA polymerase errors by single-molecule sequencing

    DOE PAGES

    Lee, David F.; Lu, Jenny; Chang, Seungwoo; ...

    2016-05-16

    Genomic integrity is compromised by DNA polymerase replication errors, which occur in a sequence-dependent manner across the genome. Accurate and complete quantification of a DNA polymerase's error spectrum is challenging because errors are rare and difficult to detect. We report a high-throughput sequencing assay to map in vitro DNA replication errors at the single-molecule level. Unlike previous methods, our assay is able to rapidly detect a large number of polymerase errors at base resolution over any template substrate without quantification bias. To overcome the high error rate of high-throughput sequencing, our assay uses a barcoding strategy in which each replication product is tagged with a unique nucleotide sequence before amplification. Here, this allows multiple sequencing reads of the same product to be compared so that sequencing errors can be found and removed. We demonstrate the ability of our assay to characterize the average error rate, error hotspots and lesion bypass fidelity of several DNA polymerases.

  10. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE PAGES

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...

    2017-07-18

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  11. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  12. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    PubMed

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. 
    Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been laterally transferred between these groups. Other results and observations reported here indicate that the genes for the BchL-N-B proteins in Proteobacteria are derived from the Clade C Cyanobacteria, whereas those in Chlorobi were acquired from Chloroflexus or related bacteria by means of LGTs. Some implications of these observations regarding the origin and spread of photosynthesis are discussed.

  13. Identification of reproduction-related genes and SSR-markers through expressed sequence tags analysis of a monsoon breeding carp rohu, Labeo rohita (Hamilton).

    PubMed

    Sahu, Dinesh K; Panda, Soumya P; Panda, Sujata; Das, Paramananda; Meher, Prem K; Hazra, Rupenangshu K; Peatman, Eric; Liu, Zhanjiang J; Eknath, Ambekar E; Nandi, Samiran

    2013-07-15

    Labeo rohita (Ham.), also called rohu, is the most important freshwater aquaculture species on the Indian subcontinent. Monsoon dependent breeding restricts its seed production beyond the season, indicating strong genetic control about which very limited information is available. Additionally, few genomic resources are publicly available for this species. Here we sought to identify reproduction-relevant genes from normalized cDNA libraries of the brain-pituitary-gonad-liver (BPGL-axis) tissues of adult L. rohita collected during post preparatory phase. 6161 random clones sequenced (Sanger-based) from these libraries produced 4642 (75.34%) high-quality sequences. They were assembled into 3631 (78.22%) unique sequences composed of 709 contigs and 2922 singletons. A total of 182 unique sequences were found to be associated with reproduction-related genes, mainly under the GO term categories of reproduction, neuro-peptide hormone activity, hormone and receptor binding, receptor activity, signal transduction, embryonic development, cell-cell signaling, cell death and anti-apoptosis process. Several important reproduction-related genes reported here for the first time in L. rohita are zona pellucida sperm-binding protein 3, aquaporin-12, spermine oxidase, sperm associated antigen 7, testis expressed 261, progesterone receptor membrane component, Neuropeptide Y and Pro-opiomelanocortin. Quantitative RT-PCR-based analyses of 8 known and 8 unknown transcripts during preparatory and post-spawning phase showed increased expression level of most of the transcripts during preparatory phase (except Neuropeptide Y) in comparison to post-spawning phase, indicating possible roles in initiation of gonad maturation. Expression of unknown transcripts was also found in prolific breeder common carp and tilapia, but levels of expression were much higher in seasonal breeder rohu.
    The 3631 unique sequences contained 236 (6.49%) putative microsatellites with the AG (28.16%) repeat as the most frequent motif. Twenty loci showed polymorphism in 36 unrelated individuals, with the number of alleles ranging from 2 to 7 per locus. The observed heterozygosity ranged from 0.096 to 0.774 whereas the expected heterozygosity ranged from 0.109 to 0.801. Identification of 182 important reproduction-related genes and expression pattern of 16 transcripts in preparatory and post-spawning phase along with 20 polymorphic EST-SSRs should be highly useful for the future reproductive molecular studies and selection program in Labeo rohita. Copyright © 2013 Elsevier B.V. All rights reserved.
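
    For readers unfamiliar with the observed and expected heterozygosity values reported for these EST-SSR loci, the standard definitions can be sketched as follows. The abstract does not say which estimator or software the authors used, so this is an assumed, simplest-case version: observed heterozygosity is the fraction of heterozygous individuals, and expected heterozygosity is 1 minus the sum of squared allele frequencies, with no small-sample correction. The genotype data below are invented for illustration.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes is a list of (allele_a, allele_b) pairs, one per individual
    (an assumed input format). Expected heterozygosity uses the plain
    1 - sum(p_i ** 2) form, without a small-sample correction.
    """
    n = len(genotypes)
    observed = sum(1 for a, b in genotypes if a != b) / n

    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return observed, expected


# Hypothetical genotypes at one SSR locus, alleles named by fragment size.
locus = [(180, 180), (180, 184), (184, 184), (180, 184)]
ho, he = heterozygosity(locus)
print(round(ho, 3), round(he, 3))
```

    An observed value well below the expected one at a locus can point to null alleles or population substructure, which is one reason both figures are usually reported side by side.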

  14. Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells.

    PubMed

    Beltman, Joost B; Urbanus, Jos; Velds, Arno; van Rooij, Nienke; Rohr, Jan C; Naik, Shalin H; Schumacher, Ton N

    2016-04-02

    Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can be used both to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and rare sequences are biologically relevant, the relatively high error rate of NGS techniques complicates data analysis, as it is difficult to distinguish rare true sequences from spurious sequences that are generated by PCR or sequencing errors. This issue, for instance, applies to cellular barcoding strategies that aim to follow the amount and type of offspring of single cells, by supplying these with unique heritable DNA tags. Here, we use genetic barcoding data from the Illumina HiSeq platform to show that straightforward read threshold-based filtering of data is typically insufficient to filter out spurious barcodes. Importantly, we demonstrate that specific sequencing errors occur at an approximately constant rate across different samples that are sequenced in parallel. We exploit this observation by developing a novel approach to filter out spurious sequences. Application of our new method demonstrates its value in the identification of true sequences amongst spurious sequences in biological data sets.

  15. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

    PubMed Central

    Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

    2003-01-01

    To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
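
The percentages in this abstract can be checked from its own counts. The short sketch below recomputes the redundancy and the full-length fraction. The variable names are illustrative, and the redundancy is taken here as the share of assembled sequences removed by reassembly, which is an assumption about how the authors defined it.

```python
# Counts quoted in the abstract.
assembled = 43_141      # putative transcripts after the first assembly
unique_genes = 33_620   # estimated unique genes after reassembly
full_length = 14_409    # assembled sequences with a full-length clone

# Share of assembled sequences that collapsed on reassembly (assumed definition).
redundancy_pct = 100 * (assembled - unique_genes) / assembled
# Share of assembled sequences containing at least one full-length insert.
full_length_pct = 100 * full_length / assembled

print(f"redundancy  = {redundancy_pct:.1f}%")
print(f"full-length = {full_length_pct:.1f}%")
```

Both values agree with the rounded figures in the abstract (22% and 33%).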

  16. Flavivirus and Filovirus EvoPrinters: New alignment tools for the comparative analysis of viral evolution.

    PubMed

    Brody, Thomas; Yavatkar, Amarendra S; Park, Dong Sun; Kuzin, Alexander; Ross, Jermaine; Odenwald, Ward F

    2017-06-01

    Flavivirus and Filovirus infections are serious epidemic threats to human populations. Multi-genome comparative analysis of these evolving pathogens affords a view of their essential, conserved sequence elements as well as progressive evolutionary changes. While phylogenetic analysis has yielded important insights, the growing number of available genomic sequences makes comparisons between hundreds of viral strains challenging. We report here a new approach for the comparative analysis of these hemorrhagic fever viruses that can superimpose an unlimited number of one-on-one alignments to identify important features within genomes of interest. We have adapted EvoPrinter alignment algorithms for the rapid comparative analysis of Flavivirus or Filovirus sequences including Zika and Ebola strains. The user can input a full genome or partial viral sequence and then view either individual comparisons or generate color-coded readouts that superimpose hundreds of one-on-one alignments to identify unique or shared identity SNPs that reveal ancestral relationships between strains. The user can also opt to select a database genome in order to access a library of pre-aligned genomes of either 1,094 Flaviviruses or 460 Filoviruses for rapid comparative analysis with all database entries or a select subset. 
Using EvoPrinter search and alignment programs, we show the following: 1) superimposing alignment data from many related strains identifies lineage identity SNPs, which enable the assessment of sublineage complexity within viral outbreaks; 2) whole-genome SNP profile screens uncover novel Dengue2 and Zika recombinant strains and their parental lineages; 3) differential SNP profiling identifies host cell A-to-I hyper-editing within Ebola and Marburg viruses, and 4) hundreds of superimposed one-on-one Ebola genome alignments highlight ultra-conserved regulatory sequences, invariant amino acid codons and evolutionarily variable protein-encoding domains within a single genome. EvoPrinter allows for the assessment of lineage complexity within Flavivirus or Filovirus outbreaks, identification of recombinant strains, highlights sequences that have undergone host cell A-to-I editing, and identifies unique input and database SNPs within highly conserved sequences. EvoPrinter's ability to superimpose alignment data from hundreds of strains onto a single genome has allowed us to identify unique Zika virus sublineages that are currently spreading in South, Central and North America, the Caribbean, and in China. This new set of integrated alignment programs should serve as a useful addition to existing tools for the comparative analysis of these viruses.

  17. Structural Studies of Geosmin Synthase, a Bifunctional Sesquiterpene Synthase with Alpha-Alpha Domain Architecture that Catalyzes a Unique Cyclization-Fragmentation Reaction Sequence

    PubMed Central

    Harris, Golda G.; Lombardi, Patrick M.; Pemberton, Travis A.; Matsui, Tsutomu; Weiss, Thomas M.; Cole, Kathryn E.; Köksal, Mustafa; Murphy, Frank V.; Vedula, L. Sangeetha; Chou, Wayne K.W.; Cane, David E.; Christianson, David W.

    2015-01-01

    Geosmin synthase from Streptomyces coelicolor (ScGS) catalyzes an unusual, metal-dependent terpenoid cyclization and fragmentation reaction sequence. Two distinct active sites are required for catalysis: the N-terminal domain catalyzes the ionization and cyclization of farnesyl diphosphate to form germacradienol and inorganic pyrophosphate (PPi), and the C-terminal domain catalyzes the protonation, cyclization, and fragmentation of germacradienol to form geosmin and acetone through a retro-Prins reaction. A unique αα domain architecture is predicted for ScGS based on amino acid sequence: each domain contains the metal-binding motifs typical of a class I terpenoid cyclase, and each domain requires Mg2+ for catalysis. Here, we report the X-ray crystal structure of the unliganded N-terminal domain of ScGS and the structure of its complex with 3 Mg2+ ions and alendronate. These structures highlight conformational changes required for active site closure and catalysis. Although neither full-length ScGS nor constructs of the C-terminal domain could be crystallized, homology models of the C-terminal domain were constructed based on ~36% sequence identity with the N-terminal domain. Small-angle X-ray scattering experiments yield low resolution molecular envelopes into which the N-terminal domain crystal structure and the C-terminal domain homology model were fit, suggesting possible αα domain architectures as frameworks for bifunctional catalysis. PMID:26598179

  18. The mitochondrial genome sequence of Enterobius vermicularis (Nematoda: Oxyurida)--an idiosyncratic gene order and phylogenetic information for chromadorean nematodes.

    PubMed

    Kang, Seokha; Sultana, Tahera; Eom, Keeseon S; Park, Yung Chul; Soonthornpong, Nathan; Nadler, Steven A; Park, Joong-Ki

    2009-01-15

    The complete mitochondrial genome sequence was determined for the human pinworm Enterobius vermicularis (Oxyurida: Nematoda) and used to infer its phylogenetic relationship to other major groups of chromadorean nematodes. The E. vermicularis genome is a 14,010-bp circular DNA molecule that encodes 36 genes (12 proteins, 22 tRNAs, and 2 rRNAs). This mtDNA genome lacks atp8, as reported for almost all other nematode species investigated. Phylogenetic analyses (maximum parsimony, maximum likelihood, neighbor joining, and Bayesian inference) of nucleotide sequences for the 12 protein-coding genes of 25 nematode species placed E. vermicularis, a representative of the order Oxyurida, as sister to the main Ascaridida+Rhabditida group. Tree topology comparisons using statistical tests rejected an alternative hypothesis favoring a closer relationship among Ascaridida, Spirurida, and Oxyurida, which has been supported by most studies based on nuclear ribosomal DNA sequences. Unlike the relatively conserved gene arrangement found for most chromadorean taxa, the E. vermicularis mtDNA gene order is unique, sharing no similarity with that of any other nematode species reported to date. This lack of gene order similarity may represent idiosyncratic gene rearrangements unique to this specific lineage of the oxyurids. To more fully understand the extent of gene rearrangement and its evolutionary significance within the nematode phylogenetic framework, additional mitochondrial genomes representing a greater evolutionary diversity of species must be characterized.

  19. Comparative Analysis of the Shared Sex-Determination Region (SDR) among Salmonid Fishes.

    PubMed

    Faber-Hammond, Joshua J; Phillips, Ruth B; Brown, Kim H

    2015-06-25

    Salmonids present an excellent model for studying evolution of young sex-chromosomes. Within the genus, Oncorhynchus, at least six independent sex-chromosome pairs have evolved, many unique to individual species. This variation results from the movement of the sex-determining gene, sdY, throughout the salmonid genome. While sdY is known to define sexual differentiation in salmonids, the mechanism of its movement throughout the genome has remained elusive due to high frequencies of repetitive elements, rDNA sequences, and transposons surrounding the sex-determining regions (SDR). Despite these difficulties, bacterial artificial chromosome (BAC) library clones from both rainbow trout and Atlantic salmon containing the sdY region have been reported. Here, we report the sequences for these BACs as well as the extended sequence for the known SDR in Chinook gained through genome walking methods. Comparative analysis allowed us to study the overlapping SDRs from three unique salmonid Y chromosomes to define the specific content, size, and variation present between the species. We found approximately 4.1 kb of orthologous sequence common to all three species, which contains the genetic content necessary for masculinization. The regions contain transposable elements that may be responsible for the translocations of the SDR throughout salmonid genomes and we examine potential mechanistic roles of each one. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  20. Japanese Wolves are Genetically Divided into Two Groups Based on an 8-Nucleotide Insertion/Deletion within the mtDNA Control Region.

    PubMed

    Ishiguro, Naotaka; Inoshima, Yasuo; Yanai, Tokuma; Sasaki, Motoki; Matsui, Akira; Kikuchi, Hiroki; Maruyama, Masashi; Hongo, Hitomi; Vostretsov, Yuri E; Gasilin, Viatcheslav; Kosintsev, Pavel A; Quanjia, Chen; Chunxue, Wang

    2016-02-01

    The mitochondrial DNA (mtDNA) control region (198- to 598-bp) of four ancient Canis specimens (two Canis mandibles, a cranium, and a first phalanx) was examined, and each specimen was genetically identified as Japanese wolf. Two unique nucleotide substitutions, the 78-C insertion and the 482-G deletion, both of which are specific for Japanese wolf, were observed in each sample. Based on the mtDNA sequences analyzed, these four specimens and 10 additional Japanese wolf samples could be classified into two groups, Group A (10 samples) and Group B (4 samples), which contain or lack an 8-bp insertion/deletion (indel), respectively. Interestingly, three dogs (Akita-b, Kishu 25, and S-husky 102) that each contained Japanese wolf-specific features were also classified into Group A or B based on the 8-bp indel. To determine the origin or ancestor of the Japanese wolf, mtDNA control regions of ancient continental Canis specimens were examined; 84 specimens were from Russia, and 29 were from China. However, none of these 113 specimens contained Japanese wolf-specific sequences. Moreover, none of 426 Japanese modern hunting dogs examined contained these Japanese wolf-specific mtDNA sequences. The mtDNA control region sequences of Groups A and B appeared to be unique to grey wolf and dog populations.

  1. Molecular characterization of subgenotype A1 (subgroup Aa) of hepatitis B virus.

    PubMed

    Kramvis, Anna; Kew, Michael C

    2007-07-01

    Subgenotypes of hepatitis B virus (HBV) were first recognized after a unique segment of genotype A was identified when sequencing the preS2/S region of southern African HBV isolates. Originally named subgroup A', subsequently called subgroup Aa (for Africa) or subgenotype A1, this subgenotype is found in South Africa, Malawi, Uganda, Tanzania, Somalia, Yemen, India, Nepal, the Philippines and Brazil. The relatively higher mean nucleotide divergence of subgenotype A1 suggests that it has been endemic and has a long evolutionary history in the populations where it prevails. Distinctive sequence characteristics could account for the high hepatitis B e-antigen (HBeAg) negativity and low HBV DNA levels in carriers of this subgenotype. Substitutions or mutations can reduce HBeAg expression at three levels: (i) 1762T1764A at the transcriptional level; (ii) substitutions at nt 1809-1812 at the translational level; and (iii) 1862T at the post-translational level. Co-existence of 1762T1764A and nt 1809-1812 mutations reduces HBeAg expression in an additive manner. In addition, subgenotype A1 has unique sequence alterations in the transcriptional regulatory elements and the polymerase coding region. The distinct sequence characteristics of subgenotype A1 may contribute to the 4.5-fold increased risk of hepatocellular carcinoma in HBV carriers infected with genotype A, which is entirely attributable to subgenotype A1.

  2. De Novo Transcriptome of the Hemimetabolous German Cockroach (Blattella germanica)

    PubMed Central

    Zhou, Xiaojie; Qian, Kun; Tong, Ying; Zhu, Junwei Jerry; Qiu, Xinghui; Zeng, Xiaopeng

    2014-01-01

    Background: The German cockroach, Blattella germanica, is an important insect pest that transmits various pathogens mechanically and causes severe allergic diseases. This insect has long served as a model system for studies of insect biology, physiology and ecology. However, the lack of genome or transcriptome information heavily hinders further understanding of the German cockroach at a molecular level and on a genome-wide scale. To explore the transcriptome and identify unique sequences of interest, we subjected the B. germanica transcriptome to massively parallel pyrosequencing and generated the first reference transcriptome for B. germanica. Methodology/Principal Findings: A total of 1,365,609 raw reads with an average length of 529 bp were generated via pyrosequencing the mixed cDNA library from different life stages of German cockroach including maturing oothecae, nymphs, adult females and males. The raw reads were de novo assembled to 48,800 contigs and 3,961 singletons with high-quality unique sequences. These sequences were annotated and classified functionally in terms of BLAST, GO and KEGG, and the genes putatively coding detoxification enzyme systems, insecticide targets, key components in systematic RNA interference, immunity and chemoreception pathways were identified. A total of 3,601 SSR (Simple Sequence Repeat) loci were also predicted. Conclusions/Significance: The whole transcriptome pyrosequencing data from this study provide a usable genetic resource for future identification of potential functional genes involved in various biological processes. PMID:25265537

  3. Is MMTV associated with human breast cancer? Maybe, but probably not.

    PubMed

    Perzova, Raisa; Abbott, Lynn; Benz, Patricia; Landas, Steve; Khan, Seema; Glaser, Jordan; Cunningham, Coleen K; Poiesz, Bernard

    2017-10-13

    Conflicting results regarding the association of MMTV with human breast cancer have been reported. Published sequence data have indicated unique MMTV strains in some human samples. However, concerns regarding contamination as a cause of false positive results have persisted. We performed PCR assays for MMTV on human breast cancer cell lines and fresh frozen and formalin fixed normal and malignant human breast epithelial samples. Assays were also performed on peripheral blood mononuclear cells from volunteer blood donors and subjects at risk for human retroviral infections. In addition, assays were performed on DNA samples from wild and laboratory mice. MMTV positive samples from both humans and mice were sequenced and phylogenetically compared. Using PCR under rigorous conditions to prevent and detect "carryover" contamination, we did detect MMTV DNA in human samples, including breast cancer. However, the results were not consistent and seemed to be an artifact. Further experiments indicated that the probable source of false positives was murine DNA, containing endogenous MMTV, present in our building. However, comparison of published and, herein, newly described MMTV sequences with published data indicates that there are some unique human MMTV sequences in the literature. While we could not confirm the true presence of MMTV in our human breast cancer subjects, the data indicate that further, perhaps more traditional, retroviral studies are warranted to ascertain whether MMTV might rarely be the cause of human breast cancer.

  4. DNA Sequence Analysis of Sry Alleles (Subgenus Mus) Implicates Misregulation as the Cause of C57bl/6j-Y(pos) Sex Reversal and Defines the Sry Functional Unit

    PubMed Central

    Albrecht, K. H.; Eicher, E. M.

    1997-01-01

    The Sry (sex determining region, Y chromosome) open reading frame from mice representing four species of the genus Mus was sequenced in an effort to understand the conditional dysfunction of some M. domesticus Sry alleles when present on the C57BL/6J inbred strain genetic background and to delimit the functionally important protein regions. Twenty-two Sry alleles were sequenced, most from wild-derived Y chromosomes, including 11 M. domesticus alleles, seven M. musculus alleles and two alleles each from the related species M. spicilegus and M. spretus. We found that the HMG domain (high mobility group DNA binding domain) and the unique regions are well conserved, while the glutamine repeat cluster (GRC) region is quite variable. No correlation was found between the predicted protein isoforms and the ability of a Sry allele to allow differentiation of ovarian tissue when on the C57BL/6J genetic background, strongly suggesting that the cause of this sex reversal is not the Sry protein itself, but rather the regulation of SRY expression. Furthermore, our interspecies sequence analysis provides compelling evidence that the M. musculus and M. domesticus SRY functional domain is contained in the first 143 amino acids, which includes the HMG domain and adjacent unique region (UR-2). PMID:9383069

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sattison, M.B.; Schroeder, J.A.; Russell, K.D.

    The Idaho National Engineering Laboratory (INEL) over the past year has created 75 plant-specific Accident Sequence Precursor (ASP) models using the SAPHIRE suite of PRA codes. Along with the new models, the INEL has also developed a new module for SAPHIRE which is tailored specifically to the unique needs of ASP evaluations. These models and software will be the next generation of risk tools for the evaluation of accident precursors by both NRR and AEOD. This paper presents an overview of the models and software. Key characteristics include: (1) classification of the plant models according to plant response with a unique set of event trees for each plant class, (2) plant-specific fault trees using supercomponents, (3) generation and retention of all system and sequence cutsets, (4) full flexibility in modifying logic, regenerating cutsets, and requantifying results, and (5) user interface for streamlined evaluation of ASP events.

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sattison, M.B.; Schroeder, J.A.; Russell, K.D.

    The Idaho National Engineering Laboratory (INEL) over the past year has created 75 plant-specific Accident Sequence Precursor (ASP) models using the SAPHIRE suite of PRA codes. Along with the new models, the INEL has also developed a new module for SAPHIRE which is tailored specifically to the unique needs of conditional core damage probability (CCDP) evaluations. These models and software will be the next generation of risk tools for the evaluation of accident precursors by both NRR and AEOD. This paper presents an overview of the models and software. Key characteristics include: (1) classification of the plant models according to plant response with a unique set of event trees for each plant class, (2) plant-specific fault trees using supercomponents, (3) generation and retention of all system and sequence cutsets, (4) full flexibility in modifying logic, regenerating cutsets, and requantifying results, and (5) user interface for streamlined evaluation of ASP events.

  7. Measles Outbreak with Unique Virus Genotyping, Ontario, Canada, 2015.

    PubMed

    Thomas, Shari; Hiebert, Joanne; Gubbay, Jonathan B; Gournis, Effie; Sharron, Jennifer; Severini, Alberto; Jiaravuthisan, Manisa; Shane, Amanda; Jaeger, Valerie; Crowcroft, Natasha S; Fediurek, Jill; Sander, Beate; Mazzulli, Tony; Schulz, Helene; Deeks, Shelley L

    2017-07-01

    The province of Ontario continues to experience measles virus transmissions despite the elimination of measles in Canada. We describe an unusual outbreak of measles in Ontario, Canada, in early 2015 that involved cases with a unique strain of virus and no known association among primary case-patients. A total of 18 cases of measles were reported from 4 public health units during the outbreak period (January 25-March 23, 2015); none of these cases occurred in persons who had recently traveled. Despite enhancements to case-patient interview methods and epidemiologic analyses, a source patient was not identified. However, the molecular epidemiologic analysis, which included extended sequencing, strongly suggested that all cases derived from a single importation of measles virus genotype D4. The use of timely genotype sequencing, rigorous epidemiologic investigation, and a better understanding of the gaps in surveillance are needed to maintain Ontario's measles elimination status.

  8. Reading the Second Code: Mapping Epigenomes to Understand Plant Growth, Development, and Adaptation to the Environment

    PubMed Central

    2012-01-01

    We have entered a new era in agricultural and biomedical science made possible by remarkable advances in DNA sequencing technologies. The complete sequence of an individual’s set of chromosomes (collectively, its genome) provides a primary genetic code for what makes that individual unique, just as the contents of every personal computer reflect the unique attributes of its owner. But a second code, composed of “epigenetic” layers of information, affects the accessibility of the stored information and the execution of specific tasks. Nature’s second code is enigmatic and must be deciphered if we are to fully understand and optimize the genetic potential of crop plants. The goal of the Epigenomics of Plants International Consortium is to crack this second code, and ultimately master its control, to help catalyze a new green revolution. PMID:22751210

  9. The genome of Eucalyptus grandis.

    PubMed

    Myburg, Alexander A; Grattapaglia, Dario; Tuskan, Gerald A; Hellsten, Uffe; Hayes, Richard D; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R K; Hussey, Steven G; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B; Togawa, Roberto C; Pappas, Marilia R; Faria, Danielle A; Sansaloni, Carolina P; Petroli, Cesar D; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A; Bornberg-Bauer, Erich; Kersting, Anna R; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E; Liston, Aaron; Spatafora, Joseph W; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C; Steane, Dorothy A; Vaillancourt, René E; Potts, Brad M; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J; Strauss, Steven H; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S; Schmutz, Jeremy

    2014-06-19

    Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  10. The scorpion toxin Bot IX is a potent member of the α-like family and has a unique N-terminal sequence extension.

    PubMed

    Martin-Eauclaire, Marie-France; Salvatierra, Juan; Bosmans, Frank; Bougis, Pierre E

    2016-09-01

    We report the detailed chemical, immunological and pharmacological characterization of the α-toxin Bot IX from the Moroccan scorpion Buthus occitanus tunetanus venom. Bot IX, which consists of 70 amino acids, is a highly atypical toxin. It carries a unique N-terminal sequence extension and is highly lethal in mice. Voltage clamp recordings on oocytes expressing rat Nav1.2 or insect BgNav1 reveal that, similar to other α-like toxins, Bot IX inhibits fast inactivation of both variants. Moreover, Bot IX belongs to the same structural/immunological group as the α-like toxin Bot I. Remarkably, radioiodinated Bot IX competes efficiently with the classical α-toxin AaH II from Androctonus australis, and displays one of the highest affinities for Nav channels. © 2016 Federation of European Biochemical Societies.

  11. In vivo generation of DNA sequence diversity for cellular barcoding

    PubMed Central

    Peikon, Ian D.; Gizatullina, Diana I.; Zador, Anthony M.

    2014-01-01

    Heterogeneity is a ubiquitous feature of biological systems. A complete understanding of such systems requires a method for uniquely identifying and tracking individual components and their interactions with each other. We have developed a novel method of uniquely tagging individual cells in vivo with a genetic ‘barcode’ that can be recovered by DNA sequencing. Our method is a two-component system comprised of a genetic barcode cassette whose fragments are shuffled by Rci, a site-specific DNA invertase. The system is highly scalable, with the potential to generate theoretical diversities in the billions. We demonstrate the feasibility of this technique in Escherichia coli. Currently, this method could be employed to track the dynamics of populations of microbes through various bottlenecks. Advances of this method should prove useful in tracking interactions of cells within a network, and/or heterogeneity within complex biological samples. PMID:25013177
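
As a rough, hedged illustration of why a shuffled cassette scales so quickly, the sketch below counts an upper bound on distinct barcodes under the assumption that a cassette of n fragments can reach every ordering and every orientation of each fragment (n! × 2^n states). The abstract does not give this formula, and real recombination constraints may make the reachable set smaller. The function name is hypothetical.

```python
from math import factorial

def max_cassette_diversity(n_fragments):
    """Upper bound on barcodes: n! fragment orders times 2**n orientations.

    Assumes every permutation and every inversion state is reachable,
    which is an idealization rather than a claim from the source.
    """
    return factorial(n_fragments) * 2 ** n_fragments

for n in (4, 8, 10):
    print(n, max_cassette_diversity(n))
```

Under this idealization, ten fragments already exceed three billion combinations, consistent with the "billions" scale mentioned in the abstract.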

  12. Human embryonic stem cell phosphoproteome revealed by electron transfer dissociation tandem mass spectrometry.

    PubMed

    Swaney, Danielle L; Wenger, Craig D; Thomson, James A; Coon, Joshua J

    2009-01-27

    Protein phosphorylation is central to the understanding of cellular signaling, and cellular signaling is suggested to play a major role in the regulation of human embryonic stem (ES) cell pluripotency. Here, we describe the use of conventional tandem mass spectrometry-based sequencing technology--collision-activated dissociation (CAD)--and the more recently developed method electron transfer dissociation (ETD) to characterize the human ES cell phosphoproteome. In total, these experiments resulted in the identification of 11,995 unique phosphopeptides, corresponding to 10,844 nonredundant phosphorylation sites, at a 1% false discovery rate (FDR). Among these phosphorylation sites are 5 localized to 2 pluripotency critical transcription factors--OCT4 and SOX2. From these experiments, we conclude that ETD identifies a larger number of unique phosphopeptides than CAD (8,087 to 3,868), more frequently localizes the phosphorylation site to a specific residue (49.8% compared with 29.6%), and sequences whole classes of phosphopeptides previously unobserved.

  13. Functional Proteomics to Identify Moderators of CD8+ T Cell Function in Melanoma

    DTIC Science & Technology

    2015-05-01

    identified 17 phage that selectively bind TIL rather than effector cells. However, none of these phage influenced CD8+ TIL expansion or function in vitro...Using a novel NextGeneration sequencing approach, we have further defined another 1,000,000 phage that selectively bind TIL, of which 100,000 are unique...Using the original approach outlined in the application, we identified a total of 17 unique phage that selectively bind CD8+ TIL but not effector or

  14. Characterization of a species-specific repetitive DNA from a highly endangered wild animal, Rhinoceros unicornis, and assessment of genetic polymorphism by microsatellite associated sequence amplification (MASA).

    PubMed

    Ali, S; Azfer, M A; Bashamboo, A; Mathur, P K; Malik, P K; Mathur, V B; Raha, A K; Ansari, S

    1999-03-04

    We have cloned and sequenced a 906bp EcoRI repeat DNA fraction from Rhinoceros unicornis genome. The contig pSS(R)2 is AT rich with 340 A (37.53%), 187 C (20.64%), 173 G (19.09%) and 206 T (22.74%). The sequence contains MALT box, NF-E1, Poly-A signal, lariat consensus sequences, TATA box, translational initiation sequences and several stop codons. Translation of the contig showed seven different types of protein motifs, among which, EGF-like domain cysteine pattern signatures and Bowman-Birk serine protease inhibitor family signatures were prominent. The presence of eukaryotic transcriptional elements, protein signatures and analysis of subset sequences in the 5' region from 1 to 165nt indicating coding potential (test code value=0.97) suggest possible regulatory and/or functional role(s) of these sequences in the rhino genome. Translation of the complementary strand from 906 to 706nt and 190 to 2nt showed proteins of more than 7kDa rich in non-polar residues. This suggests that pSS(R)2 is either a part of, or adjacent to, a functional gene. The contig contains mostly non-consecutive simple repeat units from 2 to 17nt with varying frequencies, of which four base motifs were found to be predominant. Zoo-blot hybridization revealed that pSS(R)2 sequences are unique to R. unicornis genome because they do not cross-hybridize, even with the genomic DNA of South African black rhino Diceros bicornis. Southern blot analysis of R. unicornis genomic DNA with pSS(R)2 and other synthetic oligo probes revealed a high level of genetic homogeneity, which was also substantiated by microsatellite associated sequence amplification (MASA). Owing to its uniqueness, the pSS(R)2 probe has a potential application in the area of conservation biology for unequivocal identification of horn or other body tissues of R. unicornis. The evolutionary aspect of this repeat fraction in the context of comparative genome analysis is discussed.
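
The base composition reported above can be verified directly from the four nucleotide counts. The sketch below recomputes each percentage and the combined A+T content. Only the counts come from the abstract; the variable names are illustrative.

```python
# Nucleotide counts for the 906-bp contig, as quoted in the abstract.
counts = {"A": 340, "C": 187, "G": 173, "T": 206}

length = sum(counts.values())
# Percentage of each base, rounded to two decimals as in the abstract.
percent = {base: round(100 * n / length, 2) for base, n in counts.items()}
# Combined A+T content, which supports the "AT rich" description.
at_content = round(100 * (counts["A"] + counts["T"]) / length, 2)

print(length, percent, at_content)
```

The counts sum to 906 and reproduce the published percentages, giving an A+T content of about 60%.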

  15. Characterization of minimal sequences associated with self-similar interval exchange maps

    NASA Astrophysics Data System (ADS)

    Cobo, Milton; Gutiérrez-Romo, Rodolfo; Maass, Alejandro

    2018-04-01

    The construction of affine interval exchange maps (IEMs) with wandering intervals that are semi-conjugate to a given self-similar IEM is strongly related to the existence of the so-called minimal sequences associated with local potentials, which are certain elements of the substitution subshift arising from the given IEM. In this article, under the condition called unique representation property, we characterize such minimal sequences for potentials coming from non-real eigenvalues of the substitution matrix. We also give conditions on the slopes of the affine extensions of a self-similar IEM that determine whether it exhibits a wandering interval or not.

  16. Draft Genome Sequence of Methanoculleus sediminis S3FaT, a Hydrogenotrophic Methanogen Isolated from a Submarine Mud Volcano in Taiwan.

    PubMed

    Chen, Sheng-Chung; Chen, Mei-Fei; Weng, Chieh-Yin; Lai, Mei-Chin; Wu, Sue-Yao

    2016-04-21

    Here, we announce the genome sequence of Methanoculleus sediminis S3Fa(T) (DSM 29354(T)), a strict anaerobic methanoarchaeon, which was isolated from sediments near the submarine mud volcano MV4 located offshore in southwestern Taiwan. The 2.49-Mb genome consists of 2,459 predicted genes, 3 rRNAs, 48 tRNAs, and 1 ncRNA. The sequence of this novel strain may provide more information for species delineation and the roles that this strain plays in the unique marine mud volcano habitat. Copyright © 2016 Chen et al.

  17. Potential Uses and Inherent Challenges of Using Genome-Scale Sequencing to Augment Current Newborn Screening.

    PubMed

    Berg, Jonathan S; Powell, Cynthia M

    2015-10-05

    Since newborn screening (NBS) began in the 1960s, technological advances have enabled its expansion to include an increasing number of disorders. Recent developments now make it possible to sequence an infant's genome relatively quickly and economically. Clinical application of whole-exome and whole-genome sequencing is expanding at a rapid pace but presents many challenges. Its utility in NBS has yet to be demonstrated and its application in the pediatric population requires examination, not only for potential clinical benefits, but also for the unique ethical challenges it presents. Copyright © 2015 Cold Spring Harbor Laboratory Press; all rights reserved.

  18. Identification of a novel astrovirus in domestic sheep in Hungary.

    PubMed

    Reuter, Gábor; Pankovics, Péter; Delwart, Eric; Boros, Ákos

    2012-02-01

    The family Astroviridae consists of two genera, Avastrovirus and Mamastrovirus, whose members are associated with gastroenteritis in avian and mammalian hosts, respectively. We serendipitously identified a novel ovine astrovirus in a fecal specimen from a domestic sheep (Ovis aries) in Hungary by viral metagenomic analysis. Sequencing of the fragment indicated that it was an ORF1b/ORF2/3'UTR sequence, and it has been submitted to the GenBank database as ovine astrovirus type 2 (OAstV-2/Hungary/2009) with accession number JN592482. The unique sequence characteristics and the phylogenetic position of OAstV-2 suggest that genetically divergent lineages of astroviruses exist in sheep.

  19. ACMES: fast multiple-genome searches for short repeat sequences with concurrent cross-species information retrieval

    PubMed Central

    Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter

    2004-01-01

    We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
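
    The abstract does not describe the hash function ACMES uses, so the following is only a generic sketch of hash-based repeat searching: every substring of length k is indexed in a Python dictionary (a hash table), and k-mers that occur more than once are reported. The function names and the toy sequences are assumptions made for illustration and are not taken from the ACMES server.

```python
from collections import defaultdict


def find_repeats(sequence, k, min_count=2):
    """Map each k-mer seen at least min_count times to its 0-based start positions."""
    positions = defaultdict(list)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


def shared_repeats(genomes, k):
    """Return the k-mers that are repeated within every sequence in `genomes`."""
    per_species = [set(find_repeats(seq, k)) for seq in genomes.values()]
    return set.intersection(*per_species) if per_species else set()


# Toy sequences standing in for two species (hypothetical data).
toy = {"species_a": "ACGTACGTTTACG", "species_b": "GGACGAACGT"}

repeats_a = find_repeats(toy["species_a"], 3)
common = shared_repeats(toy, 3)
print(repeats_a)  # {'ACG': [0, 4, 10], 'CGT': [1, 5], 'TAC': [3, 9]}
print(common)     # {'ACG'}
```

    Dictionary lookups take constant time on average, so a single pass over each sequence is enough to index all k-mers of one length; a production tool covering repeat lengths from 3 to 10 000 bases would need a more memory-efficient index than this sketch.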

  20. Peanut gene expression profiling in developing seeds at different reproduction stages during Aspergillus parasiticus infection

    PubMed Central

    Guo, Baozhu; Chen, Xiaoping; Dang, Phat; Scully, Brian T; Liang, Xuanqiang; Holbrook, C Corley; Yu, Jiujiang; Culbreath, Albert K

    2008-01-01

    Background Peanut (Arachis hypogaea L.) is an important crop economically and nutritionally, and is one of the most susceptible host crops to colonization of Aspergillus parasiticus and subsequent aflatoxin contamination. Knowledge from molecular genetic studies could help to devise strategies in alleviating this problem; however, few peanut DNA sequences are available in the public database. In order to understand the molecular basis of host resistance to aflatoxin contamination, a large-scale project was conducted to generate expressed sequence tags (ESTs) from developing seeds to identify resistance-related genes involved in defense response against Aspergillus infection and subsequent aflatoxin contamination. Results We constructed six different cDNA libraries derived from developing peanut seeds at three reproduction stages (R5, R6 and R7) from two cultivated peanut genotypes, 'Tifrunner' (susceptible to Aspergillus infection with higher aflatoxin contamination and resistant to TSWV) and 'GT-C20' (resistant to Aspergillus with reduced aflatoxin contamination and susceptible to TSWV). The developing peanut seed tissues were challenged by A. parasiticus and drought stress in the field. A total of 24,192 randomly selected cDNA clones from six libraries were sequenced. After removing vector sequences and quality trimming, 21,777 high-quality EST sequences were generated. Sequence clustering and assembling resulted in 8,689 unique EST sequences with 1,741 tentative consensus EST sequences (TCs) and 6,948 singleton ESTs. Functional classification was performed according to MIPS functional catalogue criteria. The unique EST sequences were divided into twenty-two categories. A similarity search against the non-redundant protein database available from NCBI indicated that 84.78% of total ESTs showed significant similarity to known proteins, of which 165 genes had been previously reported in peanuts. There were differences in overall expression patterns in different libraries and genotypes. A number of sequences were expressed throughout all of the libraries, representing constitutively expressed sequences. In order to identify resistance-related genes with significantly differential expression, a statistical analysis to estimate the relative abundance (R) was used to compare the relative abundance of each gene transcript in each cDNA library. Thirty-six and forty-seven unique EST sequences with threshold of R > 4 from libraries of 'GT-C20' and 'Tifrunner', respectively, were selected for examination of temporal gene expression patterns according to EST frequencies. Nine and eight resistance-related genes with significant up-regulation were obtained in 'GT-C20' and 'Tifrunner' libraries, respectively. Among them, three genes were common in both genotypes. Furthermore, a comparison of our EST sequences with other plant sequences in the TIGR Gene Indices libraries showed that the percentage of peanut ESTs matched to Arabidopsis thaliana, maize (Zea mays), Medicago truncatula, rapeseed (Brassica napus), rice (Oryza sativa), soybean (Glycine max) and wheat (Triticum aestivum) ESTs ranged from 33.84% to 79.46% with the sequence identity ≥ 80%. These results revealed that peanut ESTs are more closely related to legume species than to cereal crops, and more homologous to dicot than to monocot plant species. Conclusion The developed ESTs can be used to discover novel sequences or genes, to identify resistance-related genes and to detect the differences among alleles or markers between these resistant and susceptible peanut genotypes. Additionally, this large collection of cultivated peanut EST sequences will make it possible to construct microarrays for gene expression studies and for further characterization of host resistance mechanisms. It will be a valuable genomic resource for the peanut community. The 21,777 ESTs have been deposited to the NCBI GenBank database with accession numbers ES702769 to ES724546. PMID:18248674

  1. Whole-Genome Sequencing for Detecting Antimicrobial Resistance in Nontyphoidal Salmonella

    PubMed Central

    Tyson, Gregory H.; Kabera, Claudine; Chen, Yuansha; Li, Cong; Folster, Jason P.; Ayers, Sherry L.; Lam, Claudia; Tate, Heather P.; Zhao, Shaohua

    2016-01-01

    Laboratory-based in vitro antimicrobial susceptibility testing is the foundation for guiding anti-infective therapy and monitoring antimicrobial resistance trends. We used whole-genome sequencing (WGS) technology to identify known antimicrobial resistance determinants among strains of nontyphoidal Salmonella and correlated these with susceptibility phenotypes to evaluate the utility of WGS for antimicrobial resistance surveillance. Six hundred forty Salmonella of 43 different serotypes were selected from among retail meat and human clinical isolates that were tested for susceptibility to 14 antimicrobials using broth microdilution. The MIC for each drug was used to categorize isolates as susceptible or resistant based on Clinical and Laboratory Standards Institute clinical breakpoints or National Antimicrobial Resistance Monitoring System (NARMS) consensus interpretive criteria. Each isolate was subjected to whole-genome shotgun sequencing, and resistance genes were identified from assembled sequences. A total of 65 unique resistance genes, plus mutations in two structural resistance loci, were identified. There were more unique resistance genes (n = 59) in the 104 human isolates than in the 536 retail meat isolates (n = 36). Overall, resistance genotypes and phenotypes correlated in 99.0% of cases. Correlations approached 100% for most classes of antibiotics but were lower for aminoglycosides and beta-lactams. We report the first finding of extended-spectrum β-lactamases (ESBLs) (blaCTX-M1 and blaSHV2a) in retail meat isolates of Salmonella in the United States. Whole-genome sequencing is an effective tool for predicting antibiotic resistance in nontyphoidal Salmonella, although the use of more appropriate surveillance breakpoints and increased knowledge of new resistance alleles will further improve correlations. PMID:27381390

  2. Bacterial discrimination by means of a universal array approach mediated by LDR (ligase detection reaction)

    PubMed Central

    Busti, Elena; Bordoni, Roberta; Castiglioni, Bianca; Monciardini, Paolo; Sosio, Margherita; Donadio, Stefano; Consolandi, Clarissa; Rossi Bernardi, Luigi; Battaglia, Cristina; De Bellis, Gianluca

    2002-01-01

    Background PCR amplification of bacterial 16S rRNA genes provides the most comprehensive and flexible means of sampling bacterial communities. Sequence analysis of these cloned fragments can provide a qualitative and quantitative insight of the microbial population under scrutiny although this approach is not suited to large-scale screenings. Other methods, such as denaturing gradient gel electrophoresis, heteroduplex or terminal restriction fragment analysis are rapid and therefore amenable to field-scale experiments. A very recent addition to these analytical tools is represented by microarray technology. Results Here we present our results using a Universal DNA Microarray approach as an analytical tool for bacterial discrimination. The proposed procedure is based on the properties of the DNA ligation reaction and requires the design of two probes specific for each target sequence. One oligo carries a fluorescent label and the other a unique sequence (cZipCode or complementary ZipCode) which identifies a ligation product. Ligated fragments, obtained in the presence of a proper template (a PCR amplified fragment of the 16S rRNA gene) contain either the fluorescent label or the unique sequence and therefore are addressed to the location on the microarray where the ZipCode sequence has been spotted. Such an array is therefore "Universal" being unrelated to a specific molecular analysis. Here we present the design of probes specific for some groups of bacteria and their application to bacterial diagnostics. Conclusions The combined use of selective probes, ligation reaction and the Universal Array approach yielded an analytical procedure with a good power of discrimination among bacteria. PMID:12243651

  3. The 193-base pair Gsg2 (haspin) promoter region regulates germ cell-specific expression bidirectionally and synchronously.

    PubMed

    Tokuhiro, Keizo; Miyagawa, Yasushi; Yamada, Shuichi; Hirose, Mika; Ohta, Hiroshi; Nishimune, Yoshitake; Tanaka, Hiromitsu

    2007-03-01

    Haspin is a unique protein kinase expressed predominantly in haploid male germ cells. The genomic structure of haspin (Gsg2) has revealed it to be intronless, and the entire transcription unit is in an intron of the integrin alphaE (Itgae) gene. Transcription occurs from a bidirectional promoter that also generates an alternatively spliced integrin alphaE-derived mRNA (Aed). In mice, the testis-specific alternative splicing of Aed is expressed bidirectionally downstream from the Gsg2 transcription initiation site, and a segment consisting of 26 bp transcribes both genomic DNA strands between Gsg2 and the Aed transcription initiation sites. To investigate the mechanisms for this unique gene regulation, we cloned and characterized the Gsg2 promoter region. The 193-bp genomic fragment from the 5' end of the Gsg2 and Aed genes, fused with EGFP and DsRed genes, drove the expression of both proteins in haploid germ cells of transgenic mice. This promoter element contained only a GC-rich sequence, and not the previously reported DNA sequences known to bind various transcription factors--with the exception of E2F1, TCFAP2A1 (AP2), and SP1. Here, we show that the 193-bp DNA sequence is sufficient for the specific, bidirectional, and synchronous expression in germ cells in the testis. We also demonstrate the existence of germ cell nuclear factors specifically bound to the promoter sequence. This activity may be regulated by binding to the promoter sequence with germ cell-specific nuclear complex(es) without regulation via DNA methylation.

  4. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential

    PubMed Central

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela AE; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential. PMID:24256717

  5. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential.

    PubMed

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela A E; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential.

  6. Prevalence of the F-type lectin domain.

    PubMed

    Bishnoi, Ritika; Khatri, Indu; Subramanian, Srikrishna; Ramya, T N C

    2015-08-01

    F-type lectins are fucolectins with characteristic fucose and calcium-binding sequence motifs and a unique lectin fold (the "F-type" fold). F-type lectins are phylogenetically widespread with selective distribution. Several eukaryotic F-type lectins have been biochemically and structurally characterized, and the F-type lectin domain (FLD) has also been studied in the bacterial proteins, Streptococcus mitis lectinolysin and Streptococcus pneumoniae SP2159. However, there is little knowledge about the extent of occurrence of FLDs and their domain organization, especially, in bacteria. We have now mined the extensive genomic sequence information available in the public databases with sensitive sequence search techniques in order to exhaustively survey prokaryotic and eukaryotic FLDs. We report 437 FLD sequence clusters (clustered at 80% sequence identity) from eukaryotic, eubacterial and viral proteins. Domain architectures are diverse but mostly conserved in closely related organisms, and domain organizations of bacterial FLD-containing proteins are very different from their eukaryotic counterparts, suggesting unique specialization of FLDs to suit different requirements. Several atypical phylogenetic associations hint at lateral transfer. Among eukaryotes, we observe an expansion of FLDs in terms of occurrence and domain organization diversity in the taxa Mollusca, Hemichordata and Branchiostomi, perhaps coinciding with greater emphasis on innate immune strategies in these organisms. The naturally occurring FLDs with diverse domain organizations that we have identified here will be useful for future studies aimed at creating designer molecular platforms for directing desired biological activities to fucosylated glycoconjugates in target niches. © The Author 2015. Published by Oxford University Press. All rights reserved.

  7. Sequence-Based Discovery Demonstrates That Fixed Light Chain Human Transgenic Rats Produce a Diverse Repertoire of Antigen-Specific Antibodies.

    PubMed

    Harris, Katherine E; Aldred, Shelley Force; Davison, Laura M; Ogana, Heather Anne N; Boudreau, Andrew; Brüggemann, Marianne; Osborn, Michael; Ma, Biao; Buelow, Benjamin; Clarke, Starlynn C; Dang, Kevin H; Iyer, Suhasini; Jorgensen, Brett; Pham, Duy T; Pratap, Payal P; Rangaswamy, Udaya S; Schellenberger, Ute; van Schooten, Wim C; Ugamraj, Harshad S; Vafa, Omid; Buelow, Roland; Trinklein, Nathan D

    2018-01-01

    We created a novel transgenic rat that expresses human antibodies comprising a diverse repertoire of heavy chains with a single common rearranged kappa light chain (IgKV3-15-JK1). This fixed light chain animal, called OmniFlic, presents a unique system for human therapeutic antibody discovery and a model to study heavy chain repertoire diversity in the context of a constant light chain. The purpose of this study was to analyze heavy chain variable gene usage, clonotype diversity, and to describe the sequence characteristics of antigen-specific monoclonal antibodies (mAbs) isolated from immunized OmniFlic animals. Using next-generation sequencing antibody repertoire analysis, we measured heavy chain variable gene usage and the diversity of clonotypes present in the lymph node germinal centers of 75 OmniFlic rats immunized with 9 different protein antigens. Furthermore, we expressed 2,560 unique heavy chain sequences sampled from a diverse set of clonotypes as fixed light chain antibody proteins and measured their binding to antigen by ELISA. Finally, we measured patterns and overall levels of somatic hypermutation in the full B-cell repertoire and in the 2,560 mAbs tested for binding. The results demonstrate that OmniFlic animals produce an abundance of antigen-specific antibodies with heavy chain clonotype diversity that is similar to what has been described with unrestricted light chain use in mammals. In addition, we show that sequence-based discovery is a highly effective and efficient way to identify a large number of diverse monoclonal antibodies to a protein target of interest.

  8. [Analysis of structural characteristics of alpha-tubulins in plants with enhanced cold tolerance].

    PubMed

    Nyporko, A Iu; Demchuk, O N; Blium, Ia B

    2003-01-01

    The uniqueness of the point substitutions in the sequences of two alpha-tubulin isotypes from the psychrophilic alga Chloromonas, which may determine the increased cold tolerance of this alga, was analyzed. Comparison of all known amino acid sequences of plant alpha-tubulins established that only the M268→V replacement is unique and may have a significant influence on the spatial structure of plant alpha-tubulins. Modeling of the molecular surfaces of alpha-tubulins from Chloromonas, Chlamydomonas reinhardtii and goose grass Eleusine indica showed that introducing the M268→V replacement into the goose grass tubulin sequence made the surface of this protein resemble that of native alpha-tubulin from Chloromonas. The replacement was shown to alter the local hydrophobic properties of the alpha-tubulin molecular surface in the interdimeric contact zone, which may play an important role in increasing the cold resistance of microtubules. The crucial role of the amino acid residue at position 268 in forming the interdimeric contact surface of the alpha-tubulin molecule was revealed. It is proposed that replacements at this position are important for plant tolerance to abiotic factors of different nature (cold, herbicides).

  9. The Complete Plastome Sequence of an Antarctic Bryophyte Sanionia uncinata (Hedw.) Loeske

    PubMed Central

    Park, Mira; Park, Hyun; Lee, Hyoungseok; Lee, Byeong-ha

    2018-01-01

    Organellar genomes of bryophytes are poorly represented with chloroplast genomes of only four mosses, four liverworts and two hornworts having been sequenced and annotated. Moreover, while Antarctic vegetation is dominated by the bryophytes, there are few reports on the plastid genomes for the Antarctic bryophytes. Sanionia uncinata (Hedw.) Loeske is one of the most dominant moss species in the maritime Antarctic. It has been researched as an important marker for ecological studies and as an extremophile plant for studies on stress tolerance. Here, we report the complete plastome sequence of S. uncinata, which can be exploited in comparative studies to identify the lineage-specific divergence across different species. The complete plastome of S. uncinata is 124,374 bp in length with a typical quadripartite structure of 114 unique genes including 82 unique protein-coding genes, 37 tRNA genes and four rRNA genes. However, two genes encoding the α subunit of RNA polymerase (rpoA) and encoding the cytochrome b6/f complex subunit VIII (petN) were absent. We could identify nuclear genes homologous to those genes, which suggests that rpoA and petN might have been relocated from the chloroplast genome to the nuclear genome. PMID:29494552

  10. Unique Structural Features and Sequence Motifs of Proline Utilization A (PutA)

    PubMed Central

    Singh, Ranjan K.; Tanner, John J.

    2013-01-01

    Proline utilization A proteins (PutAs) are bifunctional enzymes that catalyze the oxidation of proline to glutamate using spatially separated proline dehydrogenase and pyrroline-5-carboxylate dehydrogenase active sites. Here we use the crystal structure of the minimalist PutA from Bradyrhizobium japonicum (BjPutA) along with sequence analysis to identify unique structural features of PutAs. This analysis shows that PutAs have secondary structural elements and domains not found in the related monofunctional enzymes. Some of these extra features are predicted to be important for substrate channeling in BjPutA. Multiple sequence alignment analysis shows that some PutAs have a 17-residue conserved motif in the C-terminal 20–30 residues of the polypeptide chain. The BjPutA structure shows that this motif helps seal the internal substrate-channeling cavity from the bulk medium. Finally, it is shown that some PutAs have a 100–200 residue domain of unknown function in the C-terminus that is not found in minimalist PutAs. Remote homology detection suggests that this domain is homologous to the oligomerization beta-hairpin and Rossmann fold domain of BjPutA. PMID:22201760

  11. Stable isotope, site-specific mass tagging for protein identification

    DOEpatents

    Chen, Xian

    2006-10-24

    Proteolytic peptide mass mapping as measured by mass spectrometry provides an important method for the identification of proteins, which are usually identified by matching the measured and calculated m/z values of the proteolytic peptides. A unique identification is, however, heavily dependent upon the mass accuracy and sequence coverage of the fragment ions generated by peptide ionization. The present invention describes a method for increasing the specificity, accuracy and efficiency of the assignments of particular proteolytic peptides and consequent protein identification, by the incorporation of selected amino acid residue(s) enriched with stable isotope(s) into the protein sequence without the need for ultrahigh instrumental accuracy. Selected amino acid(s) are labeled with ¹³C/¹⁵N/²H and incorporated into proteins in a sequence-specific manner during cell culturing. Each of these labeled amino acids carries a defined mass change encoded in its monoisotopic distribution pattern. Through their characteristic patterns, the peptides with mass tag(s) can then be readily distinguished from other peptides in mass spectra. The present method of identifying unique proteins can also be extended to protein complexes and will significantly increase data search specificity, efficiency and accuracy for protein identifications.

  12. Genome Wide Search for Biomarkers to Diagnose Yersinia Infections.

    PubMed

    Kalia, Vipin Chandra; Kumar, Prasun

    2015-12-01

    Bacterial identification on the basis of the highly conserved 16S rRNA (rrs) gene is limited by its presence in multiple copies and a very high level of similarity among them. Other genes with unique characteristics are therefore needed as biomarkers. Fifty-one sequenced genomes belonging to 10 different Yersinia species were searched for genes common to all the genomes. Out of 304 common genes, 34 genes of sizes varying from 0.11 to 4.42 kb were selected and subjected to in silico digestion with 10 different restriction endonucleases (REs; 4-6 base cutters). Yersinia species have 6-7 copies of rrs per genome, which are difficult to distinguish by multiple sequence alignments or by their RE digestion patterns. However, certain unique combinations of other common gene sequences (carB, fadJ, gluM, gltX, ileS, malE, nusA, ribD, and rlmL) and their RE digestion patterns can be used as markers for identifying 21 strains belonging to 10 Yersinia species: Y. aldovae, Y. enterocolitica, Y. frederiksenii, Y. intermedia, Y. kristensenii, Y. pestis, Y. pseudotuberculosis, Y. rohdei, Y. ruckeri, and Y. similis. This approach can be applied for rapid diagnostic applications.
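
    The idea of telling gene sequences apart by their in silico restriction digestion patterns can be sketched as follows. This is an illustrative example only: the abstract does not name the ten enzymes used, so EcoRI (a 6-base cutter, G^AATTC) and HaeIII (a 4-base cutter, GG^CC) are assumed here, and the two short sequences are invented toy data rather than Yersinia genes.

```python
import re

# Recognition site and cut offset (bases from the start of the site, top strand).
# EcoRI and HaeIII are assumed for illustration; the study's enzymes are not named.
ENZYMES = {"EcoRI": ("GAATTC", 1), "HaeIII": ("GGCC", 2)}


def digest(sequence, enzyme):
    """Return fragment lengths from a complete digest of a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # A lookahead pattern also finds overlapping occurrences of the site.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def digestion_pattern(sequence):
    """Fragment lengths per enzyme, largest first, as on a gel."""
    return {name: sorted(digest(sequence, name), reverse=True) for name in ENZYMES}


# Two toy sequences differing by one base inside the EcoRI site (hypothetical data).
gene_a = "AAGAATTCTTGGCCAAAA"
gene_b = "AAGAGTTCTTGGCCAAAA"

print(digestion_pattern(gene_a))  # {'EcoRI': [15, 3], 'HaeIII': [12, 6]}
print(digestion_pattern(gene_b))  # {'EcoRI': [18], 'HaeIII': [12, 6]}
```

    Here the HaeIII pattern is identical for both sequences while the EcoRI pattern differs, which illustrates why a combination of genes and enzymes may be needed to obtain a distinguishing profile. Both recognition sites are palindromic, so scanning one strand is sufficient.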

  13. Molecular epidemiology of pathogenic Leptospira spp. in the straw-colored fruit bat (Eidolon helvum) migrating to Zambia from the Democratic Republic of Congo.

    PubMed

    Ogawa, Hirohito; Koizumi, Nobuo; Ohnuma, Aiko; Mutemwa, Alisheke; Hang'ombe, Bernard M; Mweene, Aaron S; Takada, Ayato; Sugimoto, Chihiro; Suzuki, Yasuhiko; Kida, Hiroshi; Sawa, Hirofumi

    2015-06-01

    The role played by bats as a potential source of transmission of Leptospira spp. to humans is poorly understood, despite various pathogenic Leptospira spp. being identified in these mammals. Here, we investigated the prevalence and diversity of pathogenic Leptospira spp. that infect the straw-colored fruit bat (Eidolon helvum). We captured this bat species, which is widely distributed in Africa, in Zambia during 2008-2013. We detected the flagellin B gene (flaB) from pathogenic Leptospira spp. in kidney samples from 79 of 529 E. helvum (14.9%) bats. Phylogenetic analysis of 70 flaB fragments amplified from E. helvum samples and previously reported sequences revealed that 12 of the fragments grouped with Leptospira borgpetersenii and Leptospira kirschneri; however, the remaining 58 flaB fragments appeared not to be associated with any reported species. Additionally, the 16S ribosomal RNA gene (rrs) amplified from 27 randomly chosen flaB-positive samples was compared with previously reported sequences, including bat-derived Leptospira spp. All 27 rrs fragments clustered into a pathogenic group. Eight fragments were located in unique branches; the other 19 fragments were closely related to Leptospira spp. detected in bats. These results show that rrs sequences in bats are genetically related to each other without regional variation, suggesting that Leptospira are evolutionarily well-adapted to bats and have uniquely evolved in the bat population. Our study indicates that pathogenic Leptospira spp. in E. helvum in Zambia have unique genotypes. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.

  14. Properties of some monkey DNA sequences obtained by a procedure that enriches for DNA replication origins.

    PubMed

    Zannis-Hadjopoulos, M; Kaufmann, G; Wang, S S; Lechner, R L; Karawya, E; Hesse, J; Martin, R G

    1985-07-01

    Twelve clones of monkey DNA obtained by a procedure that enriches 10(3)- to 10(4)-fold for nascent sequences activated early in S phase (G. Kaufmann, M. Zannis-Hadjopoulos, and R. G. Martin, Mol. Cell. Biol. 5:721-727, 1985) have been examined. Only 2 of the 12 ors sequences (origin-enriched sequences) are unique (ors1 and ors8). Three contain the highly reiterated Alu family (ors3, ors9, and ors11). One contains the highly reiterated alpha-satellite family (ors12), but none contain the Kpn family. Those remaining contain middle repetitive sequences. Two examples of the same middle repetitive sequence were found (ors2 and ors6). Three of the middle repetitive sequences (the ors2-ors6 pair, ors5, and ors10) are moderately dispersed; one (ors4) is highly dispersed. The last, ors7, has been mapped to the bona fide replication origin of the D loop of mitochondrial DNA. Of the nine ors sequences tested, half possess snapback (intrachain reannealing) properties.

  15. Next Generation Sequencing Technologies: The Doorway to the Unexplored Genomics of Non-Model Plants

    PubMed Central

    Unamba, Chibuikem I. N.; Nag, Akshay; Sharma, Ram K.

    2015-01-01

    Non-model plants, i.e., species with one or more characteristics such as a long life cycle, difficulty of growth in the laboratory, or poor fecundity, were previously excluded from sequencing projects because of the high running cost of Sanger sequencing. Consequently, information about their genomics and key biological processes is inadequate. However, the advent of fast and cost-effective next generation sequencing (NGS) platforms in the recent past has enabled the unearthing of certain characteristic gene structures unique to these species. It has also aided in gaining insight into the mechanisms underlying gene expression and secondary metabolism, and has facilitated the development of genomic resources for diversity characterization, evolutionary analysis and marker-assisted breeding even without prior availability of genomic sequence information. In this review we explore how different NGS platforms, as well as recent advances in NGS-based high-throughput genotyping technologies, are rewarding efforts on de novo whole genome/transcriptome sequencing and on the development of genome-wide sequence-based marker resources for improvement of non-model crops that are less costly than phenotyping. PMID:26734016

  16. Vander Lugt correlation of DNA sequence data

    NASA Astrophysics Data System (ADS)

    Christens-Barry, William A.; Hawk, James F.; Martin, James C.

    1990-12-01

    DNA, the molecule containing the genetic code of an organism, is a linear chain of subunits. It is the sequence of subunits, of which there are four kinds, that constitutes the unique blueprint of an individual. This sequence is the focus of a large number of analyses performed by an army of geneticists, biologists, and computer scientists. Most of these analyses entail searches for specific subsequences within the larger set of sequence data. Thus, most analyses are essentially pattern recognition or correlation tasks. Yet, there are special features to such analysis that influence the strategy and methods of an optical pattern recognition approach. While the serial processing employed in digital electronic computers remains the main engine of sequence analyses, there is no fundamental reason that more efficient parallel methods cannot be used. We describe an approach using optical pattern recognition (OPR) techniques based on matched spatial filtering. This allows parallel comparison of large blocks of sequence data. In this study we have simulated a Vander Lugt architecture implementing our approach. Searches for specific target sequence strings within a block of DNA sequence from the ColE1 plasmid are performed.
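
    The idea that a subsequence search is a correlation task can be illustrated digitally. The following is a minimal sketch, not the optical architecture of the record above: it assumes each base is encoded as its own binary channel (a one-hot encoding chosen here for illustration) and slides the target along the sequence, so that the correlation score at each offset equals the number of matching bases. The function names are hypothetical.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as four binary channels, one per base."""
    return {b: [1 if c == b else 0 for c in seq] for b in BASES}


def correlate(sequence, target):
    """Cross-correlate target against sequence, channel by channel.

    The score at each offset is the sum of per-channel products,
    which equals the number of positions where the bases agree.
    """
    s, t = one_hot(sequence), one_hot(target)
    n, m = len(sequence), len(target)
    scores = []
    for offset in range(n - m + 1):
        score = sum(
            s[b][offset + k] * t[b][k]
            for b in BASES
            for k in range(m)
        )
        scores.append(score)
    return scores


def find_matches(sequence, target):
    """Offsets where the correlation peak reaches the full target length."""
    peak = len(target)
    return [i for i, sc in enumerate(correlate(sequence, target)) if sc == peak]
```

    Lowering the peak threshold below the target length would report approximate matches as well, which is one reason a correlation formulation can be attractive for sequence search.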

  17. Sequence of the toxic shock syndrome toxin gene (tstH) borne by strains of Staphylococcus aureus isolated from patients with Kawasaki syndrome.

    PubMed Central

    Deresiewicz, R L; Flaxenburg, J; Leng, K; Kasper, D L

    1996-01-01

    To explore whether a novel staphylococcal clone or structural variant of toxic shock syndrome toxin 1 is associated with Kawasaki syndrome, six toxigenic strains of Staphylococcus aureus from Kawasaki syndrome patients were studied. The strains were divisible into two groups based on phenotypic and genotypic characteristics and are therefore unequivocally not clonal. Portions of the tstH genes of each strain were sequenced. Three were sequenced in their entirety, while the remainder were sequenced from codon 66 to codon 137 of the mature protein only. Two of the former group differed slightly in the sequences of their signal peptides relative to the sequence published for the tstH signal peptide. Those differences did not affect toxin processing or secretion. The sequenced portions of the regions encoding mature toxic shock syndrome toxin 1 were identical in all six strains and corresponded exactly to the published sequence of tstH. No evidence was found for the existence of a structural variant of tstH uniquely associated with Kawasaki syndrome. PMID:8757881

  18. Microfluidic droplet enrichment for targeted sequencing

    PubMed Central

    Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

    2015-01-01

    Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629

  19. U50: A New Metric for Measuring Assembly Output Based on Non-Overlapping, Target-Specific Contigs.

    PubMed

    Castro, Christina J; Ng, Terry Fei Fan

    2017-11-01

    Advances in next-generation sequencing technologies enable routine genome sequencing, generating millions of short reads. A crucial step for full genome analysis is the de novo assembly, and currently, performance of different assembly methods is measured by a metric called N50. However, the N50 value can produce skewed, inaccurate results when complex data are analyzed, especially for viral and microbial datasets. To provide a better assessment of assembly output, we developed a new metric called U50. The U50 identifies unique, target-specific contigs by using a reference genome as baseline, aiming at circumventing some limitations that are inherent to the N50 metric. Specifically, the U50 program removes overlapping sequence of multiple contigs by utilizing a mask array, so the performance of the assembly is only measured by unique contigs. We compared simulated and real datasets by using U50 and N50, and our results demonstrated that U50 has the following advantages over N50: (1) reducing erroneously large N50 values due to a poor assembly, (2) eliminating overinflated N50 values caused by large measurements from overlapping contigs, (3) eliminating diminished N50 values caused by an abundance of small contigs, and (4) allowing comparisons across different platforms or samples based on the new percentage-based metric UG50%. The use of the U50 metric allows for a more accurate measure of assembly performance by analyzing only the unique, non-overlapping contigs. In addition, most viral and microbial sequencing have high background noise (i.e., host and other non-targets), which contributes to having a skewed, misrepresented N50 value; this is corrected by U50. Also, the UG50% can be used to compare assembly results from different samples or studies, the cross-comparisons of which cannot be performed with N50.
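
    The difference between the two metrics can be sketched in a few lines. The N50 function below follows the standard definition. The U50 function is only one plausible reading of the abstract, not the published program: it assumes each contig has already been placed on the reference as a half-open (start, end) interval, processes contigs from longest to shortest, and uses a boolean mask array so that every reference position is credited to at most one contig. All names and the input format are hypothetical.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def unique_lengths(ref_len, alignments):
    """Count, per contig, the reference positions not already covered.

    alignments: list of (start, end) half-open intervals on the reference
    (an assumed input format; real data would come from an aligner).
    """
    mask = [False] * ref_len
    result = []
    by_length = sorted(alignments, key=lambda a: a[1] - a[0], reverse=True)
    for start, end in by_length:
        new = 0
        for pos in range(max(start, 0), min(end, ref_len)):
            if not mask[pos]:
                mask[pos] = True
                new += 1
        if new:  # contigs that add no unique sequence are dropped
            result.append(new)
    return result


def u50(ref_len, alignments):
    """N50-style statistic computed over unique, non-overlapping lengths."""
    return n50(unique_lengths(ref_len, alignments))
```

    With three redundant contigs covering positions 0-40 and one contig covering 40-100, the raw lengths are dominated by the duplicates, whereas the masked lengths reduce to one 60-base and one 40-base unique contig.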

  20. Structural insights into the evolution of a sexy protein: novel topology and restricted backbone flexibility in a hypervariable pheromone from the red-legged salamander, Plethodon shermani.

    PubMed

    Wilburn, Damien B; Bowen, Kathleen E; Doty, Kari A; Arumugam, Sengodagounder; Lane, Andrew N; Feldhoff, Pamela W; Feldhoff, Richard C

    2014-01-01

    In response to pervasive sexual selection, protein sex pheromones often display rapid mutation and accelerated evolution of corresponding gene sequences. For proteins, the general dogma is that structure is maintained even as sequence or function may rapidly change. This phenomenon is well exemplified by the three-finger protein (TFP) superfamily: a diverse class of vertebrate proteins co-opted for many biological functions - such as components of snake venoms, regulators of the complement system, and coordinators of amphibian limb regeneration. All of the >200 structurally characterized TFPs adopt the namesake "three-finger" topology. In male red-legged salamanders, the TFP pheromone Plethodontid Modulating Factor (PMF) is a hypervariable protein such that, through extensive gene duplication and pervasive sexual selection, individual male salamanders express more than 30 unique isoforms. However, it remained unclear how this accelerated evolution affected the protein structure of PMF. Using LC/MS-MS and multidimensional NMR, we report the 3D structure of the most abundant PMF isoform, PMF-G. The high resolution structural ensemble revealed a highly modified TFP structure, including a unique disulfide bonding pattern and loss of secondary structure, that define a novel protein topology with greater backbone flexibility in the third peptide finger. Sequence comparison, models of molecular evolution, and homology modeling together support that this flexible third finger is the most rapidly evolving segment of PMF. Combined with PMF sequence hypervariability, this structural flexibility may enhance the plasticity of PMF as a chemical signal by permitting potentially thousands of structural conformers. We propose that the flexible third finger plays a critical role in PMF:receptor interactions. As female receptors co-evolve, this flexibility may allow PMF to still bind its receptor(s) without the immediate need for complementary mutations. 
Consequently, this unique adaptation may establish new paradigms for how receptor:ligand pairs co-evolve, in particular with respect to sexual conflict.

  1. Rapid isolation of microsatellite DNAs and identification of polymorphic mitochondrial DNA regions in the fish rotan (Perccottus glenii) invading European Russia

    USGS Publications Warehouse

    King, Timothy L.; Eackles, Michael S.; Reshetnikov, Andrey N.

    2015-01-01

    Human-mediated translocations and subsequent large-scale colonization by the invasive fish rotan (Perccottus glenii Dybowski, 1877; Perciformes, Odontobutidae), also known as Amur or Chinese sleeper, has resulted in dramatic transformations of small lentic ecosystems. However, no detailed genetic information exists on population structure, levels of effective movement, or relatedness among geographic populations of P. glenii within the European part of the range. We used massively parallel genomic DNA shotgun sequencing on the semiconductor-based Ion Torrent Personal Genome Machine (PGM) sequencing platform to identify nuclear microsatellite and mitochondrial DNA sequences in P. glenii from European Russia. Here we describe the characterization of nine nuclear microsatellite loci, ascertain levels of allelic diversity, heterozygosity, and demographic status of P. glenii collected from Ilev, Russia, one of several initial introduction points in European Russia. In addition, we mapped sequence reads to the complete P. glenii mitochondrial DNA sequence to identify polymorphic regions. Nuclear microsatellite markers developed for P. glenii yielded sufficient genetic diversity to: (1) produce unique multilocus genotypes; (2) elucidate structure among geographic populations; and (3) provide unique perspectives for analysis of population sizes and historical demographics. Among 4.9 million filtered P. glenii Ion Torrent PGM sequence reads, 11,304 mapped to the mitochondrial genome (NC_020350). This resulted in 100 % coverage of this genome to a mean coverage depth of 102X. A total of 130 variable sites were observed between the publicly available genome from China and the studied composite mitochondrial genome. Among these, 82 were diagnostic and monomorphic between the mitochondrial genomes and distributed among 15 genome regions. The polymorphic sites (N = 48) were distributed among 11 mitochondrial genome regions. 
Our results also indicate that sequence reads generated from two three-hour runs on the Ion Torrent PGM can generate a sufficient number of nuclear and mitochondrial markers to improve understanding of the evolutionary and ecological dynamics of non-model and in particular, invasive species.

  2. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery.

    PubMed

    Kaur, Sukhjiwan; Cogan, Noel O I; Pembleton, Luke W; Shinozuka, Maiko; Savin, Keith W; Materne, Michael; Forster, John W

    2011-05-25

    Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped, in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.
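
    Identifying SSR-containing sequences amounts to scanning for short motifs repeated in tandem. The sketch below is illustrative only: the abstract does not state the tool or the thresholds used, so the motif size range (2 to 6 bases) and the minimum of five repeats are assumptions, and the function name is hypothetical.

```python
import re


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    A motif of 2-6 bases must occur at least `min_repeats` times in a row.
    The lazy quantifier makes the regex prefer the shortest motif, so
    AGAGAGAGAG is reported as (AG)5 rather than (AGAG)2 plus a remainder.
    Note that a long homopolymer run (e.g. ten A's) is reported as (AA)5.
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits
```

    In a marker-development workflow, the flanking sequence around each hit would then be passed to a primer-design step, which is outside the scope of this sketch.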

  3. Microbial Diversity in a Hydrocarbon- and Chlorinated-Solvent-Contaminated Aquifer Undergoing Intrinsic Bioremediation

    PubMed Central

    Dojka, Michael A.; Hugenholtz, Philip; Haack, Sheridan K.; Pace, Norman R.

    1998-01-01

    A culture-independent molecular phylogenetic approach was used to survey constituents of microbial communities associated with an aquifer contaminated with hydrocarbons (mainly jet fuel) and chlorinated solvents undergoing intrinsic bioremediation. Samples were obtained from three redox zones: methanogenic, methanogenic-sulfate reducing, and iron or sulfate reducing. Small-subunit rRNA genes were amplified directly from aquifer material DNA by PCR with universally conserved or Bacteria- or Archaea-specific primers and were cloned. A total of 812 clones were screened by restriction fragment length polymorphisms (RFLP), approximately 50% of which were unique. All RFLP types that occurred more than once in the libraries, as well as many of the unique types, were sequenced. A total of 104 (94 bacterial and 10 archaeal) sequence types were determined. Of the 94 bacterial sequence types, 10 have no phylogenetic association with known taxonomic divisions and are phylogenetically grouped in six novel division level groups (candidate divisions WS1 to WS6); 21 belong to four recently described candidate divisions with no cultivated representatives (OP5, OP8, OP10, and OP11); and 63 are phylogenetically associated with 10 well-recognized divisions. The physiology of two particularly abundant sequence types obtained from the methanogenic zone could be inferred from their phylogenetic association with groups of microorganisms with a consistent phenotype. One of these sequence types is associated with the genus Syntrophus; Syntrophus spp. produce energy from the anaerobic oxidation of organic acids, with the production of acetate and hydrogen. The organism represented by the other sequence type is closely related to Methanosaeta spp., which are known to be capable of energy generation only through aceticlastic methanogenesis. 
We hypothesize, therefore, that the terminal step of hydrocarbon degradation in the methanogenic zone of the aquifer is aceticlastic methanogenesis and that the microorganisms represented by these two sequence types occur in syntrophic association. PMID:9758812

  4. Microbial diversity in a hydrocarbon- and chlorinated-solvent- contaminated aquifer undergoing intrinsic bioremediation

    USGS Publications Warehouse

    Dojka, M.A.; Hugenholtz, P.; Haack, S.K.; Pace, N.R.

    1998-01-01

    A culture-independent molecular phylogenetic approach was used to survey constituents of microbial communities associated with an aquifer contaminated with hydrocarbons (mainly jet fuel) and chlorinated solvents undergoing intrinsic bioremediation. Samples were obtained from three redox zones: methanogenic, methanogenic-sulfate reducing, and iron or sulfate reducing. Small-subunit rRNA genes were amplified directly from aquifer material DNA by PCR with universally conserved or Bacteria- or Archaea-specific primers and were cloned. A total of 812 clones were screened by restriction fragment length polymorphisms (RFLP), approximately 50% of which were unique. All RFLP types that occurred more than once in the libraries, as well as many of the unique types, were sequenced. A total of 104 (94 bacterial and 10 archaeal) sequence types were determined. Of the 94 bacterial sequence types, 10 have no phylogenetic association with known taxonomic divisions and are phylogenetically grouped in six novel division level groups (candidate divisions WS1 to WS6); 21 belong to four recently described candidate divisions with no cultivated representatives (OP5, OP8, OP10, and OP11); and 63 are phylogenetically associated with 10 well-recognized divisions. The physiology of two particularly abundant sequence types obtained from the methanogenic zone could be inferred from their phylogenetic association with groups of microorganisms with a consistent phenotype. One of these sequence types is associated with the genus Syntrophus; Syntrophus spp. produce energy from the anaerobic oxidation of organic acids, with the production of acetate and hydrogen. The organism represented by the other sequence type is closely related to Methanosaeta spp., which are known to be capable of energy generation only through aceticlastic methanogenesis. We hypothesize, therefore, that the terminal step of hydrocarbon degradation in the methanogenic zone of the aquifer is aceticlastic methanogenesis and that the microorganisms represented by these two sequence types occur in syntrophic association.

  5. Microbial community analysis of the hypersaline water of the Dead Sea using high-throughput amplicon sequencing.

    PubMed

    Jacob, Jacob H; Hussein, Emad I; Shakhatreh, Muhamad Ali K; Cornelison, Christopher T

    2017-10-01

    Amplicon sequencing using next-generation technology (bTEFAP®) has been utilized in describing the diversity of Dead Sea microbiota. The investigated area is a well-known salt lake in the western part of Jordan found in the lowest geographical location in the world (more than 420 m below sea level) and characterized by extreme salinity (approximately 34%) in addition to other extreme conditions (low pH, unique ionic composition different from sea water). DNA was extracted from Dead Sea water. A total of 314,310 small subunit rRNA (SSU rRNA) sequences were parsed, and 288,452 sequences were then clustered. For alpha diversity analysis, the sample was rarefied to 3,000 sequences. The Shannon-Wiener index curve plot reached a plateau at approximately 3,000 sequences, indicating that sequencing depth was sufficient to capture the full scope of microbial diversity. Archaea was found to be dominating the sequences (52%), whereas Bacteria constitute 45% of the sequences. Altogether, prokaryotic sequences (which constitute 97% of all sequences) were found to predominate. The findings expand on previous studies by using high-throughput amplicon sequencing to describe the microbial community in an environment which in recent years has been shown to hide some interesting diversity. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
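
    Rarefaction and the Shannon-Wiener index are both short computations. The sketch below assumes per-taxon read counts as input and uses the natural logarithm; the study's own pipeline, taxonomic units and log base are not given in the abstract, so the function names and the example counts are purely illustrative.

```python
import math
import random


def shannon(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over taxa with reads."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement; return per-taxon counts.

    Requires depth <= sum(counts). The seed is fixed only so that the
    sketch is reproducible.
    """
    pool = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    rng = random.Random(seed)
    out = [0] * len(counts)
    for taxon in rng.sample(pool, depth):
        out[taxon] += 1
    return out


def rarefaction_curve(counts, depths, seed=0):
    """Shannon index at each subsampling depth, for a plateau check."""
    return [(d, shannon(rarefy(counts, d, seed))) for d in depths]
```

    Plotting the output of rarefaction_curve against depth and looking for the point where the index stops rising is the kind of plateau check the abstract describes.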

  6. Efficient Identification of Murine M2 Macrophage Peptide Targeting Ligands by Phage Display and Next-Generation Sequencing.

    PubMed

    Liu, Gary W; Livesay, Brynn R; Kacherovsky, Nataly A; Cieslewicz, Maryelise; Lutz, Emi; Waalkes, Adam; Jensen, Michael C; Salipante, Stephen J; Pun, Suzie H

    2015-08-19

    Peptide ligands are used to increase the specificity of drug carriers to their target cells and to facilitate intracellular delivery. One method to identify such peptide ligands, phage display, enables high-throughput screening of peptide libraries for ligands binding to therapeutic targets of interest. However, conventional methods for identifying target binders in a library by Sanger sequencing are low-throughput, labor-intensive, and provide a limited perspective (<0.01%) of the complete sequence space. Moreover, the small sample space can be dominated by nonspecific, preferentially amplifying "parasitic sequences" and plastic-binding sequences, which may lead to the identification of false positives or exclude the identification of target-binding sequences. To overcome these challenges, we employed next-generation Illumina sequencing to couple high-throughput screening and high-throughput sequencing, enabling more comprehensive access to the phage display library sequence space. In this work, we define the hallmarks of binding sequences in next-generation sequencing data, and develop a method that identifies several target-binding phage clones for murine, alternatively activated M2 macrophages with a high (100%) success rate: sequences and binding motifs were reproducibly present across biological replicates; binding motifs were identified across multiple unique sequences; and an unselected, amplified library accurately filtered out parasitic sequences. In addition, we validate the Multiple Em for Motif Elicitation tool as an efficient and principled means of discovering binding sequences.

  7. Deep sequencing in library selection projects: what insight does it bring?

    PubMed

    Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

    2015-08-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Deep sequencing in library selection projects: what insight does it bring?

    PubMed Central

    Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

    2015-01-01

    High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649

  9. Phylogenetic tree of 16s rRNA sequences from sulfate-reducing bacteria in a sandy marine sediment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devereux, R.; Mundfrom, G.W.

    1994-01-01

    Phylogenetic divergence among sulfate-reducing bacteria in an estuarine sediment sample was investigated by PCR amplification and comparison of partial 16S rDNA sequences. Twenty unique 16S rDNA sequences were found, 12 from delta subclass bacteria based on overall sequence similarity (82-91%). Two successive PCR amplifications were used to obtain and clone the 16S rDNA. The first reaction used templates derived from phosphate-buffered saline washed sediment with primers designed to amplify nearly full-length bacterial domain 16S rDNA. A product from the first reaction was used as template in a second reaction with primers designed to selectively amplify a region of 16S rDNA genes of sulfate-reducing bacteria. A phylogenetic tree incorporating the cloned sequences suggests the presence of yet to be cultivated lines of sulfate-reducing bacteria within the sediment sample.

  10. Aircraft stress sequence development: A complex engineering process made simple

    NASA Technical Reports Server (NTRS)

    Schrader, K. H.; Butts, D. G.; Sparks, W. A.

    1994-01-01

    Development of stress sequences for critical aircraft structure requires flight measured usage data, known aircraft loads, and established relationships between aircraft flight loads and structural stresses. Resulting cycle-by-cycle stress sequences can be directly usable for crack growth analysis and coupon spectra tests. Often, an expert in loads and spectra development manipulates the usage data into a typical sequence of representative flight conditions for which loads and stresses are calculated. For a fighter/trainer type aircraft, this effort is repeated many times for each of the fatigue critical locations (FCL) resulting in expenditure of numerous engineering hours. The Aircraft Stress Sequence Computer Program (ACSTRSEQ), developed by Southwest Research Institute under contract to San Antonio Air Logistics Center, presents a unique approach for making complex technical computations in a simple, easy to use method. The program is written in Microsoft Visual Basic for the Microsoft Windows environment.

  11. Modeling read counts for CNV detection in exome sequencing data.

    PubMed

    Love, Michael I; Myšičková, Alena; Sun, Ruping; Kalscheuer, Vera; Vingron, Martin; Haas, Stefan A

    2011-11-08

    Varying depth of high-throughput sequencing reads along a chromosome makes it possible to observe copy number variants (CNVs) in a sample relative to a reference. In exome and other targeted sequencing projects, technical factors increase variation in read depth while reducing the number of observed locations, adding difficulty to the problem of identifying CNVs. We present a hidden Markov model for detecting CNVs from raw read count data, using background read depth from a control set as well as other positional covariates such as GC-content. The model, exomeCopy, is applied to a large chromosome X exome sequencing project identifying a list of large unique CNVs. CNVs predicted by the model and experimentally validated are then recovered using a cross-platform control set from publicly available exome sequencing data. Simulations show high sensitivity for detecting heterozygous and homozygous CNVs, outperforming normalization and state-of-the-art segmentation methods.
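
    The general shape of a hidden Markov model over read counts can be shown with a deliberately simplified sketch. This is not exomeCopy: it assumes Poisson emissions whose mean is the control-set background depth scaled by a copy ratio, three states with ratios suited to a diploid region, a single "stay" probability for transitions, and no covariates such as GC-content. All names and parameter values are hypothetical.

```python
import math

# Assumed copy ratios relative to the control background (diploid region).
STATES = {"deletion": 0.5, "normal": 1.0, "duplication": 1.5}


def log_poisson(k, lam):
    """Log probability of observing count k under a Poisson with mean lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi(counts, background, stay=0.99):
    """Most likely state path given per-window counts and background depth."""
    names = list(STATES)
    switch = (1.0 - stay) / (len(names) - 1)

    def trans(a, b):
        return math.log(stay if a == b else switch)

    # Uniform prior over states for the first window.
    score = {
        s: -math.log(len(names)) + log_poisson(counts[0], background[0] * STATES[s])
        for s in names
    }
    back = []
    for k, bg in zip(counts[1:], background[1:]):
        new, ptr = {}, {}
        for s in names:
            prev = max(names, key=lambda p: score[p] + trans(p, s))
            new[s] = score[prev] + trans(prev, s) + log_poisson(k, bg * STATES[s])
            ptr[s] = prev
        score = new
        back.append(ptr)

    # Trace back from the best final state.
    state = max(names, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

    The high "stay" probability is what lets the model tolerate noisy individual windows while still calling a run of consistently low or high counts as a single variant.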

  12. Shotgun Optical Maps of the Whole Escherichia coli O157:H7 Genome

    PubMed Central

    Lim, Alex; Dimalanta, Eileen T.; Potamousis, Konstantinos D.; Yen, Galex; Apodoca, Jennifer; Tao, Chunhong; Lin, Jieyi; Qi, Rong; Skiadas, John; Ramanathan, Arvind; Perna, Nicole T.; Plunkett, Guy; Burland, Valerie; Mau, Bob; Hackett, Jeremiah; Blattner, Frederick R.; Anantharaman, Thomas S.; Mishra, Bhubaneswar; Schwartz, David C.

    2001-01-01

    We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from ensembles of single DNA molecules. Shotgun sequencing of bacterial genomes remains labor-intensive, despite advances in sequencing technology. This is partly due to manual intervention required during the last stages of finishing. The applicability of optical mapping to this problem was enhanced by advances in machine vision techniques that improved mapping throughput and created a path to full automation of mapping. Comparisons were made between maps and sequence data that characterized sequence gaps and guided nascent assemblies. PMID:11544203

  13. Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

    PubMed

    Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

    1999-12-16

    The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.

  14. SACCHARIS: an automated pipeline to streamline discovery of carbohydrate active enzyme activities within polyspecific families and de novo sequence datasets.

    PubMed

    Jones, Darryl R; Thomas, Dallas; Alger, Nicholas; Ghavidel, Ata; Inglis, G Douglas; Abbott, D Wade

    2018-01-01

    Deposition of new genetic sequences in online databases is expanding at an unprecedented rate. As a result, sequence identification continues to outpace functional characterization of carbohydrate active enzymes (CAZymes). In this paradigm, the discovery of enzymes with novel functions is often hindered by high volumes of uncharacterized sequences particularly when the enzyme sequence belongs to a family that exhibits diverse functional specificities (i.e., polyspecificity). Therefore, to direct sequence-based discovery and characterization of new enzyme activities we have developed an automated in silico pipeline entitled: Sequence Analysis and Clustering of CarboHydrate Active enzymes for Rapid Informed prediction of Specificity (SACCHARIS). This pipeline streamlines the selection of uncharacterized sequences for discovery of new CAZyme or CBM specificity from families currently maintained on the CAZy website or within user-defined datasets. SACCHARIS was used to generate a phylogenetic tree of GH43, a CAZyme family with defined subfamily designations. This analysis confirmed that large datasets can be organized into sequence clusters of manageable sizes that possess related functions. Seeding this tree with a GH43 sequence from Bacteroides dorei DSM 17855 (BdGH43b) revealed it partitioned as a single sequence within the tree. This pattern was consistent with it possessing a unique enzyme activity for GH43 as BdGH43b is the first α-glucanase described for this family. The capacity of SACCHARIS to extract and cluster characterized carbohydrate binding module sequences was demonstrated using family 6 CBMs (i.e., CBM6s). This CBM family displays a polyspecific ligand binding profile and contains many structurally determined members. Using SACCHARIS to identify a cluster of divergent sequences, a CBM6 sequence from a unique clade was demonstrated to bind yeast mannan, which represents the first description of an α-mannan binding CBM. 
Additionally, we have performed a CAZome analysis of an in-house sequenced bacterial genome and a comparative analysis of B. thetaiotaomicron VPI-5482 and B. thetaiotaomicron 7330, to demonstrate that SACCHARIS can generate "CAZome fingerprints", which differentiate between the saccharolytic potential of two related strains in silico. Establishing sequence-function and sequence-structure relationships in polyspecific CAZyme families are promising approaches for streamlining enzyme discovery. SACCHARIS facilitates this process by embedding CAZyme and CBM family trees generated from biochemically to structurally characterized sequences, with protein sequences that have unknown functions. In addition, these trees can be integrated with user-defined datasets (e.g., genomics, metagenomics, and transcriptomics) to inform experimental characterization of new CAZymes or CBMs not currently curated, and for researchers to compare differential sequence patterns between entire CAZomes. In this light, SACCHARIS provides an in silico tool that can be tailored for enzyme bioprospecting in datasets of increasing complexity and for diverse applications in glycobiotechnology.

  15. Can natural proteins designed with 'inverted' peptide sequences adopt native-like protein folds?

    PubMed

    Sridhar, Settu; Guruprasad, Kunchur

    2014-01-01

    We have carried out a systematic computational analysis on a representative dataset of proteins of known three-dimensional structure, in order to evaluate whether it would be possible to 'swap' certain short peptide sequences in naturally occurring proteins with their corresponding 'inverted' peptides and generate 'artificial' proteins that are predicted to retain native-like protein fold. The analysis of 3,967 representative proteins from the Protein Data Bank revealed 102,677 unique identical inverted peptide sequence pairs that vary in sequence length between 5-12 and 18 amino acid residues. Our analysis illustrates with examples that such 'artificial' proteins may be generated by identifying peptides with 'similar structural environment' and by using comparative protein modeling and validation studies. Our analysis suggests that natural proteins may be tolerant to accommodating such peptides.
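
    The search for inverted peptide pairs described above can be illustrated with a minimal sketch. It assumes that "inverted" means the same residues read in reverse order; the function name and the toy sequences are hypothetical and are not taken from the study.

```python
def inverted_peptide_pairs(proteins, k):
    """Return the set of (peptide, reversed peptide) pairs of length k
    for which both members occur somewhere in the protein set."""
    # Collect every distinct peptide of length k.
    seen = set()
    for seq in proteins.values():
        for i in range(len(seq) - k + 1):
            seen.add(seq[i:i + k])
    # Keep a pair once; "pep < rev" skips palindromes and mirrored duplicates.
    pairs = set()
    for pep in seen:
        rev = pep[::-1]
        if rev in seen and pep < rev:
            pairs.add((pep, rev))
    return pairs


# Toy input (hypothetical sequences, not PDB entries).
toy = {"protA": "MKTAYIAKQR", "protB": "GGRQKAIYLL"}
pairs = inverted_peptide_pairs(toy, 5)
```

    A real analysis would also compare the structural environment of each pair, which this sketch does not attempt.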

  16. Distribution and Features of the Six Classes of Peroxiredoxins

    PubMed Central

    Poole, Leslie B.; Nelson, Kimberly J.

    2016-01-01

    Peroxiredoxins are cysteine-dependent peroxide reductases that group into 6 different, structurally discernable classes. In 2011, our research team reported the application of a bioinformatic approach called active site profiling to extract active site-proximal sequence segments from the 29 distinct, structurally-characterized peroxiredoxins available at the time. These extracted sequences were then used to create unique profiles for the six groups which were subsequently used to search GenBank(nr), allowing identification of ∼3500 peroxiredoxin sequences and their respective subgroups. Summarized in this minireview are the features and phylogenetic distributions of each of these peroxiredoxin subgroups; an example is also provided illustrating the use of the web accessible, searchable database known as PREX to identify subfamily-specific peroxiredoxin sequences for the organism Vitis vinifera (grape). PMID:26810075

  17. Unique clade of alphaproteobacterial endosymbionts induces complete cytoplasmic incompatibility in the coconut beetle

    PubMed Central

    Takano, Shun-ichiro; Tuda, Midori; Takasu, Keiji; Furuya, Naruto; Imamura, Yuya; Kim, Sangwan; Tashiro, Kosuke; Iiyama, Kazuhiro; Tavares, Matias; Amaral, Acacio Cardoso

    2017-01-01

    Maternally inherited bacterial endosymbionts in arthropods manipulate host reproduction to increase the fitness of infected females. Cytoplasmic incompatibility (CI) is one such manipulation, in which uninfected females produce few or no offspring when they mate with infected males. To date, two bacterial endosymbionts, Wolbachia and Cardinium, have been reported as CI inducers. Only Wolbachia induces complete CI, which causes 100% offspring mortality in incompatible crosses. Here we report a third CI inducer that belongs to a unique clade of Alphaproteobacteria detected within the coconut beetle, Brontispa longissima. This beetle comprises two cryptic species, the Asian clade and the Pacific clade, which show incompatibility in hybrid crosses. Different bacterial endosymbionts, a unique clade of Alphaproteobacteria in the Pacific clade and Wolbachia in the Asian clade, induced bidirectional CI between hosts. The former induced complete CI (100% mortality), whereas the latter induced partial CI (70% mortality). Illumina MiSeq sequencing and denaturing gradient gel electrophoresis patterns showed that the predominant bacterium detected in the Pacific clade of B. longissima was this unique clade of Alphaproteobacteria alone, indicating that this endosymbiont was responsible for the complete CI. Sex distortion did not occur in any of the tested crosses. The 1,160 bp of 16S rRNA gene sequence obtained for this endosymbiont had only 89.3% identity with that of Wolbachia, indicating that it can be recognized as a distinct species. We discuss the potential use of this bacterium as a biological control agent. PMID:28533374

  18. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    PubMed

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  19. A cost effective 5΄ selective single cell transcriptome profiling approach with improved UMI design

    PubMed Central

    Arguel, Marie-Jeanne; LeBrigand, Kevin; Paquet, Agnès; Ruiz García, Sandra; Zaragosi, Laure-Emmanuelle; Waldmann, Rainer

    2017-01-01

    Single cell RNA sequencing approaches are instrumental in studies of cell-to-cell variability. 5΄ selective transcriptome profiling approaches allow simultaneous definition of the transcription start site and have advantages over 3΄ selective approaches which just provide internal sequences close to the 3΄ end. The only currently existing 5΄ selective approach requires costly and labor intensive fragmentation and cell barcoding after cDNA amplification. We developed an optimized 5΄ selective workflow where all the cell indexing is done prior to fragmentation. With our protocol, cell indexing can be performed in the Fluidigm C1 microfluidic device, resulting in a significant reduction of cost and labor. We also designed optimized unique molecular identifiers that show less sequence bias and vulnerability towards sequencing errors resulting in an improved accuracy of molecule counting. We provide comprehensive experimental workflows for Illumina and Ion Proton sequencers that allow single cell sequencing in a cost range comparable to qPCR assays. PMID:27940562

  20. Phylogenetic relations of humans and African apes from DNA sequences in the Psi eta-globin region

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miyamoto, M.M.; Slightom, J.L.; Goodman, M.

    Sequences from the upstream and downstream flanking DNA regions of the Psi eta-globin locus in Pan troglodytes (common chimpanzee), Gorilla gorilla (gorilla), and Pongo pygmaeus (orangutan, the closest living relative to Homo, Pan, and Gorilla) provided further data for evaluating the phylogenetic relations of humans and African apes. These newly sequenced orthologs (an additional 4.9 kilobase pairs (kbp) for each species) were combined with published Psi eta-gene sequences and then compared to the same orthologous stretch (a continuous 7.1-kbp region) available for humans. Phylogenetic analysis of these nucleotide sequences by the parsimony method indicated (i) that human and chimpanzee are more closely related to each other than either is to gorilla and (ii) that the slowdown in the rate of sequence evolution evident in higher primates is especially pronounced in humans. These results indicate that features unique to African apes (but not to humans) are primitive and that even local molecular clocks should be applied with caution.

  1. Construction of a robust microarray from a non-model species (largemouth bass) using pyrosequencing technology

    PubMed Central

    Garcia-Reyero, Natàlia; Griffitt, Robert J.; Liu, Li; Kroll, Kevin J.; Farmerie, William G.; Barber, David S.; Denslow, Nancy D.

    2009-01-01

    A novel custom microarray for largemouth bass (Micropterus salmoides) was designed with sequences obtained from a normalized cDNA library using the 454 Life Sciences GS-20 pyrosequencer. This approach yielded in excess of 58 million bases of high-quality sequence. The sequence information was combined with 2,616 reads obtained by traditional suppressive subtractive hybridizations to derive a total of 31,391 unique sequences. Annotation and coding sequences were predicted for these transcripts where possible. 16,350 annotated transcripts were selected as target sequences for the design of the custom largemouth bass oligonucleotide microarray. The microarray was validated by examining the transcriptomic response in male largemouth bass exposed to 17β-œstradiol. Transcriptomic responses were assessed in liver and gonad, and indicated gene expression profiles typical of exposure to œstradiol. The results demonstrate the potential to rapidly create the tools necessary to assess large scale transcriptional responses in non-model species, paving the way for expanded impact of toxicogenomics in ecotoxicology. PMID:19936325

  2. Micronuclear DNA of Oxytricha nova contains sequences with autonomously replicating activity in Saccharomyces cerevisiae.

    PubMed Central

    Colombo, M M; Swanton, M T; Donini, P; Prescott, D M

    1984-01-01

    Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934

  3. AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

    PubMed

    Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

    2016-03-01

    Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in Excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies. © 2015 John Wiley & Sons Ltd.
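
    The three steps named in this abstract (demultiplexing, unique sequence clustering, erroneous sequence filtering) can be sketched in a few lines. This is only an illustration of the general idea and not the AMPLISAS implementation: the function names, the exact-prefix barcode matching, and the `min_fraction` threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def demultiplex(reads, barcodes):
    """Step (i): assign each read to a sample by its leading barcode,
    stripping the barcode. Reads with no matching barcode are dropped."""
    by_sample = defaultdict(list)
    for read in reads:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode):
                by_sample[sample].append(read[len(barcode):])
                break
    return by_sample


def unique_sequences(seqs):
    """Step (ii): collapse identical sequences and count their depth."""
    return Counter(seqs)


def filter_low_frequency(counts, min_fraction):
    """Step (iii): discard variants below a per-sample frequency cutoff
    (a stand-in for real error filtering; the cutoff is hypothetical)."""
    total = sum(counts.values())
    return {s: n for s, n in counts.items() if n / total >= min_fraction}


def genotype(reads, barcodes, min_fraction=0.1):
    """Run the three steps and return putative alleles per sample."""
    return {
        sample: filter_low_frequency(unique_sequences(seqs), min_fraction)
        for sample, seqs in demultiplex(reads, barcodes).items()
    }


# Toy data: two samples, one low-frequency variant, one unassigned read.
barcodes = {"s1": "AAAA", "s2": "CCCC"}
reads = (["AAAAGATTACA"] * 6 + ["AAAAGATTTCA"] * 3 + ["AAAAGATAACA"]
         + ["CCCCGGGTTT"] * 4 + ["TTTTGGG"])
alleles = genotype(reads, barcodes, min_fraction=0.2)
```

    A production tool would additionally tolerate barcode mismatches and merge near-identical variants before filtering, which a fixed frequency cutoff cannot do.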

  4. Integrated databanks access and sequence/structure analysis services at the PBIL.

    PubMed

    Perrière, Guy; Combet, Christophe; Penel, Simon; Blanchet, Christophe; Thioulouse, Jean; Geourjon, Christophe; Grassot, Julien; Charavay, Céline; Gouy, Manolo; Duret, Laurent; Deléage, Gilbert

    2003-07-01

    The World Wide Web server of the PBIL (Pôle Bioinformatique Lyonnais) provides on-line access to sequence databanks and to many tools for nucleic acid and protein sequence analysis. This server allows users to query nucleotide sequence banks in the EMBL and GenBank formats and protein sequence banks in the SWISS-PROT and PIR formats. The query engine on which our data bank access is based is the ACNUC system. It makes it possible to build complex queries to access functional zones of biological interest and to retrieve large sequence sets. Of special interest are the unique features provided by this system to query the data banks of gene families developed at the PBIL. The server also provides access to a wide range of sequence analysis methods: similarity search programs, multiple alignments, protein structure prediction and multivariate statistics. A distinctive feature of this server is the integration of these two aspects: sequence retrieval and sequence analysis. Indeed, thanks to the introduction of re-usable lists, it is possible to process large sets of data. The PBIL server can be reached at: http://pbil.univ-lyon1.fr.

  5. Software Reviews: "Pow! Zap! Ker-plunk! The Comic Book Maker" (Pelican Software).

    ERIC Educational Resources Information Center

    Porter, Bernajean

    1990-01-01

    Reviews the newest addition to Pelican's Creative Writing Series of instructional software, which uses the comic book format to provide a unique writing environment for satire, symbolism, sequencing, and combining text and graphics to communicate ideas. (SR)

  6. Analysis of xylem formation in pine by cDNA sequencing

    NASA Technical Reports Server (NTRS)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

  7. enoLOGOS: a versatile web tool for energy normalized sequence logos

    PubMed Central

    Workman, Christopher T.; Yin, Yutong; Corcoran, David L.; Ideker, Trey; Stormo, Gary D.; Benos, Panayiotis V.

    2005-01-01

    enoLOGOS is a web-based tool that generates sequence logos from various input sources. Sequence logos have become a popular way to graphically represent DNA and amino acid sequence patterns from a set of aligned sequences. Each position of the alignment is represented by a column of stacked symbols with its total height reflecting the information content in this position. Currently, the available web servers are able to create logo images from a set of aligned sequences, but none of them generates weighted sequence logos directly from energy measurements or other sources. With the advent of high-throughput technologies for estimating the contact energy of different DNA sequences, tools that can create logos directly from binding affinity data are useful to researchers. enoLOGOS generates sequence logos from a variety of input data, including energy measurements, probability matrices, alignment matrices, count matrices and aligned sequences. Furthermore, enoLOGOS can represent the mutual information of different positions of the consensus sequence, a unique feature of this tool. Another web interface for our software, C2H2-enoLOGOS, generates logos for the DNA-binding preferences of the C2H2 zinc-finger transcription factor family members. enoLOGOS and C2H2-enoLOGOS are accessible over the web at . PMID:15980495
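
    The statement that a logo column's total height reflects the information content at that position can be made concrete with the classic Shannon formulation for aligned DNA: information = log2(4) minus the column entropy, with each symbol drawn at height frequency × information. The sketch below covers only that unweighted case, without small-sample correction; it is not the energy-normalized method of enoLOGOS, and the function name and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(aligned, alphabet="ACGT"):
    """For each alignment column, return (information in bits, symbol heights).

    Information is log2(len(alphabet)) minus the Shannon entropy of the
    observed symbol frequencies; a symbol's height is frequency * information.
    """
    max_bits = math.log2(len(alphabet))
    result = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = sum(counts[s] for s in alphabet)
        freqs = {s: counts[s] / total for s in alphabet}
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        info = max_bits - entropy
        heights = {s: p * info for s, p in freqs.items()}
        result.append((info, heights))
    return result


# Toy alignment: column 1 is fully conserved, column 4 is uniform.
logo = column_information(["ACGT", "ACGA", "ACTC", "AAGG"])
```

    A fully conserved column carries the maximum 2 bits and a uniformly mixed column carries 0 bits, so the stack height drops as a position becomes more variable.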

  8. Sources of PCR-induced distortions in high-throughput sequencing data sets

    PubMed Central

    Kebschull, Justus M.; Zador, Anthony M.

    2015-01-01

    PCR permits the exponential and sequence-specific amplification of DNA, even from minute starting quantities. PCR is a fundamental step in preparing DNA samples for high-throughput sequencing. However, there are errors associated with PCR-mediated amplification. Here we examine the effects of four important sources of error—bias, stochasticity, template switches and polymerase errors—on sequence representation in low-input next-generation sequencing libraries. We designed a pool of diverse PCR amplicons with a defined structure, and then used Illumina sequencing to search for signatures of each process. We further developed quantitative models for each process, and compared predictions of these models to our experimental data. We find that PCR stochasticity is the major force skewing sequence representation after amplification of a pool of unique DNA amplicons. Polymerase errors become very common in later cycles of PCR but have little impact on the overall sequence distribution as they are confined to small copy numbers. PCR template switches are rare and confined to low copy numbers. Our results provide a theoretical basis for removing distortions from high-throughput sequencing data. In addition, our findings on PCR stochasticity will have particular relevance to quantification of results from single cell sequencing, in which sequences are represented by only one or a few molecules. PMID:26187991

  9. New insights into the paleolake sequence of Baumkirchen (Austria): multiple lake phases and a minor ice advance during MIS 4?

    NASA Astrophysics Data System (ADS)

    Barrett, Samuel; Starnberger, Reinhard; Spötl, Christoph; Brauer, Achim; Tjallingii, Rik; Dulski, Peter; Abfalterer, Christof

    2015-04-01

    The sequence of pre-LGM lacustrine sediments at Baumkirchen (Austria) provides a key record in Alpine Quaternary stratigraphy. These sediments from within the boundary of the Alps potentially provide unique insights into the regional paleoclimate. Recent drilling revealed at least ~250m (the base was not reached) of almost entirely mm- to cm-scale lacustrine sediments. The laminated sediments are comprised of alternations between clayey silt and event layers of medium silt to fine sand. The sequence is interrupted only by a short section of gravel supported in an unlaminated clay-rich matrix. Optically stimulated luminescence dating identifies two distinct sequences: the upper sequence spanning mid-late Marine Isotope Stage (MIS) 3 (~33 to ~45 ka BP), agreeing with existing calibrated radiocarbon ages, and the lower section dating to MIS 4 (~59 to ~73 ka BP). Whether the hiatus is an erosional unconformity, or if the sequences represent two separate lake phases is unclear. Although the precise location of the hiatus is hard to identify, the gravel-rich section lies at the very top of the lower sequence. Pebbles in these gravels are largely angular and contain a significant proportion of non-local, regional lithologies. Such gravels are absent in the remainder of the entire 250 m-thick sequence and hence suggest a unique event rather than e.g. an interfingering of local delta gravel foresets with the basin sediments. The gravels are therefore likely to be ice-rafted debris from icebergs from nearby glaciers calving into the lake. This therefore represents the first sedimentological evidence of a MIS 4 ice advance in the Eastern Alps. X-ray fluorescence analysis (ITRAX core scanning) of event layers indicates a strong change in the geochemical composition from generally K, Zr and Ti-rich layers in the upper sequence to mainly Ca and/or Si-rich layers in the lower sequence. 
X-ray diffraction analysis shows the Ca and Si signals to be controlled by carbonate (both calcite and dolomite) and quartz, respectively. This suggests a change in dominant sediment source and may indicate a change in catchment or paleolake configuration, re-raising the long outstanding question of how the lake or lakes were dammed.

  10. Piscine reovirus: Genomic and molecular phylogenetic analysis from farmed and wild salmonids collected on the Canada/US Pacific Coast

    USGS Publications Warehouse

    Siah, Ahmed; Morrison, Diane B.; Fringuelli, Elena; Savage, Paul S.; Richmond, Zina; Purcell, Maureen K.; Johns, Robert; Johnson, Stewart C.; Sakasida, Sonja M.

    2015-01-01

    Piscine reovirus (PRV) is a double stranded non-enveloped RNA virus detected in farmed and wild salmonids. This study examined the phylogenetic relationships among different PRV sequence types present in samples from salmonids in Western Canada and the US, including Alaska (US), British Columbia (Canada) and Washington State (US). Tissues testing positive for PRV were partially sequenced for segment S1, producing 71 sequences that grouped into 10 unique sequence types. Sequence analysis revealed no identifiable geographical or temporal variation among the sequence types. Identical sequence types were found in fish sampled in 2001, 2005 and 2014. In addition, PRV positive samples from fish derived from Alaska, British Columbia and Washington State share identical sequence types. Comparative analysis of the phylogenetic tree indicated that Canada/US Pacific Northwest sequences formed a subgroup with some Norwegian sequence types (group II), distinct from other Norwegian and Chilean sequences (groups I, III and IV). Representative PRV positive samples from farmed and wild fish in British Columbia and Washington State were subjected to genome sequencing using next generation sequencing methods. Individual analysis of each of the 10 partial segments indicated that the Canadian and US PRV sequence types clustered separately from available whole genome sequences of some Norwegian and Chilean sequences for all segments except the segment S4. In summary, PRV was genetically homogenous over a large geographic distance (Alaska to Washington State), and the sequence types were relatively stable over a 13 year period.

  11. Piscine Reovirus: Genomic and Molecular Phylogenetic Analysis from Farmed and Wild Salmonids Collected on the Canada/US Pacific Coast

    PubMed Central

    Siah, Ahmed; Morrison, Diane B.; Fringuelli, Elena; Savage, Paul; Richmond, Zina; Johns, Robert; Purcell, Maureen K.; Johnson, Stewart C.; Saksida, Sonja M.

    2015-01-01

    Piscine reovirus (PRV) is a double stranded non-enveloped RNA virus detected in farmed and wild salmonids. This study examined the phylogenetic relationships among different PRV sequence types present in samples from salmonids in Western Canada and the US, including Alaska (US), British Columbia (Canada) and Washington State (US). Tissues testing positive for PRV were partially sequenced for segment S1, producing 71 sequences that grouped into 10 unique sequence types. Sequence analysis revealed no identifiable geographical or temporal variation among the sequence types. Identical sequence types were found in fish sampled in 2001, 2005 and 2014. In addition, PRV positive samples from fish derived from Alaska, British Columbia and Washington State share identical sequence types. Comparative analysis of the phylogenetic tree indicated that Canada/US Pacific Northwest sequences formed a subgroup with some Norwegian sequence types (group II), distinct from other Norwegian and Chilean sequences (groups I, III and IV). Representative PRV positive samples from farmed and wild fish in British Columbia and Washington State were subjected to genome sequencing using next generation sequencing methods. Individual analysis of each of the 10 partial segments indicated that the Canadian and US PRV sequence types clustered separately from available whole genome sequences of some Norwegian and Chilean sequences for all segments except the segment S4. In summary, PRV was genetically homogenous over a large geographic distance (Alaska to Washington State), and the sequence types were relatively stable over a 13 year period. PMID:26536673

  12. Sequences downstream of AAUAAA signals affect pre-mRNA cleavage and polyadenylation in vitro both directly and indirectly.

    PubMed Central

    Ryner, L C; Takagaki, Y; Manley, J L

    1989-01-01

    To investigate the role of sequences lying downstream of the conserved AAUAAA hexanucleotide in pre-mRNA cleavage and polyadenylation, deletions or substitutions were constructed in polyadenylation signals from simian virus 40 and adenovirus, and their effects were assayed in both crude and fractionated HeLa cell nuclear extracts. As expected, these sequences influenced the efficiency of both cleavage and polyadenylation as well as the accuracy of the cleavage reaction. Sequences near or upstream of the actual site of poly(A) addition appeared to specify a unique cleavage site, since their deletion resulted, in some cases, in heterogeneous cleavage. Furthermore, the sequences that allowed the simian virus 40 late pre-RNA to be cleaved preferentially by partially purified cleavage activity were also those at the cleavage site itself. Interestingly, sequences downstream of the cleavage site interacted with factors not directly involved in catalyzing cleavage and polyadenylation, since the effects of deletions were substantially diminished when partially purified components were used in assays. In addition, these sequences contained elements that could affect 3'-end formation both positively and negatively. PMID:2566911

  13. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    PubMed

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.

  14. Enabling multiplexed testing of pooled donor cells through whole-genome sequencing.

    PubMed

    Chan, Yingleong; Chan, Ying Kai; Goodman, Daniel B; Guo, Xiaoge; Chavez, Alejandro; Lim, Elaine T; Church, George M

    2018-04-19

    We describe a method that enables the multiplex screening of a pool of many different donor cell lines. Our method accurately predicts each donor proportion from the pool without requiring the use of unique DNA barcodes as markers of donor identity. Instead, we take advantage of common single nucleotide polymorphisms, whole-genome sequencing, and an algorithm to calculate the proportions from the sequencing data. By testing using simulated and real data, we showed that our method robustly predicts the individual proportions from a mixed pool of numerous donors, thus enabling the multiplexed testing of diverse donor cells en masse. More information is available at https://pgpresearch.med.harvard.edu/poolseq/.

  15. MR imaging of breast implants.

    PubMed

    Gorczyca, D P

    1994-11-01

    MR imaging has proved to be an excellent imaging modality in locating free silicone and evaluating an implant for rupture, with a sensitivity of approximately 94% and specificity of 97%. Silicone has a unique MR resonance frequency and long T1 and T2 relaxation times, which allows several MR sequences to provide excellent diagnostic images. The most commonly used sequences include T2-weighted, STIR, and chemical shift imaging (Figs. 3, 13, and 14). The T2-weighted and STIR sequences are often used in conjunction with chemical water suppression. The most reliable findings on MR images for detection of implant rupture include identification of the collapsed implant shell (linguine sign) and free silicone within the breast parenchyma.

  16. 2D nanomaterials assembled from sequence-defined molecules

    DOE PAGES

    Mu, Peng; Zhou, Guangwen; Chen, Chun-Long

    2017-10-21

    Two dimensional (2D) nanomaterials have attracted broad interest owing to their unique physical and chemical properties with potential applications in electronics, chemistry, biology, medicine and pharmaceutics. Due to the current limitations of traditional 2D nanomaterials (e.g., graphene and graphene oxide) in tuning surface chemistry and compositions, 2D nanomaterials assembled from sequence-defined molecules (e.g., DNAs, proteins, peptides and peptoids) have recently been developed. They represent an emerging class of 2D nanomaterials with attractive physical and chemical properties. Here, we summarize the recent progress in the synthesis and applications of this type of sequence-defined 2D nanomaterials. We also discuss the challenges and opportunities in this new field.

  17. 2D nanomaterials assembled from sequence-defined molecules

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mu, Peng; Zhou, Guangwen; Chen, Chun-Long

    Two dimensional (2D) nanomaterials have attracted broad interest owing to their unique physical and chemical properties with potential applications in electronics, chemistry, biology, medicine and pharmaceutics. Due to the current limitations of traditional 2D nanomaterials (e.g., graphene and graphene oxide) in tuning surface chemistry and compositions, 2D nanomaterials assembled from sequence-defined molecules (e.g., DNAs, proteins, peptides and peptoids) have recently been developed. They represent an emerging class of 2D nanomaterials with attractive physical and chemical properties. Here, we summarize the recent progress in the synthesis and applications of this type of sequence-defined 2D nanomaterials. We also discuss the challenges and opportunities in this new field.

  18. Phylogenetic Relationships and Species Delimitation in Pinus Section Trifoliae Inferrred from Plastid DNA

    PubMed Central

    Hernández-León, Sergio; Gernandt, David S.; Pérez de la Rosa, Jorge A.; Jardón-Barbolla, Lev

    2013-01-01

    Recent diversification followed by secondary contact and hybridization may explain complex patterns of intra- and interspecific morphological and genetic variation in the North American hard pines (Pinus section Trifoliae), a group of approximately 49 tree species distributed in North and Central America and the Caribbean islands. We concatenated five plastid DNA markers for an average of 3.9 individuals per putative species and assessed the suitability of the five regions as DNA bar codes for species identification, species delimitation, and phylogenetic reconstruction. The ycf1 gene accounted for the greatest proportion of the alignment (46.9%), the greatest proportion of variable sites (74.9%), and the most unique sequences (75 haplotypes). Phylogenetic analysis recovered clades corresponding to subsections Australes, Contortae, and Ponderosae. Sequences for 23 of the 49 species were monophyletic and sequences for another 9 species were paraphyletic. Morphologically similar species within subsections usually grouped together, but there were exceptions consistent with incomplete lineage sorting or introgression. Bayesian relaxed molecular clock analyses indicated that all three subsections diversified relatively recently during the Miocene. The general mixed Yule-coalescent method gave a mixed model estimate of only 22 or 23 evolutionary entities for the plastid sequences, which corresponds to less than half the 49 species recognized based on morphological species assignments. Including more unique haplotypes per species may result in higher estimates, but low mutation rates, recent diversification, and large effective population sizes may limit the effectiveness of this method to detect evolutionary entities. PMID:23936218

  19. Sequence organization and evolutionary dynamics of Brachypodium-specific centromere retrotransposons.

    PubMed

    Qi, L L; Wu, J J; Friebe, B; Qian, C; Gu, Y Q; Fu, D L; Gill, B S

    2013-08-01

    Brachypodium distachyon is a wild annual grass belonging to the Pooideae, more closely related to wheat, barley, and forage grasses than rice and maize. As an experimental model, the completed genome sequence of B. distachyon provides a unique opportunity to study centromere evolution during the speciation of grasses. Centromeric satellite sequences have been identified in B. distachyon, but little is known about centromeric retrotransposons in this species. In the present study, bacterial artificial chromosome (BAC)-fluorescence in situ hybridization was conducted in maize, rice, barley, wheat, and rye using B. distachyon (Bd) centromere-specific BAC clones. Eight Bd centromeric BAC clones gave no detectable fluorescence in situ hybridization (FISH) signals on the chromosomes of rice and maize, and three of them also did not yield any FISH signals in barley, wheat, and rye. In addition, four of five Triticeae centromeric BAC clones did not hybridize to the B. distachyon centromeres, implying certain unique features of Brachypodium centromeres. Analysis of Brachypodium centromeric BAC sequences identified a long terminal repeat (LTR)-centromere retrotransposon of B. distachyon (CRBd1). This element was found in high copy number accounting for 1.6 % of the B. distachyon genome, and is enriched in Brachypodium centromeric regions. CRBd1 accumulated in active centromeres, but was lost from inactive ones. The LTR of CRBd1 appears to be specific to B. distachyon centromeres. These results reveal different evolutionary events of this retrotransposon family across grass species.

  20. The Nostoc punctiforme Genome

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    John C. Meeks

    2001-12-31

    Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.

  1. zUMIs - A fast and flexible pipeline to process RNA sequencing data with UMIs.

    PubMed

    Parekh, Swati; Ziegenhain, Christoph; Vieth, Beate; Enard, Wolfgang; Hellmann, Ines

    2018-06-01

    Single-cell RNA-sequencing (scRNA-seq) experiments typically analyze hundreds or thousands of cells after amplification of the cDNA. The high throughput is made possible by the early introduction of sample-specific bar codes (BCs), and the amplification bias is alleviated by unique molecular identifiers (UMIs). Thus, the ideal analysis pipeline for scRNA-seq data needs to efficiently tabulate reads according to both BC and UMI. zUMIs is a pipeline that can handle both known and random BCs and also efficiently collapse UMIs, either just for exon mapping reads or for both exon and intron mapping reads. If BC annotation is missing, zUMIs can accurately detect intact cells from the distribution of sequencing reads. Another unique feature of zUMIs is the adaptive downsampling function that facilitates dealing with hugely varying library sizes but also allows the user to evaluate whether the library has been sequenced to saturation. To illustrate the utility of zUMIs, we analyzed a single-nucleus RNA-seq dataset and show that more than 35% of all reads map to introns. Also, we show that these intronic reads are informative about expression levels, significantly increasing the number of detected genes and improving the cluster resolution. zUMIs' flexibility makes it possible to accommodate data generated with any of the major scRNA-seq protocols that use BCs and UMIs and is the most feature-rich, fast, and user-friendly pipeline to process such scRNA-seq data.
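
    The tabulation step described in this abstract, counting reads by both bar code and UMI, can be illustrated with a minimal sketch. This is not the zUMIs implementation: the function name `collapse_umis`, the tuple layout, and the toy reads are assumptions made purely for illustration, and real pipelines typically also correct for sequencing errors in BCs and UMIs, which this sketch does not.

```python
from collections import defaultdict


def collapse_umis(reads):
    """Count distinct UMIs per (cell barcode, gene).

    `reads` is an iterable of (barcode, umi, gene) tuples, one per mapped read.
    Reads sharing barcode, gene, and UMI are treated as amplification
    duplicates and counted once. Exact matching only (an assumption).
    """
    seen = defaultdict(set)
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)

    counts = {}
    for (barcode, gene), umis in seen.items():
        counts.setdefault(barcode, {})[gene] = len(umis)
    return counts


# Hypothetical toy input: five reads from two cells.
reads = [
    ("AAAC", "TTG", "GeneA"),
    ("AAAC", "TTG", "GeneA"),  # same BC, gene, and UMI: a duplicate
    ("AAAC", "CCA", "GeneA"),
    ("AAAC", "TTG", "GeneB"),
    ("GGGT", "TTG", "GeneA"),
]
umi_counts = collapse_umis(reads)
print(umi_counts)
```

    Keeping a set of UMIs per (barcode, gene) pair means duplicates are discarded as they arrive, so memory scales with the number of distinct molecules and not with the number of reads.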

  2. Phylogenetic relationships and species delimitation in pinus section trifoliae inferrred from plastid DNA.

    PubMed

    Hernández-León, Sergio; Gernandt, David S; Pérez de la Rosa, Jorge A; Jardón-Barbolla, Lev

    2013-01-01

    Recent diversification followed by secondary contact and hybridization may explain complex patterns of intra- and interspecific morphological and genetic variation in the North American hard pines (Pinus section Trifoliae), a group of approximately 49 tree species distributed in North and Central America and the Caribbean islands. We concatenated five plastid DNA markers for an average of 3.9 individuals per putative species and assessed the suitability of the five regions as DNA bar codes for species identification, species delimitation, and phylogenetic reconstruction. The ycf1 gene accounted for the greatest proportion of the alignment (46.9%), the greatest proportion of variable sites (74.9%), and the most unique sequences (75 haplotypes). Phylogenetic analysis recovered clades corresponding to subsections Australes, Contortae, and Ponderosae. Sequences for 23 of the 49 species were monophyletic and sequences for another 9 species were paraphyletic. Morphologically similar species within subsections usually grouped together, but there were exceptions consistent with incomplete lineage sorting or introgression. Bayesian relaxed molecular clock analyses indicated that all three subsections diversified relatively recently during the Miocene. The general mixed Yule-coalescent method gave a mixed model estimate of only 22 or 23 evolutionary entities for the plastid sequences, which corresponds to less than half the 49 species recognized based on morphological species assignments. Including more unique haplotypes per species may result in higher estimates, but low mutation rates, recent diversification, and large effective population sizes may limit the effectiveness of this method to detect evolutionary entities.

  3. Assessing host-specificity of Escherichia coli using a supervised learning logic-regression-based analysis of single nucleotide polymorphisms in intergenic regions.

    PubMed

    Zhi, Shuai; Li, Qiaozhi; Yasui, Yutaka; Edge, Thomas; Topp, Edward; Neumann, Norman F

    2015-11-01

    Host specificity in E. coli is widely debated. Herein, we used supervised learning logic-regression-based analysis of intergenic DNA sequence variability in E. coli in an attempt to identify single nucleotide polymorphism (SNP) biomarkers of E. coli that are associated with natural selection and evolution toward host specificity. Seven hundred and eighty strains of E. coli were isolated from 15 different animal hosts. We utilized logic regression for analyzing DNA sequence data of three intergenic regions (flanked by the genes uspC-flhDC, csgBAC-csgDEFG, and asnS-ompF) to identify genetic biomarkers that could potentially discriminate E. coli based on host sources. Across 15 different animal hosts, logic regression successfully discriminated E. coli based on animal host source with relatively high specificity (i.e., among the samples of the non-target animal host, the proportion that correctly did not have the host-specific marker pattern) and sensitivity (i.e., among the samples from a given animal host, the proportion that correctly had the host-specific marker pattern), even after fivefold cross validation. Permutation tests confirmed that for most animals, host-specific intergenic biomarkers identified by logic regression in E. coli were significantly associated with animal host source. The highest level of biomarker sensitivity was observed in deer isolates, with 82% of all deer E. coli isolates displaying a unique SNP pattern that was 98% specific to deer. Fifty-three percent of human isolates displayed a unique biomarker pattern that was 98% specific to humans. Twenty-nine percent of cattle isolates displayed a unique biomarker that was 97% specific to cattle. Interestingly, even within a related host group (i.e., Family: Canidae [domestic dogs and coyotes]), highly specific SNP biomarkers (98% and 99% specificity for dog and coyotes, respectively) were observed, with 21% of dog E. coli isolates displaying a unique dog biomarker and 61% of coyote isolates displaying a unique coyote biomarker. Application of a supervised learning method, such as logic regression, to DNA sequence analysis at certain intergenic regions demonstrates that some E. coli strains may evolve to become host-specific. Copyright © 2015 Elsevier Inc. All rights reserved.
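
    The sensitivity and specificity definitions given in this abstract can be written out as a short calculation. The sketch below is only an illustration of those two definitions: the function name and the isolate counts are hypothetical and are not data from the study.

```python
def sensitivity_specificity(has_marker, is_target_host):
    """Return (sensitivity, specificity) for one host-specific marker.

    has_marker[i]     -- True if isolate i shows the marker pattern
    is_target_host[i] -- True if isolate i came from the target host
    """
    tp = fn = tn = fp = 0
    for marker, target in zip(has_marker, is_target_host):
        if target and marker:
            tp += 1  # target-host isolate that has the marker
        elif target:
            fn += 1  # target-host isolate that lacks the marker
        elif marker:
            fp += 1  # non-target isolate that has the marker
        else:
            tn += 1  # non-target isolate that correctly lacks the marker
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts: 10 target-host isolates (8 with the marker)
# and 50 non-target isolates (1 with the marker).
has_marker = [True] * 8 + [False] * 2 + [True] * 1 + [False] * 49
is_target_host = [True] * 10 + [False] * 50
sens, spec = sensitivity_specificity(has_marker, is_target_host)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    A marker can therefore be highly specific yet only moderately sensitive, as in the dog isolates reported above: few non-dog isolates carry it, but most dog isolates do not carry it either.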

  4. Core Proteomic Analysis of Unique Metabolic Pathways of Salmonella enterica for the Identification of Potential Drug Targets.

    PubMed

    Uddin, Reaz; Sufian, Muhammad

    2016-01-01

    Infections caused by Salmonella enterica, a Gram-negative facultative anaerobic bacterium belonging to the family Enterobacteriaceae, are major threats to the health of humans and animals. The recent availability of complete genome data for pathogenic strains of S. enterica opens new avenues for the identification of drug targets and drug candidates. We used genomic and metabolic pathway data to identify pathways and proteins essential to the pathogen and absent from the host. We took the whole proteome sequence data of 42 strains of S. enterica and Homo sapiens along with KEGG-annotated metabolic pathway data, clustered protein sequences using CD-HIT, identified essential genes using the DEG database, discarded S. enterica homologs of human proteins in unique metabolic pathways (UMPs), and characterized hypothetical proteins with SVM-prot and InterProScan. Through this core proteomic analysis we identified enzymes essential to the pathogen. The identification of 73 enzymes common to 42 strains of S. enterica is the real strength of the current study. We propose all 73 unexplored enzymes as potential drug targets against infections caused by S. enterica. The study is comprehensive for S. enterica and simultaneously considered every possible pathogenic strain of S. enterica. This comprehensiveness makes the current study significant since, to the best of our knowledge, it is the first subtractive core proteomic analysis of unique metabolic pathways applied to any pathogen for the identification of drug targets. We applied extensive computational methods to shortlist a few potential drug targets based on druggability criteria, e.g., being non-homologous to the human host, essential to the pathogen, and playing a significant role in essential metabolic pathways of the pathogen (i.e., S. enterica). In the current study, subtractive proteomics was applied through a novel approach, i.e., by considering only proteins of the unique metabolic pathways of the pathogen and mining the proteomic data of all completely sequenced strains of the pathogen, thus improving the quality and applicability of the results. We believe that sharing the knowledge from this study will eventually lead to novel and unique therapeutic regimens against infections caused by S. enterica.

  5. How to Fabricate Functional Artificial Luciferases for Bioassays.

    PubMed

    Kim, Sung-Bae; Fujii, Rika

    2016-01-01

    The present protocol introduces the fabrication of artificial luciferases (ALuc(®)) by extracting the consensus amino acids from an alignment of copepod luciferase sequences. The resulting ALucs have unique sequence identities that are phylogenetically distinct from those of any existing copepod luciferase. Some ALucs exhibited heat stability and strong, greatly prolonged optical intensities. The ALucs are applicable to various bioassays as an optical readout, including live cell imaging, single-chain probes, and bioluminescent tags of antibodies. The present protocol explains how to fabricate a unique artificial luciferase with designed optical properties and functionalities.

  6. Unique BK virus non-coding control region (NCCR) variants in hematopoietic stem cell transplant recipients with and without hemorrhagic cystitis.

    PubMed

    Carr, Michael J; McCormack, Grace P; Mutton, Ken J; Crowley, Brendan

    2006-04-01

    Hematopoietic stem cell transplant recipients frequently develop BK virus (BKV)-associated hemorrhagic cystitis, which coincides with BK viruria. However, the precise role of BKV in the etiology of hemorrhagic cystitis in hematopoietic stem cell transplant recipients remains unclear, since approximately 50% of all such adult transplant recipients excrete BKV, yet do not develop this clinical condition. In the present study, BKV were analyzed to determine if mutations in the non-coding control region (NCCR), and specific BKV sub-types defined by sequence analysis of major capsid protein VP1, were associated with development of hemorrhagic cystitis in hematopoietic stem cell transplant recipients. The regions encoding VP1 and NCCRs of BKV in urine samples collected from 15 hematopoietic stem cell transplant recipients with hemorrhagic cystitis and 20 without this illness were amplified and sequenced. Sequence variations in the NCCRs of BKV were identified in urine samples from those with and without hemorrhagic cystitis. Furthermore, five unique sequence variations within transcription factor binding sites in the canonical NCCR, O-P-Q-R-S, were identified, representing new BKV variants from a population of cloned quasi-species obtained from patients with and without hemorrhagic cystitis. Thirty-five BKV VP1 sequences were analyzed by phylogenetic analysis but no specific BKV sub-type was associated with hemorrhagic cystitis. Five previously unrecognized naturally occurring variants of the BKV are described which involve amplifications, deletions, and rearrangements of the archetypal BKV NCCRs in individuals with and without hemorrhagic cystitis. Architectural rearrangements in the NCCRs of BKV did not appear to be a prerequisite for development of hemorrhagic cystitis in hematopoietic stem cell transplant recipients. Copyright 2006 Wiley-Liss, Inc.

  7. Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins

    PubMed Central

    Turnbaugh, Peter J.; Quince, Christopher; Faith, Jeremiah J.; McHardy, Alice C.; Yatsunenko, Tanya; Niazi, Faheem; Affourtit, Jason; Egholm, Michael; Henrissat, Bernard; Knight, Rob; Gordon, Jeffrey I.

    2010-01-01

    We deeply sampled the organismal, genetic, and transcriptional diversity in fecal samples collected from a monozygotic (MZ) twin pair and compared the results to 1,095 communities from the gut and other body habitats of related and unrelated individuals. Using a new scheme for noise reduction in pyrosequencing data, we estimated the total diversity of species-level bacterial phylotypes in the 1.2-1.5 million bacterial 16S rRNA reads obtained from each deeply sampled cotwin to be ~800 (35.9%, 49.1% detected in both). A combined 1.1 million read 16S rRNA dataset representing 281 shallowly sequenced fecal samples from 54 twin pairs and their mothers contained an estimated 4,018 species-level phylotypes, with each sample having a unique species assemblage (53.4 ± 0.6% and 50.3 ± 0.5% overlap with the deeply sampled cotwins). Of the 134 phylotypes with a relative abundance of >0.1% in the combined dataset, only 37 appeared in >50% of the samples, with one phylotype in the Lachnospiraceae family present in 99%. Nongut communities had significantly reduced overlap with the deeply sequenced twins’ fecal microbiota (18.3 ± 0.3%, 15.3 ± 0.3%). The MZ cotwins’ fecal DNA was deeply sequenced (3.8-6.3 Gbp/sample) and assembled reads were assigned to 25 genus-level phylogenetic bins. Only 17% of the genes in these bins were shared between the cotwins. Bins exhibited differences in their degree of sequence variation, gene content including the repertoire of carbohydrate active enzymes present within and between twins (e.g., predicted cellulases, dockerins), and transcriptional activities. These results provide an expanded perspective about features that make each of us unique life forms and directions for future characterization of our gut ecosystems. PMID:20363958

  8. NASBA: A detection and amplification system uniquely suited for RNA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sooknanan, R.; Malek, L.T.

    1995-06-01

    The invention of PCR (polymerase chain reaction) has revolutionized our ability to amplify and manipulate a nucleic acid sequence in vitro. The commercial rewards of this revolution have driven the development of other nucleic acid amplification and detection methodologies. This has created an alphabet soup of technologies that use different amplification methods, including NASBA (nucleic acid sequence-based amplification), LCR (ligase chain reaction), SDA (strand displacement amplification), QBR (Q-beta replicase), CPR (cycling probe reaction), and bDNA (branched DNA). Despite the differences in their processes, these amplification systems can be separated into two broad categories based on how they achieve their goal: sequence-based amplification systems, such as PCR, NASBA, and SDA, amplify a target nucleic acid sequence. Signal-based amplification systems, such as LCR, QBR, CPR and bDNA, amplify or alter a signal from a detection reaction that is target-dependent. While the various methods have relative strengths and weaknesses, only NASBA offers the unique ability to homogeneously amplify an RNA analyte in the presence of homologous genomic DNA under isothermal conditions. Since the detection of RNA sequences almost invariably measures biological activity, it is an excellent prognostic indicator of activities as diverse as virus production, gene expression, and cell viability. The isothermal nature of the reaction makes NASBA especially suitable for large-scale manual screening. These features extend NASBA's application range from research to commercial diagnostic applications. Field test kits are presently under development for human diagnostics as well as the burgeoning fields of food and environmental diagnostic testing. These developments suggest future integration of NASBA into robotic workstations for high-throughput screening as well. 17 refs., 1 tab.

  9. Comparative analyses of putative toxin gene homologs from an Old World viper, Daboia russelii

    PubMed Central

    Krishnan, Neeraja M.

    2017-01-01

    Availability of snake genome sequences has opened up exciting areas of research on comparative genomics and gene diversity. One of the challenges in studying snake genomes is the acquisition of biological material from live animals, especially from the venomous ones, making the process cumbersome and time-consuming. Here, we report comparative sequence analyses of putative toxin gene homologs from Russell’s viper (Daboia russelii) using whole-genome sequencing data obtained from shed skin. When compared with the major venom proteins in Russell’s viper studied previously, we found 45–100% sequence similarity between the venom proteins and their putative homologs in the skin. Additionally, comparative analyses of 20 putative toxin gene family homologs provided evidence of unique sequence motifs in nerve growth factor (NGF), platelet derived growth factor (PDGF), Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz BPTI), cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) and cysteine-rich secretory protein (CRISP). In those derived proteins, we identified V11 and T35 in the NGF domain; F23 and A29 in the PDGF domain; N69, K2 and A5 in the CAP domain; and Q17 in the CRISP domain to be responsible for differences in the largest pockets across the protein domain structures in crotalines, viperines and elapids from the in silico structure-based analysis. Similarly, residues F10, Y11 and E20 appear to play an important role in the protein structures across the Kunitz protein domain of viperids and elapids. Our study highlights the usefulness of shed skin in obtaining good quality high-molecular weight DNA for comparative genomic studies, and provides evidence towards the unique features and evolution of putative venom gene homologs in vipers. PMID:29230357

  10. De Novo RNA Sequencing and Transcriptome Analysis of Colletotrichum gloeosporioides ES026 Reveal Genes Related to Biosynthesis of Huperzine A

    PubMed Central

    Zhang, Xiangmei; Xia, Qianqian; Zhao, Xinmei; Ahn, Youngjoon; Ahmed, Nevin; Cosoveanu, Andreea; Wang, Mo; Wang, Jialu; Shu, Shaohua

    2015-01-01

    Huperzine A is important in the treatment of Alzheimer’s disease. There are major challenges for the mass production of huperzine A from plants due to the limited number of huperzine-A-producing plants, as well as the low content of huperzine A in these plants. Various endophytic fungi produce huperzine A. Colletotrichum gloeosporioides ES026 was previously isolated from a huperzine-A-producing plant Huperzia serrata, and this fungus also produces huperzine A. In this study, de novo RNA sequencing of C. gloeosporioides ES026 was carried out with an Illumina HiSeq2000. A total of 4,324,299,051 bp from 50,442,617 high-quality sequence reads of ES026 were obtained. These raw data were assembled into 24,998 unigenes, 40,536,684 residues and 19,790 genes. The majority of the unique sequences were assigned to corresponding putative functions based on BLAST searches of public databases. The molecular functions, biological processes and biochemical pathways of these unique sequences were determined using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) assignments. A gene encoding copper amine oxidase (CAO) (unigene 9322) was annotated for the conversion of cadaverine to 5-aminopentanal in the biosynthesis of huperzine A. This gene was also detected in the root, stem and leaf of H. serrata. Furthermore, a close relationship was observed between expression of the CAO gene (unigene 9322) and quantity of crude huperzine A extracted from ES026. Therefore, CAO might be involved in the biosynthesis of huperzine A and it most likely plays a key role in regulating the content of huperzine A in ES026. PMID:25799531

  11. Whole-Genome Sequencing for Detecting Antimicrobial Resistance in Nontyphoidal Salmonella.

    PubMed

    McDermott, Patrick F; Tyson, Gregory H; Kabera, Claudine; Chen, Yuansha; Li, Cong; Folster, Jason P; Ayers, Sherry L; Lam, Claudia; Tate, Heather P; Zhao, Shaohua

    2016-09-01

    Laboratory-based in vitro antimicrobial susceptibility testing is the foundation for guiding anti-infective therapy and monitoring antimicrobial resistance trends. We used whole-genome sequencing (WGS) technology to identify known antimicrobial resistance determinants among strains of nontyphoidal Salmonella and correlated these with susceptibility phenotypes to evaluate the utility of WGS for antimicrobial resistance surveillance. Six hundred forty Salmonella of 43 different serotypes were selected from among retail meat and human clinical isolates that were tested for susceptibility to 14 antimicrobials using broth microdilution. The MIC for each drug was used to categorize isolates as susceptible or resistant based on Clinical and Laboratory Standards Institute clinical breakpoints or National Antimicrobial Resistance Monitoring System (NARMS) consensus interpretive criteria. Each isolate was subjected to whole-genome shotgun sequencing, and resistance genes were identified from assembled sequences. A total of 65 unique resistance genes, plus mutations in two structural resistance loci, were identified. There were more unique resistance genes (n = 59) in the 104 human isolates than in the 536 retail meat isolates (n = 36). Overall, resistance genotypes and phenotypes correlated in 99.0% of cases. Correlations approached 100% for most classes of antibiotics but were lower for aminoglycosides and beta-lactams. We report the first finding of extended-spectrum β-lactamases (ESBLs) (blaCTX-M1 and blaSHV2a) in retail meat isolates of Salmonella in the United States. Whole-genome sequencing is an effective tool for predicting antibiotic resistance in nontyphoidal Salmonella, although the use of more appropriate surveillance breakpoints and increased knowledge of new resistance alleles will further improve correlations. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  12. Bifidobacterium animalis subsp. lactis ATCC 27673 Is a Genomically Unique Strain within Its Conserved Subspecies

    PubMed Central

    Loquasto, Joseph R.; Barrangou, Rodolphe; Dudley, Edward G.; Stahl, Buffy; Chen, Chun

    2013-01-01

    Many strains of Bifidobacterium animalis subsp. lactis are considered health-promoting probiotic microorganisms and are commonly formulated into fermented dairy foods. Analyses of previously sequenced genomes of B. animalis subsp. lactis have revealed little genetic diversity, suggesting that it is a monomorphic subspecies. However, during a multilocus sequence typing survey of Bifidobacterium, it was revealed that B. animalis subsp. lactis ATCC 27673 gave a profile distinct from that of the other strains of the subspecies. As part of an ongoing study designed to understand the genetic diversity of this subspecies, the genome of this strain was sequenced and compared to other sequenced genomes of B. animalis subsp. lactis and B. animalis subsp. animalis. The complete genome of ATCC 27673 was 1,963,012 bp, contained 1,616 genes and 4 rRNA operons, and had a G+C content of 61.55%. Comparative analyses revealed that the genome of ATCC 27673 contained six distinct genomic islands encoding 83 open reading frames not found in other strains of the same subspecies. In four islands, either phage or mobile genetic elements were identified. In island 6, a novel clustered regularly interspaced short palindromic repeat (CRISPR) locus which contained 81 unique spacers was identified. This type I-E CRISPR-cas system differs from the type I-C systems previously identified in this subspecies, representing the first identification of a different system in B. animalis subsp. lactis. This study revealed that ATCC 27673 is a strain of B. animalis subsp. lactis with novel genetic content and suggests that the lack of genetic variability observed is likely due to the repeated sequencing of a limited number of widely distributed commercial strains. PMID:23995933

  13. International Life Science Institute North America Cronobacter (Formerly Enterobacter sakazakii) isolate set.

    PubMed

    Ivy, Reid A; Farber, Jeffrey M; Pagotto, Franco; Wiedmann, Martin

    2013-01-01

    Foodborne pathogen isolate collections are important for the development of detection methods, for validation of intervention strategies, and to develop an understanding of pathogenesis and virulence. We have assembled a publicly available Cronobacter (formerly Enterobacter sakazakii) isolate set that consists of (i) 25 Cronobacter sakazakii isolates, (ii) two Cronobacter malonaticus isolates, (iii) one Cronobacter muytjensii isolate, which displays some atypical phenotypic characteristics, biochemical profiles, and colony color on selected differential media, and (iv) two nonclinical Enterobacter asburiae isolates, which show some phenotypic characteristics similar to those of Cronobacter spp. The set consists of human (n = 10), food (n = 11), and environmental (n = 9) isolates. Analysis of partial 16S rDNA sequence and seven-gene multilocus sequence typing data allowed for reliable identification of these isolates to species and identification of 14 isolates as sequence type 4, which had previously been shown to be the most common C. sakazakii sequence type associated with neonatal meningitis. Phenotypic characterization was carried out with API 20E and API 32E test strips and streaking on two selective chromogenic agars; isolates were also assessed for sorbitol fermentation and growth at 45°C. Although these strategies typically produced the same classification as sequence-based strategies, based on a panel of four biochemical tests, one C. sakazakii isolate yielded inconclusive data and one was classified as C. malonaticus. EcoRI automated ribotyping and pulsed-field gel electrophoresis (PFGE) with XbaI separated the set into 23 unique ribotypes and 30 unique PFGE types, respectively, indicating subtype diversity within the set. Subtype and source data for the collection are publicly available in the PathogenTracker database (www.pathogentracker.net), which allows for continuous updating of information on the set, including links to publications that include information on isolates from this collection.

  14. The mitochondrial genome sequences of the round goby and the sand goby reveal patterns of recent evolution in gobiid fish.

    PubMed

    Adrian-Kalchhauser, Irene; Svensson, Ola; Kutschera, Verena E; Alm Rosenblad, Magnus; Pippel, Martin; Winkler, Sylke; Schloissnig, Siegfried; Blomberg, Anders; Burkhardt-Holm, Patricia

    2017-02-16

    Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequence. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. Among other things, this allowed us to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.

  15. Chemical property based sequence characterization of PpcA and its homolog proteins PpcB-E: A mathematical approach

    PubMed Central

    Pal Choudhury, Pabitra

    2017-01-01

    Periplasmic c7-type cytochrome A (PpcA) is found in Geobacter sulfurreducens along with its four homologs (PpcB-E). Crystal structures indicate that PpcA can bind deoxycholate (DXCA), while its homologs do not. The reason for this has yet to be established with certainty from primary protein sequence information. This study is based primarily on analysis of the primary protein sequence through the chemical properties of its amino acids. First, we look for the chemical-group-specific scores of amino acids. Along with this, we have developed a new methodology for phylogenetic analysis based on chemical group dissimilarities of amino acids. This new methodology is applied to the cytochrome c7 family members to pinpoint how a particular sequence differs from the others. Second, we build a graph-theoretic model from the amino acid sequences, which is also applied to the cytochrome c7 family members, and some unique characteristics and their domains are highlighted. Third, we search for unique patterns as subsequences that are common to the group or specific to an individual member. In all cases, we are able to show some distinct features of PpcA that set it apart from its homologs and that may underlie its binding of deoxycholate. Similarly, some notable features of the structurally dissimilar protein PpcD compared to the other homologs are also brought out. Further, because the five members of the cytochrome family are homologous proteins, they must share some significant common features, which are also enumerated in this study. PMID:28362850

  16. Comprehensive definition of genome features in Spirodela polyrhiza by high-depth physical mapping and short-read DNA sequencing strategies.

    PubMed

    Michael, Todd P; Bryant, Douglas; Gutierrez, Ryan; Borisjuk, Nikolai; Chu, Philomena; Zhang, Hanzhong; Xia, Jing; Zhou, Junfei; Peng, Hai; El Baidouri, Moaine; Ten Hallers, Boudewijn; Hastie, Alex R; Liang, Tiffany; Acosta, Kenneth; Gilbert, Sarah; McEntee, Connor; Jackson, Scott A; Mockler, Todd C; Zhang, Weixiong; Lam, Eric

    2017-02-01

    Spirodela polyrhiza is a fast-growing aquatic monocot with highly reduced morphology, genome size and number of protein-coding genes. Considering these biological features of Spirodela and its basal position in the monocot lineage, understanding its genome architecture could shed light on plant adaptation and genome evolution. Like many draft genomes, however, the 158-Mb Spirodela genome sequence has not been resolved to chromosomes, and important genome characteristics have not been defined. Here we deployed rapid genome-wide physical maps combined with high-coverage short-read sequencing to resolve the 20 chromosomes of Spirodela and to empirically delineate its genome features. Our data revealed a dramatic reduction in the number of the rDNA repeat units in Spirodela to fewer than 100, which is even fewer than that reported for yeast. Consistent with its unique phylogenetic position, small RNA sequencing revealed 29 Spirodela-specific microRNA, with only two being shared with Elaeis guineensis (oil palm) and Musa balbisiana (banana). Combining DNA methylation data and small RNA sequencing enabled the accurate prediction of 20.5% long terminal repeats (LTRs) that doubled the previous estimate, and revealed a high Solo:Intact LTR ratio of 8.2. Interestingly, we found that Spirodela has the lowest global DNA methylation levels (9%) of any plant species tested. Taken together our results reveal a genome that has undergone reduction, likely through eliminating non-essential protein coding genes, rDNA and LTRs. In addition to delineating the genome features of this unique plant, the methodologies described and large-scale genome resources from this work will enable future evolutionary and functional studies of this basal monocot family. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  17. Molecular inimitability amongst tumors: implications for precision cancer medicine in the age of personalized oncology.

    PubMed

    Patel, Sandip P; Schwaederle, Maria; Daniels, Gregory A; Fanta, Paul T; Schwab, Richard B; Shimabukuro, Kelly A; Kesari, Santosh; Piccioni, David E; Bazhenova, Lyudmila A; Helsten, Teresa L; Lippman, Scott M; Parker, Barbara A; Kurzrock, Razelle

    2015-10-20

    Tumor sequencing has revolutionized oncology, allowing for detailed interrogation of the molecular underpinnings of cancer at an individual level. With this additional insight, it is increasingly apparent that not only do tumors vary within a sample (tumor heterogeneity), but also that each patient's individual tumor is a constellation of unique molecular aberrations that will require an equally unique personalized therapeutic regimen. We report here the results of 439 patients who underwent Clinical Laboratory Improvement Amendment (CLIA)-certified next generation sequencing (NGS) across histologies. Among these patients, 98.4% had a unique molecular profile, and aside from three primary brain tumor patients with a single genetic lesion (IDH1 R132H), no two patients within a given histology were molecularly identical. Additionally, two sets of patients had identical profiles consisting of two mutations in common and no other anomalies. However, these profiles did not segregate by histology (lung adenocarcinoma-appendiceal cancer (KRAS G12D and GNAS R201C), and lung adenocarcinoma-liposarcoma (CDK4 and MDM2 amplification pairs)). These findings suggest that most advanced tumors are molecular singletons within and between histologies, and that tumors that differ in histology may still nonetheless exhibit identical molecular portraits, albeit rarely.

  18. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
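
    The screening step described above, finding reads that contain di- or trinucleotide microsatellites, can be approximated with a regular expression. The sketch below is only an illustration: the abstract does not state the repeat threshold or the software used, so the function name, the `min_repeats` default of 6, and the decision to skip single-base motifs are all assumptions.

```python
import re


def find_microsatellites(read, min_repeats=6):
    """Return (start, motif, repeat_count) for di-/trinucleotide repeats.

    min_repeats is an assumed threshold, not a value from the study.
    """
    # A 2- or 3-base motif followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # "AA" or "AAA" is a homopolymer run, not a microsatellite
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


print(find_microsatellites("TTG" + "AC" * 6 + "GGT"))  # [(3, 'AC', 6)]
print(find_microsatellites("CAG" * 7 + "T"))           # [(0, 'CAG', 7)]
```

    A production pipeline would also need to demultiplex barcodes, collapse duplicate reads, and check that enough flanking sequence remains for primer design; none of that is attempted here.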

  19. Cell cycle, differentiation and tissue-independent expression of ribosomal protein L37.

    PubMed

    Su, S; Bird, R C

    1995-09-15

    A unique human cDNA (hG1.16) that encodes an mRNA of 450 nucleotides was isolated from a subtractive library derived from HeLa cells. The relative expression level of hG1.16 during different cell-cycle phases was determined by Northern-blot analysis of cells synchronized by double-thymidine block and serum deprivation/refeeding. hG1.16 was constitutively expressed during all phases of the cell cycle, including the quiescent phase when even most constitutively expressed genes experience some suppression of expression. The expression level of hG1.16 did not change during terminal differentiation of myoblasts to myotubes, during which cells become permanently post-mitotic. Examination of other tissues revealed that the relative expression level of hG1.16 was constitutive in all embryonic mouse tissues examined, including brain, eye, heart, kidney, liver, lung and skeletal muscle. This was unusual in that expression was not down-modulated during differentiation and did not vary appreciably between tissue types. Analysis by inter-species Northern-blot analysis revealed that hG1.16 was highly conserved among all vertebrates studied (from fish to humans but not in insects). DNA sequence analysis of hG1.16 revealed a high level of similarity to rat ribosomal protein L37, identifying hG1.16 as a new member of this multigene family. The deduced amino acid sequence of hG1.16 was identical to rat ribosomal protein L37 that contained 97 amino acids, many of which are highly positively charged (15 arginine and 14 lysine residues with a predicted M(r) of 11,065). hG1.16 protein has a single C2-C2 zinc-finger-like motif which is also present in rat ribosomal protein L37. Using primers designed from the sequence of hG1.16, unique bovine and rat cDNAs were also isolated by 5'-rapid-amplification of cDNA ends. DNA sequences of bovine and rat G1.16 clones were 92.8% and 92.2% similar to human G1.16 while the deduced amino acid sequences derived from bovine and rat cDNAs each differed by a single amino acid from the sequence of hG1.16 and the published rat L37 sequence. Southern-blot analysis revealed that hG1.16 exists in multiple copies in human, rat and mouse genomes. These G1.16 clones encode unique human, rat and bovine members of the ribosomal protein L37 gene family, which are constitutively expressed even during transitions from quiescence to active cell proliferation or terminal differentiation, in all tissues and all vertebrates investigated.

  20. From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

    PubMed

    Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

    2016-04-01

    The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
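
    The repeat screen described above, flagging exact 17-mers that occur more than three times, can be sketched as a simple position index. This is an illustration under stated assumptions rather than the authors' implementation: the function name is hypothetical, only the forward strand is scanned, and overlapping occurrences are counted. The toy call uses k=4 so the result is easy to check by hand.

```python
from collections import defaultdict


def repeated_kmers(sequence, k=17, min_count=4):
    """Map each exact k-mer seen at least min_count times to its start positions.

    min_count=4 corresponds to "more than three times"; forward strand only.
    """
    positions = defaultdict(list)
    seq = sequence.upper()
    for start in range(len(seq) - k + 1):
        positions[seq[start:start + k]].append(start)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) >= min_count}


toy_genome = "ACGTACGTACGTACGTTTGA"  # made-up sequence for illustration
print(repeated_kmers(toy_genome, k=4, min_count=4))  # {'ACGT': [0, 4, 8, 12]}
```

    For genomes under 100 kb, as is typical of fungal mtDNA, a dictionary of this kind fits comfortably in memory; larger genomes would call for a dedicated k-mer counter.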

  1. Structure-Function Studies with the Unique Hexameric Form II Ribulose-1,5-bisphosphate Carboxylase/Oxygenase (Rubisco) from Rhodopseudomonas palustris*

    PubMed Central

    Satagopan, Sriram; Chan, Sum; Perry, L. Jeanne; Tabita, F. Robert

    2014-01-01

    The first x-ray crystal structure has been solved for an activated transition-state analog-bound form II ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco). This enzyme, from Rhodopseudomonas palustris, assembles as a unique hexamer with three pairs of catalytic large subunit homodimers around a central 3-fold symmetry axis. This oligomer arrangement is unique among all known Rubisco structures, including the form II homolog from Rhodospirillum rubrum. The presence of a transition-state analog in the active site locked the activated enzyme in a “closed” conformation and revealed the positions of critical active site residues during catalysis. Functional roles of two form II-specific residues (Ile165 and Met331) near the active site were examined via site-directed mutagenesis. Substitutions at these residues affect function but not the ability of the enzyme to assemble. Random mutagenesis and suppressor selection in a Rubisco deletion strain of Rhodobacter capsulatus identified a residue in the amino terminus of one subunit (Ala47) that compensated for a negative change near the active site of a neighboring subunit. In addition, substitution of the native carboxyl-terminal sequence with the last few dissimilar residues from the related R. rubrum homolog increased the enzyme's kcat for carboxylation. However, replacement of a longer carboxyl-terminal sequence with termini from either a form III or a form I enzyme, which varied both in length and sequence, resulted in complete loss of function. From these studies, it is evident that a number of subtle interactions near the active site and the carboxyl terminus account for functional differences between the different forms of Rubiscos found in nature. PMID:24942737

  2. Structure-function studies with the unique hexameric form II ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) from Rhodopseudomonas palustris.

    PubMed

    Satagopan, Sriram; Chan, Sum; Perry, L Jeanne; Tabita, F Robert

    2014-08-01

    The first x-ray crystal structure has been solved for an activated transition-state analog-bound form II ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco). This enzyme, from Rhodopseudomonas palustris, assembles as a unique hexamer with three pairs of catalytic large subunit homodimers around a central 3-fold symmetry axis. This oligomer arrangement is unique among all known Rubisco structures, including the form II homolog from Rhodospirillum rubrum. The presence of a transition-state analog in the active site locked the activated enzyme in a "closed" conformation and revealed the positions of critical active site residues during catalysis. Functional roles of two form II-specific residues (Ile(165) and Met(331)) near the active site were examined via site-directed mutagenesis. Substitutions at these residues affect function but not the ability of the enzyme to assemble. Random mutagenesis and suppressor selection in a Rubisco deletion strain of Rhodobacter capsulatus identified a residue in the amino terminus of one subunit (Ala(47)) that compensated for a negative change near the active site of a neighboring subunit. In addition, substitution of the native carboxyl-terminal sequence with the last few dissimilar residues from the related R. rubrum homolog increased the enzyme's kcat for carboxylation. However, replacement of a longer carboxyl-terminal sequence with termini from either a form III or a form I enzyme, which varied both in length and sequence, resulted in complete loss of function. From these studies, it is evident that a number of subtle interactions near the active site and the carboxyl terminus account for functional differences between the different forms of Rubiscos found in nature. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  3. CMS: A Web-Based System for Visualization and Analysis of Genome-Wide Methylation Data of Human Cancers

    PubMed Central

    Huang, Yi-Wen; Roa, Juan C.; Goodfellow, Paul J.; Kizer, E. Lynette; Huang, Tim H. M.; Chen, Yidong

    2013-01-01

    Background DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it can now efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Methodology/Principal Findings Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimen) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. Conclusions/Significance CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/. PMID:23630576

  4. Xylan utilization in human gut commensal bacteria is orchestrated by unique modular organization of polysaccharide-degrading enzymes.

    PubMed

    Zhang, Meiling; Chekan, Jonathan R; Dodd, Dylan; Hong, Pei-Ying; Radlinski, Lauren; Revindran, Vanessa; Nair, Satish K; Mackie, Roderick I; Cann, Isaac

    2014-09-02

    Enzymes that degrade dietary and host-derived glycans represent the most abundant functional activities encoded by genes unique to the human gut microbiome. However, the biochemical activities of a vast majority of the glycan-degrading enzymes are poorly understood. Here, we use transcriptome sequencing to understand the diversity of genes expressed by the human gut bacteria Bacteroides intestinalis and Bacteroides ovatus grown in monoculture with the abundant dietary polysaccharide xylan. The most highly induced carbohydrate active genes encode a unique glycoside hydrolase (GH) family 10 endoxylanase (BiXyn10A or BACINT_04215 and BACOVA_04390) that is highly conserved in the Bacteroidetes xylan utilization system. The BiXyn10A modular architecture consists of a GH10 catalytic module disrupted by a 250 amino acid sequence of unknown function. Biochemical analysis of BiXyn10A demonstrated that such insertion sequences encode a new family of carbohydrate-binding modules (CBMs) that binds to xylose-configured oligosaccharide/polysaccharide ligands, the substrate of the BiXyn10A enzymatic activity. The crystal structures of CBM1 from BiXyn10A (1.8 Å), a cocomplex of BiXyn10A CBM1 with xylohexaose (1.14 Å), and the CBM from its homolog in the Prevotella bryantii B14 Xyn10C (1.68 Å) reveal an unanticipated mode for ligand binding. A minimal enzyme mix, composed of the gene products of four of the most highly up-regulated genes during growth on wheat arabinoxylan, depolymerizes the polysaccharide into its component sugars. The combined biochemical and biophysical studies presented here provide a framework for understanding fiber metabolism by an important group within the commensal bacterial population known to influence human health.

  5. Cadherin Expression, Vectorial Active Transport, and Metallothionein Isoform 3 Mediated EMT/MET Responses in Cultured Primary and Immortalized Human Proximal Tubule Cells

    PubMed Central

    Slusser, Andrea; Bathula, Chandra S.; Sens, Donald A.; Somji, Seema; Sens, Mary Ann; Zhou, Xu Dong; Garrett, Scott H.

    2015-01-01

    Background Cultures of human proximal tubule cells have been widely utilized to study the role of EMT in renal disease. The goal of this study was to define the role of growth media composition on classic EMT responses, define the expression of E- and N-cadherin, and define the functional epitope of MT-3 that mediates MET in HK-2 cells. Methods Immunohistochemistry, microdissection, real-time PCR, western blotting, and ELISA were used to define the expression of E- and N-cadherin mRNA and protein in HK-2 and HPT cell cultures. Site-directed mutagenesis, stable transfection, measurement of transepithelial resistance and dome formation were used to define the unique amino acid sequence of MT-3 associated with MET in HK-2 cells. Results It was shown that both E- and N-cadherin mRNA and protein are expressed in the human renal proximal tubule. It was shown, based on the pattern of cadherin expression, connexin expression, vectorial active transport, and transepithelial resistance, that the HK-2 cell line has already undergone many of the early features associated with EMT. It was shown that the unique, six amino acid, C-terminal sequence of MT-3 is required for MT-3 to induce MET in HK-2 cells. Conclusions The results show that the HK-2 cell line can be an effective model to study later stages in the conversion of the renal epithelial cell to a mesenchymal cell. The HK-2 cell line, transfected with MT-3, may be an effective model to study the process of MET. The study implicates the unique C-terminal sequence of MT-3 in the conversion of HK-2 cells to display an enhanced epithelial phenotype. PMID:25803827

  6. Conserved small mRNA with an unique, extended Shine-Dalgarno sequence

    PubMed Central

    Hahn, Julia; Migur, Anzhela; von Boeselager, Raphael Freiherr; Kubatova, Nina; Kubareva, Elena; Schwalbe, Harald

    2017-01-01

    Up to now, very small protein-coding genes have remained unrecognized in sequenced genomes. We identified an mRNA of 165 nucleotides (nt), which is conserved in Bradyrhizobiaceae and encodes a polypeptide with 14 amino acid residues (aa). The small mRNA harboring a unique Shine-Dalgarno sequence (SD) with a length of 17 nt was localized predominantly in the ribosome-containing P100 fraction of Bradyrhizobium japonicum USDA 110. Strong interaction between the mRNA and 30S ribosomal subunits was demonstrated by their co-sedimentation in sucrose density gradient. Using translational fusions with egfp, we detected weak translation and found that it is impeded by both the extended SD and the GTG start codon (instead of ATG). Biophysical characterization (CD- and NMR-spectroscopy) showed that the synthesized polypeptide remained unstructured in physiological buffer. Replacement of the start codon by a stop codon increased the stability of the transcript, strongly suggesting additional posttranscriptional regulation at the ribosome. Therefore, the small gene was named rreB (ribosome-regulated expression in Bradyrhizobiaceae). Assuming that the unique ribosome binding site (RBS) is a hallmark of rreB homologs or similarly regulated genes, we looked for similar putative RBS in bacterial genomes and detected regions with at least 16 nt complementarity to the 3′-end of 16S rRNA upstream of sORFs in Caulobacterales, Rhizobiales, Rhodobacterales and Rhodospirillales. In the Rhodobacter/Roseobacter lineage of α-proteobacteria the corresponding gene (rreR) is conserved and encodes an 18 aa protein. This shows how specific RBS features can be used to identify new genes with presumably similar control of expression at the RNA level. PMID:27834614

  7. Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.

    PubMed

    Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni

    2015-12-21

    Empirical analysis on k-mer DNA has been proven as an effective tool in finding unique patterns in DNA sequences which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA on hundreds of organisms, the researchers found unique multi-modal k-mer spectra occur in the genomes of organisms from the tetrapod clade only which includes all mammals. The multi-modality is caused by the formation of the two lowest modes where k-mers under them are referred as the rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features of CpG Island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performances of our RWC (rare-word clustering) method in predicting the CGI and promoter features are ranked among the top three, in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.

  8. CMS: a web-based system for visualization and analysis of genome-wide methylation data of human cancers.

    PubMed

    Gu, Fei; Doderer, Mark S; Huang, Yi-Wen; Roa, Juan C; Goodfellow, Paul J; Kizer, E Lynette; Huang, Tim H M; Chen, Yidong

    2013-01-01

    DNA methylation of promoter CpG islands is associated with gene suppression, and its unique genome-wide profiles have been linked to tumor progression. Coupled with high-throughput sequencing technologies, it is now possible to efficiently determine genome-wide methylation profiles in cancer cells. Also, experimental and computational technologies make it possible to find the functional relationship between cancer-specific methylation patterns and their clinicopathological parameters. Cancer methylome system (CMS) is a web-based database application designed for the visualization, comparison and statistical analysis of human cancer-specific DNA methylation. Methylation intensities were obtained from MBDCap-sequencing, pre-processed and stored in the database. 191 patient samples (169 tumor and 22 normal specimens) and 41 breast cancer cell-lines are deposited in the database, comprising about 6.6 billion uniquely mapped sequence reads. This provides comprehensive and genome-wide epigenetic portraits of human breast cancer and endometrial cancer to date. Two views are proposed for users to better understand methylation structure at the genomic level or systemic methylation alteration at the gene level. In addition, a variety of annotation tracks are provided to cover genomic information. CMS includes important analytic functions for interpretation of methylation data, such as the detection of differentially methylated regions, statistical calculation of global methylation intensities, multiple gene sets of biologically significant categories, and interactivity with UCSC via custom-track data. We also present examples of discoveries utilizing the framework. CMS provides visualization and analytic functions for cancer methylome datasets. A comprehensive collection of datasets, a variety of embedded analytic functions and extensive applications with biological and translational significance make this system powerful and unique in cancer methylation research. CMS is freely accessible at: http://cbbiweb.uthscsa.edu/KMethylomes/.

  9. Advanced surface-enhanced Raman gene probe systems and methods thereof

    DOEpatents

    Vo-Dinh, Tuan

    2001-01-01

    The subject invention is a series of methods and systems for using the Surface-Enhanced Raman (SER)-labeled Gene Probe for hybridization, detection and identification of SER-labeled hybridized target oligonucleotide material comprising the steps of immobilizing SER-labeled hybridized target oligonucleotide material on a support means, wherein the SER-labeled hybridized target oligonucleotide material comprise a SER label attached either to a target oligonucleotide of unknown sequence or to a gene probe of known sequence complementary to the target oligonucleotide sequence, the SER label is unique for the target oligonucleotide strands of a particular sequence wherein the SER-labeled oligonucleotide is hybridized to its complementary oligonucleotide strand, then the support means having the SER-labeled hybridized target oligonucleotide material adsorbed thereon is SERS activated with a SERS activating means, then the support means is analyzed.

  10. An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

    PubMed

    Horn, T; Chang, C A; Urdea, M S

    1997-12-01

    The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays.

  11. An improved divergent synthesis of comb-type branched oligodeoxyribonucleotides (bDNA) containing multiple secondary sequences.

    PubMed Central

    Horn, T; Chang, C A; Urdea, M S

    1997-01-01

    The divergent synthesis of branched DNA (bDNA) comb structures is described. This new type of bDNA contains one unique oligonucleotide, the primary sequence, covalently attached through a comb-like branch network to many identical copies of a different oligonucleotide, the secondary sequence. The bDNA comb structures were assembled on a solid support and several synthesis parameters were investigated and optimized. The bDNA comb molecules were characterized by polyacrylamide gel electrophoretic methods and by controlled cleavage at periodate-cleavable moieties incorporated during synthesis. The developed chemistry allows synthesis of bDNA comb molecules containing multiple secondary sequences. In the accompanying article we describe the synthesis and characterization of large bDNA combs containing all four deoxynucleotides for use as signal amplifiers in nucleic acid quantification assays. PMID:9365265

  12. ClusterMine360: a database of microbial PKS/NRPS biosynthesis

    PubMed Central

    Conway, Kyle R.; Boddy, Christopher N.

    2013-01-01

    ClusterMine360 (http://www.clustermine360.ca/) is a database of microbial polyketide and non-ribosomal peptide gene clusters. It takes advantage of crowd-sourcing by allowing members of the community to make contributions while automation is used to help achieve high data consistency and quality. The database currently has >200 gene clusters from >185 compound families. It also features a unique sequence repository containing >10 000 polyketide synthase/non-ribosomal peptide synthetase domains. The sequences are filterable and downloadable as individual or multiple sequence FASTA files. We are confident that this database will be a useful resource for members of the polyketide synthases/non-ribosomal peptide synthetases research community, enabling them to keep up with the growing number of sequenced gene clusters and rapidly mine these clusters for functional information. PMID:23104377

  13. Sequence-dependent effects in drug-DNA interaction: the crystal structure of Hoechst 33258 bound to the d(CGCAAATTTGCG)2 duplex.

    PubMed Central

    Spink, N; Brown, D G; Skelly, J V; Neidle, S

    1994-01-01

    The bis-benzimidazole drug Hoechst 33258 has been co-crystallized with the dodecanucleotide sequence d(CGCAAATTTGCG)2. The structure has been solved by molecular replacement and refined to an R factor of 18.5% for 2125 reflections collected on a Xentronics area detector. The drug is bound in the minor groove, at the five base-pair site 5'-ATTTG and is in a unique orientation. This is displaced by one base pair in the 5' direction compared to previously-determined structures of this drug with the sequence d(CGCGAATTCGCG)2. Reasons for this difference in behaviour are discussed in terms of several sequence-dependent structural features of the DNA, with particular reference to differences in propeller twist and minor-groove width. Images PMID:7515488

  14. Pattern recognition of electronic bit-sequences using a semiconductor mode-locked laser and spatial light modulators

    NASA Astrophysics Data System (ADS)

    Bhooplapur, Sharad; Akbulut, Mehmetkan; Quinlan, Franklyn; Delfyett, Peter J.

    2010-04-01

    A novel scheme for recognition of electronic bit-sequences is demonstrated. Two electronic bit-sequences that are to be compared are each mapped to a unique code from a set of Walsh-Hadamard codes. The codes are then encoded in parallel on the spectral phase of the frequency comb lines from a frequency-stabilized mode-locked semiconductor laser. Phase encoding is achieved by using two independent spatial light modulators based on liquid crystal arrays. Encoded pulses are compared using interferometric pulse detection and differential balanced photodetection. Orthogonal codes eight bits long are compared, and matched codes are successfully distinguished from mismatched codes with very low error rates, of around 10^-18. This technique has potential for high-speed, high accuracy recognition of bit-sequences, with applications in keyword searches and internet protocol packet routing.
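
    The orthogonality that lets matched Walsh-Hadamard codes be told apart from mismatched ones can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds eight-bit codes with the standard Sylvester construction and uses a plain inner product as a stand-in for the optical interferometric comparison, which it does not model. The function names are hypothetical.

```python
def hadamard(n):
    """Return an n x n Hadamard matrix (Sylvester construction) as lists of +1/-1.

    n must be a power of two. Each row is one Walsh-Hadamard code.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        top = [row + row for row in h]
        bottom = [row + [-x for x in row] for row in h]
        h = top + bottom
    return h


def correlate(a, b):
    """Inner product of two +1/-1 codes: n for a match, 0 for distinct rows."""
    return sum(x * y for x, y in zip(a, b))


codes = hadamard(8)
matched = correlate(codes[3], codes[3])
mismatched = correlate(codes[3], codes[5])
```

    In a phase-encoded setting one might map +1 to a phase of 0 and -1 to a phase of pi; that mapping is an assumption here and is not stated in the abstract.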

  15. Weissella fabaria sp. nov., from a Ghanaian cocoa fermentation.

    PubMed

    De Bruyne, Katrien; Camu, Nicholas; De Vuyst, Luc; Vandamme, Peter

    2010-09-01

    Two lactic acid bacteria, strains 257(T) and 252, were isolated from traditional heap fermentations of Ghanaian cocoa beans. 16S rRNA gene sequence analysis of these strains allocated them to the genus Weissella, showing 99.5 % 16S rRNA gene sequence similarity towards Weissella ghanensis LMG 24286(T). Whole-cell protein electrophoresis, fluorescent amplified fragment length polymorphism fingerprinting of whole genomes and biochemical tests confirmed their unique taxonomic position. DNA-DNA hybridization experiments towards their nearest phylogenetic neighbour demonstrated that the two strains represent a novel species, for which we propose the name Weissella fabaria sp. nov., with strain 257(T) (=LMG 24289(T) =DSM 21416(T)) as the type strain. Additional sequence analysis using pheS gene sequences proved useful for identification of all Weissella-Leuconostoc-Oenococcus species and for the recognition of the novel species.

  16. Accurate RNA consensus sequencing for high-fidelity detection of transcriptional mutagenesis-induced epimutations.

    PubMed

    Reid-Bayliss, Kate S; Loeb, Lawrence A

    2017-08-29

    Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.

  17. Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

    PubMed

    Alkhateeb, Abedalrhman; Rueda, Luis

    2017-08-01

    Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
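
    The two-pass idea in this abstract (score each read by its number of unique k-mers, then drop reads whose z-score falls below a threshold) can be sketched as follows. This is not the Zseq implementation: the function names, the default values, and the choice to simply skip k-mers containing N are assumptions, and the GC-content factor mentioned in the abstract is left out.

```python
from statistics import mean, pstdev


def unique_kmer_score(seq, k):
    """Number of distinct k-mers in seq, ignoring k-mers with an ambiguous base N."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def filter_reads(reads, k=3, z_threshold=-1.0):
    """Keep reads whose complexity z-score is at least z_threshold."""
    scores = [unique_kmer_score(read, k) for read in reads]  # first pass: score
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return list(reads)  # all reads equally complex, nothing to filter
    # second pass: filter on the standardized score
    return [read for read, s in zip(reads, scores) if (s - mu) / sigma >= z_threshold]


example_reads = ["ACGTACGTAC", "AAAAAAAAAA", "ACGTTGCAAG", "ACGGTCATGC"]
kept_reads = filter_reads(example_reads, k=3, z_threshold=-1.0)
```

    Both passes are linear in the total read length, which matches the abstract's description of Zseq as a linear method; the low-complexity homopolymer read is the one removed in this toy input.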

  18. Polymorphisms and variants in the prion protein sequence of European moose (Alces alces), reindeer (Rangifer tarandus), roe deer (Capreolus capreolus) and fallow deer (Dama dama) in Scandinavia

    PubMed Central

    Wik, Lotta; Mikko, Sofia; Klingeborn, Mikael; Stéen, Margareta; Simonsson, Magnus; Linné, Tommy

    2012-01-01

    The prion protein (PrP) sequence of European moose, reindeer, roe deer and fallow deer in Scandinavia has high homology to the PrP sequence of North American cervids. Variants in the European moose PrP sequence were found at amino acid position 109 as K or Q. The 109Q variant is unique in the PrP sequence of vertebrates. During the 1980s a wasting syndrome in Swedish moose, Moose Wasting Syndrome (MWS), was described. SNP analysis demonstrated a difference in the observed genotype proportions of the heterozygous Q/K and homozygous Q/Q variants in the MWS animals compared with the healthy animals. In MWS moose the allele frequencies for 109K and 109Q were 0.73 and 0.27, respectively, and for healthy animals 0.69 and 0.31. Both alleles were seen as heterozygotes and homozygotes. In reindeer, PrP sequence variation was demonstrated at codon 176 as D or N and codon 225 as S or Y. The PrP sequences in roe deer and fallow deer were identical with published GenBank sequences. PMID:22441661

  19. A fossil protein chimera; difficulties in discriminating dinosaur peptide sequences from modern cross-contamination.

    PubMed

    Buckley, Michael; Warwood, Stacey; van Dongen, Bart; Kitchener, Andrew C; Manning, Phillip L

    2017-05-31

    A decade ago, reports that organic-rich soft tissue survived from dinosaur fossils were apparently supported by proteomics-derived sequence information of exceptionally well-preserved bone. This initial claim to the sequencing of endogenous collagen peptides from an approximately 68 Myr Tyrannosaurus rex fossil was highly controversial, largely on the grounds of potential contamination from either bacterial biofilms or from laboratory practice. In a subsequent study, collagen peptide sequences from an approximately 78 Myr Brachylophosaurus canadensis fossil were reported that have remained largely unchallenged. However, the endogeneity of these sequences relies heavily on a single peptide sequence, apparently unique to both dinosaurs. Given the potential for cross-contamination from modern bone analysed by the same team, here we extract collagen from bone samples of three individuals of ostrich, Struthio camelus. The resulting LC-MS/MS data were found to match all of the proposed sequences for both the original Tyrannosaurus and Brachylophosaurus studies. Regardless of the true nature of the dinosaur peptides, our finding highlights the difficulty of differentiating such sequences with confidence. Our results not only imply that cross-contamination cannot be ruled out, but that appropriate measures to test for endogeneity should be further evaluated. © 2017 The Authors.

  20. A fossil protein chimera; difficulties in discriminating dinosaur peptide sequences from modern cross-contamination

    PubMed Central

    Warwood, Stacey; van Dongen, Bart; Kitchener, Andrew C.; Manning, Phillip L.

    2017-01-01

    A decade ago, reports that organic-rich soft tissue survived from dinosaur fossils were apparently supported by proteomics-derived sequence information of exceptionally well-preserved bone. This initial claim to the sequencing of endogenous collagen peptides from an approximately 68 Myr Tyrannosaurus rex fossil was highly controversial, largely on the grounds of potential contamination from either bacterial biofilms or from laboratory practice. In a subsequent study, collagen peptide sequences from an approximately 78 Myr Brachylophosaurus canadensis fossil were reported that have remained largely unchallenged. However, the endogeneity of these sequences relies heavily on a single peptide sequence, apparently unique to both dinosaurs. Given the potential for cross-contamination from modern bone analysed by the same team, here we extract collagen from bone samples of three individuals of ostrich, Struthio camelus. The resulting LC–MS/MS data were found to match all of the proposed sequences for both the original Tyrannosaurus and Brachylophosaurus studies. Regardless of the true nature of the dinosaur peptides, our finding highlights the difficulty of differentiating such sequences with confidence. Our results not only imply that cross-contamination cannot be ruled out, but that appropriate measures to test for endogeneity should be further evaluated. PMID:28566488

  1. Sequences of two related multiple antibiotic resistance virulence plasmids sharing a unique IS26-related molecular signature isolated from different Escherichia coli pathotypes from different hosts.

    PubMed

    Venturini, Carola; Hassan, Karl A; Roy Chowdhury, Piklu; Paulsen, Ian T; Walker, Mark J; Djordjevic, Steven P

    2013-01-01

    Enterohemorrhagic Escherichia coli (EHEC) and atypical enteropathogenic E. coli (aEPEC) are important zoonotic pathogens that increasingly are becoming resistant to multiple antibiotics. Here we describe two plasmids, pO26-CRL125 (125 kb) from a human O26:H- EHEC, and pO111-CRL115 (115 kb) from a bovine O111 aEPEC, that impart resistance to ampicillin, kanamycin, neomycin, streptomycin, sulfathiazole, trimethoprim and tetracycline and both contain atypical class 1 integrons with an identical IS26-mediated deletion in their 3´-conserved segment. Complete sequence analysis showed that pO26-CRL125 and pO111-CRL115 are essentially identical except for a 9.7 kb fragment, present in the backbone of pO26-CRL125 but absent in pO111-CRL115, and several indels. The 9.7 kb fragment encodes IncI-associated genes involved in plasmid stability during conjugation, a putative transposase gene and three imperfect repeats. Contiguous sequence identical to regions within these pO26-CRL125 imperfect repeats was identified in pO111-CRL115 precisely where the 9.7 kb fragment is missing, suggesting it may be mobile. Sequences shared between the plasmids include a complete IncZ replicon, a unique toxin/antitoxin system, IncI stability and maintenance genes, a novel putative serine protease autotransporter, and an IncI1 transfer system including a unique shufflon. Both plasmids carry a derivative Tn21 transposon with an atypical class 1 integron comprising a dfrA5 gene cassette encoding resistance to trimethoprim, and 24 bp of the 3´-conserved segment followed by Tn6026, which encodes resistance to ampicillin, kanamycin, neomycin, streptomycin and sulfathiazole. The Tn21-derivative transposon is linked to a truncated Tn1721, encoding resistance to tetracycline, via a region containing the IncP-1α oriV. Absence of the 5 bp direct repeats flanking Tn3-family transposons indicates that homologous recombination events played a key role in the formation of this complex antibiotic resistance gene locus. Comparative sequence analysis of these closely related plasmids reveals aspects of plasmid evolution in pathogenic E. coli from different hosts.

  2. Novel rod-shaped viruses isolated from garlic, Allium sativum, possessing a unique genome organization.

    PubMed

    Sumi, S; Tsuneyoshi, T; Furutani, H

    1993-09-01

    Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail of an estimated size of 10 kb as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that also at least four proteins are encoded in the viral cDNA, their M(r)s being estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.

  3. The Paramecium germline genome provides a niche for intragenic parasitic DNA: evolutionary dynamics of internal eliminated sequences.

    PubMed

    Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

    2012-01-01

    Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated.

  4. The Paramecium Germline Genome Provides a Niche for Intragenic Parasitic DNA: Evolutionary Dynamics of Internal Eliminated Sequences

    PubMed Central

    Arnaiz, Olivier; Mathy, Nathalie; Baudry, Céline; Malinsky, Sophie; Aury, Jean-Marc; Denby Wilkes, Cyril; Garnier, Olivier; Labadie, Karine; Lauderdale, Benjamin E.; Le Mouël, Anne; Marmignon, Antoine; Nowacki, Mariusz; Poulain, Julie; Prajer, Malgorzata; Wincker, Patrick; Meyer, Eric; Duharcourt, Sandra; Duret, Laurent; Bétermier, Mireille; Sperling, Linda

    2012-01-01

    Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated. PMID:23071448

  5. Sequence analyses of fimbriae subunit FimA proteins on Actinomyces naeslundii genospecies 1 and 2 and Actinomyces odontolyticus with variant carbohydrate binding specificities

    PubMed Central

    Drobni, Mirva; Hallberg, Kristina; Öhman, Ulla; Birve, Anna; Persson, Karina; Johansson, Ingegerd; Strömberg, Nicklas

    2006-01-01

    Background Actinomyces naeslundii genospecies 1 and 2 express type-2 fimbriae (FimA subunit polymers) with variant Galβ binding specificities and Actinomyces odontolyticus a sialic acid specificity to colonize different oral surfaces. However, the fimbrial nature of the sialic acid binding property and sequence information about FimA proteins from multiple strains are lacking. Results Here we have sequenced fimA genes from strains of A. naeslundii genospecies 1 (n = 4) and genospecies 2 (n = 4), both of which harboured variant Galβ-dependent hemagglutination (HA) types, and from A. odontolyticus PK984 with a sialic acid-dependent HA pattern. Three unique subtypes of FimA proteins with 63.8–66.4% sequence identity were present in strains of A. naeslundii genospecies 1 and 2 and A. odontolyticus. The generally high FimA sequence identity (>97.2%) within a genospecies revealed species-specific sequences or segments that coincided with binding specificity. All three FimA protein variants contained a signal peptide, pilin motif, E box, proline-rich segment and an LPXTG sorting motif among other conserved segments for secretion, assembly and sorting of fimbrial proteins. The highly conserved pilin, E box and LPXTG motifs are present in fimbriae proteins from other Gram-positive bacteria. Moreover, only strains of genospecies 1 were agglutinated with type-2 fimbriae antisera derived from A. naeslundii genospecies 1 strain 12104, emphasizing that the overall folding of FimA may generate different functionalities. Western blot analyses with FimA antisera revealed monomers and oligomers of FimA in whole cell protein extracts and a purified recombinant FimA preparation, indicating a sortase-independent oligomerization of FimA. Conclusion The genus Actinomyces involves a diversity of unique FimA proteins with conserved pilin, E box and LPXTG motifs, depending on subspecies and associated binding specificity. In addition, a sortase-independent oligomerization of FimA subunit proteins in solution was indicated. PMID:16686953

  6. Structure and Genetic Content of the Megaplasmids of Neurotoxigenic Clostridium butyricum Type E Strains from Italy

    PubMed Central

    Iacobino, Angelo; Scalfaro, Concetta; Franciosa, Giovanna

    2013-01-01

    We determined the genetic maps of the megaplasmids of six neurotoxigenic Clostridium butyricum type E strains from Italy using molecular and bioinformatics techniques. The megaplasmids are circular, not linear as we had previously proposed. The differently-sized megaplasmids share a genetic region that includes structural, metabolic and regulatory genes. In addition, we found that a 168 kb genetic region is present only in the larger megaplasmids of two tested strains, whereas it is absent from the smaller megaplasmids of the four remaining strains. The genetic region unique to the larger megaplasmids contains, among other features, a locus for clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR associated (cas) genes, i.e. a bacterial adaptive immune system providing sequence-specific protection from invading genetic elements. Some CRISPR spacer sequences of the neurotoxigenic C. butyricum type E strains showed homology to prophage, phage and plasmid sequences from closely related clostridia species or from distant species, all sharing the intestinal habitat, suggesting that the CRISPR locus might be involved in the microorganism adaptation to the human or animal intestinal environment. Besides, we report here that each of four distinct CRISPR spacers partially matched DNA sequences of different prophages and phages, at identical nucleotide locations. This suggests that, at least in neurotoxigenic C. butyricum type E, the CRISPR locus is potentially able to recognize the same conserved DNA sequence of different invading genetic elements, besides targeting sequences unique to previously encountered invading DNA, as currently predicted for a CRISPR locus. Thus, the results of this study introduce the possibility that CRISPR loci can provide resistance to a wider range of invading DNA elements than previously appreciated. Whether it is more advantageous for the peculiar neurotoxigenic C. butyricum type E strains to maintain or to lose the CRISPR-cas system remains an open question. PMID:23967192

  7. Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

    PubMed Central

    Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

    2012-01-01

    Background: Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings: We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed that Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins was found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid, which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii, but none of them were known specific virulence-related genes. Conclusions/Significance: Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of genomic content. Differences in gene content likely contribute to differences in the clinical and environmental distribution of species and sequence types. PMID:23166675
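
    The pair-wise DNA sequence identity quoted above (89 to 97%) can be illustrated with a minimal sketch. The abstract does not say how identity was computed, so the function below is only one common definition (matching bases over aligned, gap-free columns), and the name percent_identity and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped,
    which is one of several conventions in use.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not Cronobacter data).
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

    Whole-genome comparisons normally rely on dedicated alignment tools; this sketch only shows the arithmetic behind a percent-identity figure.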

  8. Bacterial Artificial Chromosome Libraries for Mouse Sequencing and Functional Analysis

    PubMed Central

    Osoegawa, Kazutoyo; Tateno, Minako; Woon, Peng Yeong; Frengen, Eirik; Mammoser, Aaron G.; Catanese, Joseph J.; Hayashizaki, Yoshihide; de Jong, Pieter J.

    2000-01-01

    Bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) libraries providing a combined 33-fold representation of the murine genome have been constructed using two different restriction enzymes for genomic digestion. A large-insert PAC library was prepared from the 129S6/SvEvTac strain in a bacterial/mammalian shuttle vector to facilitate functional gene studies. For genome mapping and sequencing, we prepared BAC libraries from the 129S6/SvEvTac and the C57BL/6J strains. The average insert sizes for the three libraries range between 130 kb and 200 kb. Based on the numbers of clones and the observed average insert sizes, we estimate each library to have slightly in excess of 10-fold genome representation. The average number of clones found after hybridization screening with 28 probes was in the range of 9–14 clones per marker. To explore the fidelity of the genomic representation in the three libraries, we analyzed three contigs, each established after screening with a single unique marker. New markers were established from the end sequences and screened against all the contig members to determine if any of the BACs and PACs are chimeric or rearranged. Only one chimeric clone and six potential deletions have been observed after extensive analysis of 113 PAC and BAC clones. Seventy-one of the 113 clones were conclusively nonchimeric because both end markers or sequences were mapped to the other confirmed contig members. We could not exclude chimerism for the remaining 41 clones because one or both of the insert termini did not contain unique sequence to design markers. The low rate of chimerism, ∼1%, and the low level of detected rearrangements support the anticipated usefulness of the BAC libraries for genome research. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AQ797173–AQ797398.] PMID:10645956
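
    The clone counts in the abstract above can be checked with a short sketch. The counts (113 clones analyzed, 1 chimeric, 71 conclusively nonchimeric, 41 undetermined) come from the abstract; the coverage formula (clones x mean insert size / genome size) is the standard estimate, but the clone number, insert size and genome size passed to it below are hypothetical placeholders, not values from the paper.

```python
def genome_coverage(n_clones: int, mean_insert_bp: float, genome_bp: float) -> float:
    """Fold representation of a library: total cloned bases / genome size."""
    return n_clones * mean_insert_bp / genome_bp


# Counts taken from the abstract.
total_clones = 113
chimeric = 1
nonchimeric = 71
undetermined = 41

# The three categories account for every analyzed clone.
assert chimeric + nonchimeric + undetermined == total_clones

# Observed chimerism rate, reported in the abstract as roughly 1%.
chimerism_rate = chimeric / total_clones
print(f"chimerism rate: {chimerism_rate:.1%}")

# Hypothetical library: 200,000 clones, 150 kb mean insert, 3 Gb genome.
print(genome_coverage(200_000, 150_000, 3_000_000_000))
```

    Note that the roughly 1% figure is a lower bound in practice, since chimerism could not be excluded for the 41 clones lacking unique end sequence.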

  9. Unique transposon landscapes are pervasive across Drosophila melanogaster genomes

    PubMed Central

    Rahman, Reazur; Chirn, Gung-wei; Kanodia, Abhay; Sytnikova, Yuliya A.; Brembs, Björn; Bergman, Casey M.; Lau, Nelson C.

    2015-01-01

    To understand how transposon landscapes (TLs) vary across animal genomes, we describe a new method called the Transposon Insertion and Depletion AnaLyzer (TIDAL) and a database of >300 TLs in Drosophila melanogaster (TIDAL-Fly). Our analysis reveals pervasive TL diversity across cell lines and fly strains, even for identically named sub-strains from different laboratories such as the ISO1 strain used for the reference genome sequence. On average, >500 novel insertions exist in every lab strain, inbred strains of the Drosophila Genetic Reference Panel (DGRP), and fly isolates in the Drosophila Genome Nexus (DGN). A minority (<25%) of transposon families comprise the majority (>70%) of TL diversity across fly strains. A sharp contrast between insertion and depletion patterns indicates that many transposons are unique to the ISO1 reference genome sequence. Although TL diversity from fly strains reaches asymptotic limits with increasing sequencing depth, rampant TL diversity causes unsaturated detection of TLs in pools of flies. Finally, we show novel transposon insertions negatively correlate with Piwi-interacting RNA (piRNA) levels for most transposon families, except for the highly-abundant roo retrotransposon. Our study provides a useful resource for Drosophila geneticists to understand how transposons create extensive genomic diversity in fly cell lines and strains. PMID:26578579

  10. Variants of uncertain significance in newborn screening disorders: implications for large-scale genomic sequencing.

    PubMed

    Narravula, Alekhya; Garber, Kathryn B; Askree, S Hussain; Hegde, Madhuri; Hall, Patricia L

    2017-01-01

    As exome and genome sequencing using high-throughput sequencing technologies move rapidly into the diagnostic process, laboratories and clinicians need to develop a strategy for dealing with uncertain findings. A commitment must be made to minimize these findings, and all parties may need to make adjustments to their processes. The information required to reclassify these variants is often available but not communicated to all relevant parties. To illustrate these issues, we focused on three well-characterized monogenic, metabolic disorders included in newborn screens: classic galactosemia, caused by GALT variants; phenylketonuria, caused by PAH variants; and medium-chain acyl-CoA dehydrogenase (MCAD) deficiency, caused by ACADM variants. In 10 years of clinical molecular testing, we have observed 134 unique GALT variants, 46 of which were variants of uncertain significance (VUS). In PAH, we observed 132 variants, including 17 VUS, and for ACADM, we observed 64 unique variants, of which 33 were uncertain. After this review, 17 VUS (37%; 7 in ACADM, 9 in GALT, and 1 in PAH) were reclassified from uncertain (6 to benign or likely benign and 11 to pathogenic or likely pathogenic). We identified common types of missing information that would have helped make a definitive classification and categorized this information by ease and cost to obtain. Genet Med 19(1), 77-82.

  11. Characterization of capsaicin synthase and identification of its gene (csy1) for pungency factor capsaicin in pepper (Capsicum sp.)

    PubMed Central

    Prasad, B. C. Narasimha; Kumar, Vinod; Gururaj, H. B.; Parimalan, R.; Giridhar, P.; Ravishankar, G. A.

    2006-01-01

    Capsaicin is a unique alkaloid of the plant kingdom restricted to the genus Capsicum. Capsaicin is the pungency factor, a bioactive molecule of food and of medicinal importance. Capsaicin is useful as a counterirritant, antiarthritic, analgesic, antioxidant, and anticancer agent. Capsaicin biosynthesis involves condensation of vanillylamine and 8-methyl nonenoic acid, brought about by capsaicin synthase (CS). We found that CS activity correlated with genotype-specific capsaicin levels. We purified and characterized CS (≈35 kDa). Immunolocalization studies confirmed that CS is specifically localized to the placental tissues of Capsicum fruits. Western blot analysis revealed concomitant enhancement of CS levels and capsaicin accumulation during fruit development. We determined the N-terminal amino acid sequence of purified CS, cloned the CS gene (csy1) and sequenced full-length cDNA (981 bp). The deduced amino acid sequence of CS from full-length cDNA was 38 kDa. Functionality of csy1 through heterologous expression in recombinant Escherichia coli was also demonstrated. Here we report the gene responsible for capsaicin biosynthesis, which is unique to Capsicum spp. With this information on the CS gene, speculation on the gene for pungency is unequivocally resolved. Our findings have implications in the regulation of capsaicin levels in Capsicum genotypes. PMID:16938870

  12. The Glaciozyma antarctica genome reveals an array of systems that provide sustained responses towards temperature variations in a persistently cold habitat

    PubMed Central

    Hashim, Noor Haza Fazlin; Bharudin, Izwan; Abu Bakar, Mohd Faizal; Huang, Kie Kyon; Alias, Halimah; Lee, Bernard K. B.; Mat Isa, Mohd Noor; Mat-Sharani, Shuhaila; Sulaiman, Suhaila; Tay, Lih Jinq; Zolkefli, Radziah; Muhammad Noor, Yusuf; Law, Douglas Sie Nguong; Abdul Rahman, Siti Hamidah; Md-Illias, Rosli; Abu Bakar, Farah Diba; Najimudin, Nazalan; Abdul Murad, Abdul Munir; Mahadi, Nor Muhammad

    2018-01-01

    Extremely low temperatures present various challenges to life that include ice formation and effects on metabolic capacity. Psychrophilic microorganisms typically have an array of mechanisms to enable survival in cold temperatures. In this study, we sequenced and analysed the genome of a psychrophilic yeast isolated in the Antarctic region, Glaciozyma antarctica. The genome annotation identified 7857 protein coding sequences. From the genome sequence analysis we were able to identify genes encoding proteins known to be associated with cold survival, in addition to annotating genes that are unique to G. antarctica. For genes that are known to be involved in cold adaptation such as anti-freeze proteins (AFPs), our gene expression analysis revealed that they were differentially transcribed over time and in response to different temperatures. This indicated the presence of an array of adaptation systems that can respond to a changing but persistent cold environment. We were also able to validate the activity of all the AFPs annotated, where the recombinant AFPs demonstrated anti-freeze capacity. This work is an important foundation for further collective exploration into psychrophilic microbiology where, among other potential, the genes unique to this species may represent a pool of novel mechanisms for cold survival. PMID:29385175

  13. Identification of Two New HIV-1 Circulating Recombinant Forms (CRF87_cpx and CRF88_BC) from Reported Unique Recombinant Forms in Asia.

    PubMed

    Hu, Yihong; Wan, Zhenzhou; Zhou, Yan-Heng; Smith, Davey; Zheng, Yong-Tang; Zhang, Chiyu

    2017-04-01

    The on-going generation of HIV-1 intersubtype recombination has led to new circulating recombinant forms (CRFs) and unique recombinant forms (URFs) in Asia. In this study, we evaluated whether previously reported URFs were actually CRFs. All available complete or near full-length HIV-1 URF sequences from Asia were retrieved from the HIV Los Alamos National Laboratory Sequence database, and phylogenetic, transmission cluster, and bootscan analyses were performed using MEGA 6.0, Cluster Picker 1.2.1, and SimPlot 3.5.1. According to the criterion of new CRFs, two new HIV-1 CRFs (CRF87_cpx and CRF88_BC) were identified from these available URFs. CRF87_cpx comprised HIV-1 subtypes B, C, and CRF01_AE, and CRF88_BC comprised subtypes B and C. HIV Blast and bootscan analysis revealed that besides the three representative strains, there were two additional CRF87_cpx strains. Furthermore, we defined seven dominant URFs (dURF01-dURF07), each of which contained two strains sharing the same recombination map and can be used as sequence references to facilitate the finding of new potential CRFs in the future. These results will benefit the molecular epidemiological investigation of HIV-1 in Asia.

  14. mtDNA control-region sequence variation suggests multiple independent origins of an "Asian-specific" 9-bp deletion in sub-Saharan Africans.

    PubMed Central

    Soodyall, H.; Vigilant, L.; Hill, A. V.; Stoneking, M.; Jenkins, T.

    1996-01-01

    The intergenic COII/tRNA(Lys) 9-bp deletion in human mtDNA, which is found at varying frequencies in Asia, Southeast Asia, Polynesia, and the New World, was also found in 81 of 919 sub-Saharan Africans. Using mtDNA control-region sequence data from a subset of 41 individuals with the deletion, we identified 22 unique mtDNA types associated with the deletion in Africa. A comparison of the unique mtDNA types from sub-Saharan Africans and Asians with the 9-bp deletion revealed that sub-Saharan Africans and Asians have sequence profiles that differ in the locations and frequencies of variant sites. Both phylogenetic and mismatch-distribution analyses suggest that the 9-bp deletion arose independently in sub-Saharan Africa and Asia and that the deletion has arisen more than once in Africa. Within Africa, the deletion was not found among Khoisan peoples and was rare to absent in western and southwestern African populations, but it did occur in Pygmy and Negroid populations from central Africa and in Malawi and southern African Bantu-speakers. The distribution of the 9-bp deletion in Africa suggests that the deletion could have arisen in central Africa and was then introduced to southern Africa via the recent "Bantu expansion." PMID:8644719

  15. Comparative Genomic Analyses of Clavibacter michiganensis subsp. insidiosus and Pathogenicity on Medicago truncatula.

    PubMed

    Lu, You; Ishimaru, Carol A; Glazebrook, Jane; Samac, Deborah A

    2018-02-01

    Clavibacter michiganensis is the most economically important gram-positive bacterial plant pathogen, with subspecies that cause serious diseases of maize, wheat, tomato, potato, and alfalfa. Much less is known about pathogenesis involving gram-positive plant pathogens than is known for gram-negative bacteria. Comparative genome analyses of C. michiganensis subspecies affecting tomato, potato, and maize have provided insights on pathogenicity. In this study, we identified strains of C. michiganensis subsp. insidiosus with contrasting pathogenicity on three accessions of the model legume Medicago truncatula. We generated complete genome sequences for two strains and compared these to a previously sequenced strain and genome sequences of four other subspecies. The three C. michiganensis subsp. insidiosus strains varied in gene content due to genome rearrangements, most likely facilitated by insertion elements, and plasmid number, which varied from one to three depending on strain. The core C. michiganensis genome consisted of 1,917 genes, with 379 genes unique to C. michiganensis subsp. insidiosus. An operon for synthesis of the extracellular blue pigment indigoidine, enzymes for pectin degradation, and an operon for inositol metabolism are among the unique features. Secreted serine proteases belonging to both the pat-1 and ppa families were present but highly diverged from those in other subspecies.

  16. Smoking, pregnancy and the subgingival microbiome

    PubMed Central

    Paropkari, Akshay D.; Leblebicioglu, Binnaz; Christian, Lisa M.; Kumar, Purnima S.

    2016-01-01

    The periodontal microbiome is known to be altered during pregnancy as well as by smoking. However, despite the fact that 2.1 million women in the United States smoke during their pregnancy, the potentially synergistic effects of smoking and pregnancy on the subgingival microbiome have never been studied. Subgingival plaque was collected from 44 systemically and periodontally healthy non-pregnant nonsmokers (control), non-pregnant smokers, pregnant nonsmokers and pregnant smokers and sequenced using 16S-pyrotag sequencing. A total of 331,601 classifiable sequences were compared against HOMD. Community ordination methods and co-occurrence networks were used along with non-parametric tests to identify differences between groups. Linear Discriminant Analysis revealed significant clustering based on pregnancy and smoking status. Alpha diversity was similar between groups; however, pregnant women (smokers and nonsmokers) demonstrated higher levels of gram-positive and gram-negative facultatives, and lower levels of gram-negative anaerobes when compared to smokers. Each environmental perturbation induced distinctive co-occurrence patterns between species, with unique network anchors in each group. Our study thus suggests that the impact of each environmental perturbation on the periodontal microbiome is unique, and that when they are superimposed, the sum is greater than its parts. The persistence of these effects following cessation of the environmental disruption warrants further investigation. PMID:27461975

  17. Revealing the transcriptomic complexity of switchgrass by PacBio long-read sequencing.

    PubMed

    Zuo, Chunman; Blow, Matthew; Sreedasyam, Avinash; Kuo, Rita C; Ramamoorthy, Govindarajan Kunde; Torres-Jerez, Ivone; Li, Guifen; Wang, Mei; Dilworth, David; Barry, Kerrie; Udvardi, Michael; Schmutz, Jeremy; Tang, Yuhong; Xu, Ying

    2018-01-01

    Switchgrass (Panicum virgatum L.) is an important bioenergy crop widely used for lignocellulosic research. While extensive transcriptomic analyses have been conducted on this species using short read-based sequencing techniques, very little has been reliably derived regarding alternatively spliced (AS) transcripts. We present an analysis of transcriptomes of six switchgrass tissue types pooled together, sequenced using Pacific Biosciences (PacBio) single-molecular long-read technology. Our analysis identified 105,419 unique transcripts covering 43,570 known genes and 8795 previously unknown genes. Of these, 45,168 are novel transcripts of known genes. A total of 60,096 AS transcripts are identified, 45,628 being novel. We have also predicted 1549 transcripts of genes involved in cell wall construction and remodeling, 639 being novel transcripts of known cell wall genes. Most of the predicted transcripts are validated against Illumina-based short reads. Specifically, 96% of the splice junction sites in all the unique transcripts are validated by at least five Illumina reads. Comparisons between genes derived from our identified transcripts and the current genome annotation revealed that among the gene set predicted by both analyses, 16,640 have different exon-intron structures. Overall, a substantial amount of new information is derived from the PacBio RNA data regarding both the transcriptome and the genome of switchgrass.

  18. Novel groups and unique distribution of phage phoH genes in paddy waters in northeast China

    PubMed Central

    Wang, Xinzhen; Liu, Junjie; Yu, Zhenhua; Jin, Jian; Liu, Xiaobing; Wang, Guanghua

    2016-01-01

    Although bacteriophages are ubiquitous in various environments, their genetic diversity is primarily investigated in pelagic marine environments. Corresponding studies in terrestrial environments are few. In this study, we conducted the first survey of phage diversity in the paddy ecosystem by targeting a new viral biomarker gene, phoH. A total of 424 phoH sequences were obtained from four paddy waters generated from a pot experiment with different soils collected from open paddy fields in northeast China. The majority of phoH sequences in paddy waters were novel, with the highest identity of ≤70% with known phoH sequences. Four unique groups (Group α, Group β, Group γ and Group δ) and seven new subgroups (Group 2b, Group 3d, Group 3e, Group 6a, Group 6b, Group 6c and Group 6d) were formed exclusively with the clones from the paddy waters, suggesting novel phage phoH groups exist in the paddy ecosystem. Additionally, the distribution proportions of phoH clones in different groups varied among paddy water samples, suggesting the phage community in paddy fields is biogeographically distributed. Furthermore, non-metric multidimensional scaling analysis indicated that phage phoH assemblages in paddy waters were distinct from those in marine waters. PMID:27910929

  19. Transcriptomic analysis of Ruditapes philippinarum hemocytes reveals cytoskeleton disruption after in vitro Vibrio tapetis challenge.

    PubMed

    Brulle, Franck; Jeffroy, Fanny; Madec, Stéphanie; Nicolas, Jean-Louis; Paillard, Christine

    2012-10-01

    The Manila clam, Ruditapes philippinarum, is an economically-important, commercial shellfish; harvests are diminished in some European waters by a pathogenic bacterium, Vibrio tapetis, that causes Brown Ring disease. To identify molecular characteristics associated with susceptibility or resistance to Brown Ring disease, Suppression Subtractive Hybridization (SSH) analyses were performed to construct cDNA libraries enriched in up- or down-regulated transcripts from clam immune cells, hemocytes, after a 3-h in vitro challenge with cultured V. tapetis. Nine hundred and ninety eight sequences from the two libraries were sequenced, and an in silico analysis identified 235 unique genes. BLAST and "Gene ontology" classification analyses revealed that 60.4% of the Expressed Sequence Tags (ESTs) have high similarities with genes involved in various physiological functions, such as immunity, apoptosis and cytoskeleton organization; whereas, 39.6% remain unidentified. From the 235 unique genes, we selected 22 candidates based upon physiological function and redundancy in the libraries. Then, Real-Time PCR analysis identified 3 genes related to cytoskeleton organization showing significant variation in expression attributable to V. tapetis exposure. Disruption in regulation of these genes is consistent with the etiologic agent of Brown Ring disease in Manila clams. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Bacterial and archaeal diversity in two hot spring microbial mats from the geothermal region of Tengchong, China.

    PubMed

    Pagaling, Eulyn; Grant, William D; Cowan, Don A; Jones, Brian E; Ma, Yanhe; Ventosa, Antonio; Heaphy, Shaun

    2012-07-01

    We investigated the bacterial and archaeal diversity in two hot spring microbial mats from the geothermal region of Tengchong in the Yunnan Province, China, using direct molecular analyses. The Langpu (LP) laminated mat was found by the side of a boiling pool with temperature of 60-65 °C and a pH of 8.5, while the Tengchong (TC) streamer mat consisted of white streamers in a slightly acidic (pH 6.5) hot pool outflow with a temperature of 72 °C. Four 16S rRNA gene clone libraries were constructed and restriction enzyme analysis of the inserts was used to identify unique sequences and clone frequencies. From almost 200 clones screened, 55 unique sequences were retrieved. Phylogenetic analysis showed that the LP mat consisted of a diverse bacterial population [Cyanobacteria, Chloroflexi, Chlorobia, Nitrospirae, 'Deinococcus-Thermus', Proteobacteria (alpha, beta and delta subdivisions), Firmicutes, Bacteroidetes and Actinobacteria], while the archaeal population was dominated by methanogenic Euryarchaeota and Crenarchaeota. In contrast, the TC streamer mat consisted of a bacterial population dominated by Aquificae, while the archaeal population also contained Korarchaeota as well as Crenarchaeota and methanogenic Euryarchaeota. These mats harboured clone sequences affiliated to unidentified lineages, suggesting that they are a potential source for discovering novel bacteria and archaea.

  1. Identification of Actinomyces meyeri actinomycosis in middle ear and mastoid by 16S rRNA analysis.

    PubMed

    Kakuta, Risako; Hidaka, Hiroshi; Yano, Hisakazu; Miyazaki, Hiromitsu; Suzaki, Hiroshi; Nakamura, Yasuhiro; Kanamori, Hajime; Endo, Shiro; Hirakata, Yoichi; Kaku, Mitsuo; Kobayashi, Toshimitsu

    2013-08-01

    Actinomycosis of the middle ear and mastoid is extremely rare. Here, we report a unique case of actinomycosis of the middle ear and mastoid caused by Actinomyces meyeri diagnosed by 16S rRNA gene sequence analysis.

  2. Demonstration: Genetic Jewelry

    ERIC Educational Resources Information Center

    Atkins, Thomas; Roderick, Joyce

    2006-01-01

    In order for students to understand genetics and evolution, they must first understand the structure of the DNA molecule. The function of DNA proceeds from its unique structure, a structure beautifully adapted for information storage, transcription, translation into amino acid sequences, replication, and time travel. The activity described in this…

  3. Isotachophoresis for fractionation and recovery of cytoplasmic RNA and nucleus from single cells.

    PubMed

    Kuriyama, Kentaro; Shintaku, Hirofumi; Santiago, Juan G

    2015-07-01

    There is a substantial need for simultaneous analyses of RNA and DNA from individual single cells. Such analysis provides unique evidence of cell-to-cell differences and the correlation between gene expression and genomic mutation in highly heterogeneous cell populations. We present a novel microfluidic system that leverages isotachophoresis to fractionate and isolate cytoplasmic RNA and genomic DNA (gDNA) from single cells. The system uniquely enables independent, sequence-specific analyses of these critical markers. Our system uses a microfluidic chip with a simple geometry and four end-channel electrodes, and completes the entire process in <5 min, including lysis, purification, fractionation, and delivery to DNA and RNA output reservoirs, each containing high quality and purity aliquots with no measurable cross-contamination of cytoplasmic RNA versus gDNA. We demonstrate our system with simultaneous, sequence-specific quantitation using off-chip RT-qPCR and qPCR for simultaneous cytoplasmic RNA and gDNA analyses, respectively. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. The genome of Mesobuthus martensii reveals a unique adaptation model of arthropods

    PubMed Central

    Cao, Zhijian; Yu, Yao; Wu, Yingliang; Hao, Pei; Di, Zhiyong; He, Yawen; Chen, Zongyun; Yang, Weishan; Shen, Zhiyong; He, Xiaohua; Sheng, Jia; Xu, Xiaobo; Pan, Bohu; Feng, Jing; Yang, Xiaojuan; Hong, Wei; Zhao, Wenjuan; Li, Zhongjie; Huang, Kai; Li, Tian; Kong, Yimeng; Liu, Hui; Jiang, Dahe; Zhang, Binyan; Hu, Jun; Hu, Youtian; Wang, Bin; Dai, Jianliang; Yuan, Bifeng; Feng, Yuqi; Huang, Wei; Xing, Xiaojing; Zhao, Guoping; Li, Xuan; Li, Yixue; Li, Wenxin

    2013-01-01

    Representing a basal branch of arachnids, scorpions are known as ‘living fossils’ that maintain an ancient anatomy and are adapted to have survived extreme climate changes. Here we report the genome sequence of Mesobuthus martensii, containing 32,016 protein-coding genes, the most among sequenced arthropods. Although M. martensii appears to evolve conservatively, it has a greater gene family turnover than the insects that have undergone diverse morphological and physiological changes, suggesting the decoupling of the molecular and morphological evolution in scorpions. Underlying the long-term adaptation of scorpions is the expansion of the gene families enriched in basic metabolic pathways, signalling pathways, neurotoxins and cytochrome P450, and the different dynamics of expansion between the shared and the scorpion lineage-specific gene families. Genomic and transcriptomic analyses further illustrate the important genetic features associated with prey, nocturnal behaviour, feeding and detoxification. The M. martensii genome reveals a unique adaptation model of arthropods, offering new insights into the genetic bases of the living fossils. PMID:24129506

  5. Genetic Characterization of Feline Leukemia Virus from Florida Panthers

    PubMed Central

    Brown, Meredith A.; Cunningham, Mark W.; Roca, Alfred L.; Troyer, Jennifer L.; Johnson, Warren E.

    2008-01-01

    From 2002 through 2005, an outbreak of feline leukemia virus (FeLV) occurred in Florida panthers (Puma concolor coryi). Clinical signs included lymphadenopathy, anemia, septicemia, and weight loss; 5 panthers died. Not associated with FeLV outcome were the genetic heritage of the panthers (pure Florida vs. Texas/Florida crosses) and co-infection with feline immunodeficiency virus. Genetic analysis of panther FeLV, designated FeLV-Pco, determined that the outbreak likely came from 1 cross-species transmission from a domestic cat. The FeLV-Pco virus was closely related to the domestic cat exogenous FeLV-A subgroup in lacking recombinant segments derived from endogenous FeLV. FeLV-Pco sequences were most similar to the well-characterized FeLV-945 strain, which is highly virulent and strongly pathogenic in domestic cats because of unique long terminal repeat and envelope sequences. These unique features may also account for the severity of the outbreak after cross-species transmission to the panther. PMID:18258118

  6. Diversity of Penicillium section Citrina within the fynbos biome of South Africa, including a new species from a Protea repens infructescence.

    PubMed

    Visagie, Cobus M; Seifert, Keith A; Houbraken, Jos; Samson, Robert A; Jacobs, Karin

    2014-01-01

    During a survey of the fynbos biome in the Western Cape of South Africa, 61 Penicillium species were isolated and nine belong to Penicillium section Citrina. Based on morphology and multigene phylogenies, section Citrina species were identified as P. cairnsense, P. citrinum, P. pancosmium, P. pasqualense, P. sanguifluum, P. sizovae, P. sumatrense and P. ubiquetum. One of the species displayed unique phenotypic characters and DNA sequences and is described here as P. sucrivorum. Multigene phylogenies consistently resolved the new species in a clade with P. aurantiacobrunneum, P. cairnsense, P. miczynskii, P. neomiczynskii and P. quebecense. However, ITS, β-tubulin and calmodulin gene sequences are unique for P. sucrivorum, and growth rates on various media, the ability to grow at 30 °C, a positive Ehrlich reaction and the absence of sclerotia on all media examined distinguish P. sucrivorum from all of its close relatives. © 2014 by The Mycological Society of America.

  7. Electric fields yield chaos in microflows

    PubMed Central

    Posner, Jonathan D.; Pérez, Carlos L.; Santiago, Juan G.

    2012-01-01

    We present an investigation of chaotic dynamics of a low Reynolds number electrokinetic flow. Electrokinetic flows arise due to couplings of electric fields and electric double layers. In these flows, applied (steady) electric fields can couple with ionic conductivity gradients outside electric double layers to produce flow instabilities. The threshold of these instabilities is controlled by an electric Rayleigh number, Ra_e. As Ra_e increases monotonically, we show here flow dynamics can transition from steady state to a time-dependent periodic state and then to an aperiodic, chaotic state. Interestingly, further monotonic increase of Ra_e shows a transition back to a well-ordered state, followed by a second transition to a chaotic state. Temporal power spectra and time-delay phase maps of low dimensional attractors graphically depict the sequence between periodic and chaotic states. To our knowledge, this is a unique report of a low Reynolds number flow with such a sequence of periodic-to-aperiodic transitions. Also unique is a report of strange attractors triggered and sustained through electric fluid body forces. PMID:22908251

  8. The MUSES Satellite Team and Multidisciplinary System Engineering

    NASA Technical Reports Server (NTRS)

    Chen, John C.; Paiz, Alfred R.; Young, Donald L.

    1997-01-01

    In a unique partnership between three minority-serving institutions and NASA's Jet Propulsion Laboratory, a new course sequence, including a multidisciplinary capstone design experience, is to be developed and implemented at each of the schools with the ambitious goal of designing, constructing and launching a low-orbit Earth-resources satellite. The three universities involved are North Carolina A&T State University (NCA&T), University of Texas, El Paso (UTEP), and California State University, Los Angeles (CSULA). The schools form a consortium collectively known as MUSES - Minority Universities System Engineering and Satellite. Four aspects of this project make it unique: (1) Including all engineering disciplines in the capstone design course, (2) designing, building and launching an Earth-resources satellite, (3) sustaining the partnership between the three schools to achieve this goal, and (4) implementing systems engineering pedagogy at each of the three schools. This paper will describe the partnership and its goals, the first design of the satellite, the courses developed at NCA&T, and the implementation plan for the course sequence.

  9. Authentication Markers for Five Major Panax Species Developed via Comparative Analysis of Complete Chloroplast Genome Sequences.

    PubMed

    Nguyen, Van Binh; Park, Hyun-Seung; Lee, Sang-Choon; Lee, Junki; Park, Jee Young; Yang, Tae-Jin

    2017-08-02

    Ginseng represents a set of high-value medicinal plants of different species: Panax ginseng (Asian ginseng), Panax quinquefolius (American ginseng), Panax notoginseng (Chinese ginseng), Panax japonicus (Bamboo ginseng), and Panax vietnamensis (Vietnamese ginseng). Each species is pharmacologically and economically important, with differences in efficacy and price. Accordingly, an authentication system is needed to combat economically motivated adulteration of Panax products. We conducted comparative analysis of the chloroplast genome sequences of these five species, identifying 34-124 InDels and 141-560 SNPs. Fourteen InDel markers were developed to authenticate the Panax species. Among these, eight were species-unique markers that successfully differentiated one species from the others. We generated at least one species-unique marker for each of the five species, and any of the species can be authenticated by selection among these markers. The markers are reliable, easily detectable, and valuable for applications in the ginseng industry as well as in related research.

  10. Sequence, Structural Analysis and Metrics to Define the Unique Dynamic Features of the Flap Regions Among Aspartic Proteases.

    PubMed

    McGillewie, Lara; Ramesh, Muthusamy; Soliman, Mahmoud E

    2017-10-01

    Aspartic proteases are a class of hydrolytic enzymes that have been implicated in a number of diseases such as HIV, malaria, cancer and Alzheimer's. The flap region is a characteristic and unique structural feature of these enzymes and has been found to have a profound impact on overall protein structure, function and dynamics. Flap dynamics also plays a crucial role in drug binding and drug resistance. Therefore, understanding the structure and dynamic behavior of this flap region is crucial in the design of potent and selective inhibitors against aspartic proteases. Defining metrics that can describe the flap motion/dynamics has been a challenging topic in the literature. This review is the first attempt to compile comprehensive information on sequence, structure, motion and metrics used to assess the dynamics of the flap region of different aspartic proteases in "one pot". We believe that this review will be of critical importance to researchers from different scientific domains.

  11. A chitin deacetylase of Podospora anserina has two functional chitin binding domains and a unique mode of action.

    PubMed

    Hoßbach, Janina; Bußwinkel, Franziska; Kranz, Andreas; Wattjes, Jasper; Cord-Landwehr, Stefan; Moerschbacher, Bruno M

    2018-03-01

    Chitosan is a structurally diverse biopolymer that is commercially derived from chitin by chemical processing, but chitin deacetylases (CDAs) potentially offer a sustainable and more controllable approach allowing the production of chitosans with tailored structures and biological activities. We investigated the CDA from Podospora anserina (PaCDA) which is closely related to Colletotrichum lindemuthianum CDA in the catalytic domain, but unique in having two chitin-binding domains. We produced recombinant PaCDA in Hansenula polymorpha for biochemical characterization and found that the catalytic domain of PaCDA is also functionally similar to C. lindemuthianum CDA, though differing in detail. When studying the enzyme's mode of action on chitin oligomers by quantitative mass-spectrometric sequencing, we found almost all possible sequences up to full deacetylation but with a clear preference for specific products. Deletion muteins lacking one or both CBDs confirmed their proposed function in supporting the enzymatic conversion of the insoluble substrate colloidal chitin. Copyright © 2017. Published by Elsevier Ltd.

  12. Human embryonic stem cell phosphoproteome revealed by electron transfer dissociation tandem mass spectrometry

    PubMed Central

    Swaney, Danielle L.; Wenger, Craig D.; Thomson, James A.; Coon, Joshua J.

    2009-01-01

    Protein phosphorylation is central to the understanding of cellular signaling, and cellular signaling is suggested to play a major role in the regulation of human embryonic stem (ES) cell pluripotency. Here, we describe the use of conventional tandem mass spectrometry-based sequencing technology—collision-activated dissociation (CAD)—and the more recently developed method electron transfer dissociation (ETD) to characterize the human ES cell phosphoproteome. In total, these experiments resulted in the identification of 11,995 unique phosphopeptides, corresponding to 10,844 nonredundant phosphorylation sites, at a 1% false discovery rate (FDR). Among these phosphorylation sites are 5 localized to 2 pluripotency critical transcription factors—OCT4 and SOX2. From these experiments, we conclude that ETD identifies a larger number of unique phosphopeptides than CAD (8,087 to 3,868), more frequently localizes the phosphorylation site to a specific residue (49.8% compared with 29.6%), and sequences whole classes of phosphopeptides previously unobserved. PMID:19144917

  13. Genetic characterization of feline leukemia virus from Florida panthers.

    PubMed

    Brown, Meredith A; Cunningham, Mark W; Roca, Alfred L; Troyer, Jennifer L; Johnson, Warren E; O'Brien, Stephen J

    2008-02-01

    From 2002 through 2005, an outbreak of feline leukemia virus (FeLV) occurred in Florida panthers (Puma concolor coryi). Clinical signs included lymphadenopathy, anemia, septicemia, and weight loss; 5 panthers died. Not associated with FeLV outcome were the genetic heritage of the panthers (pure Florida vs. Texas/Florida crosses) and co-infection with feline immunodeficiency virus. Genetic analysis of panther FeLV, designated FeLV-Pco, determined that the outbreak likely came from 1 cross-species transmission from a domestic cat. The FeLV-Pco virus was closely related to the domestic cat exogenous FeLV-A subgroup in lacking recombinant segments derived from endogenous FeLV. FeLV-Pco sequences were most similar to the well-characterized FeLV-945 strain, which is highly virulent and strongly pathogenic in domestic cats because of unique long terminal repeat and envelope sequences. These unique features may also account for the severity of the outbreak after cross-species transmission to the panther.

  14. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project.

    PubMed

    Yebra, Gonzalo; Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R Bridget; Waters, Laura; Tong, C Y William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J

    2018-01-01

    The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
    The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters.

  15. Construction of an Ostrea edulis database from genomic and expressed sequence tags (ESTs) obtained from Bonamia ostreae infected haemocytes: Development of an immune-enriched oligo-microarray.

    PubMed

    Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino

    2016-12-01

    The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economic and ecological importance of the flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in response to B. ostreae through massive sequencing and has helped to improve our knowledge of the immune mechanisms of flat oyster. The validated oligo-microarray and the establishment of a reference transcriptome will be useful for large-scale gene expression studies in this species. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Universal sequence map (USM) of arbitrary discrete sequences

    PubMed Central

    2002-01-01

    Background: For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would consist of being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results: We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions: USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules. PMID:11895567
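
    The iterated-function idea behind CGR, on which the abstract says USM builds, can be sketched roughly as follows. This is the classic 4-symbol CGR in the unit square, not the authors' USM implementation; the corner assignment and the function names are assumptions made for illustration.

```python
# Minimal Chaos Game Representation (CGR) sketch for 4-symbol sequences.
# The corner assignment is an illustrative assumption, not taken from the abstract.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_coordinates(sequence, start=(0.5, 0.5)):
    """Return one (x, y) point per symbol; each step moves halfway to the symbol's corner."""
    x, y = start
    points = []
    for symbol in sequence:
        cx, cy = CORNERS[symbol]
        x = (x + cx) / 2.0
        y = (y + cy) / 2.0
        points.append((x, y))
    return points


def decode_last_symbols(point, n):
    """Recover the last n symbols from a single CGR point by undoing the halving steps."""
    x, y = point
    symbols = []
    for _ in range(n):
        # The half of the square the point lies in identifies the most recent corner.
        cx = 1.0 if x >= 0.5 else 0.0
        cy = 1.0 if y >= 0.5 else 0.0
        symbols.append(next(s for s, c in CORNERS.items() if c == (cx, cy)))
        x, y = 2.0 * x - cx, 2.0 * y - cy
    return "".join(reversed(symbols))
```

    Because each step halves the distance to a corner, two sequences that end in the same run of symbols land close together, which is the sense in which a distance on the map can stand in for similarity.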

  17. Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

    PubMed

    Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

    2012-08-01

    Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature (http://hivdb.Stanford.edu). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
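
    The threshold-based flagging described above might look roughly like the sketch below. SQUAT itself runs in R; this Python fragment is only an illustration, and the metric names, the table layout and the `flag_sequence` function are hypothetical. Only a few of the limits quoted in the abstract are included.

```python
# Hypothetical threshold table covering a few of the limits quoted in the abstract.
# Metric names and structure are illustrative assumptions, not SQUAT's actual interface.
DEFAULT_THRESHOLDS = {
    "stop_codons": 0,                        # flag if > 0
    "deletions": 1,                          # flag if > 1
    "three_base_insertions": 1,              # flag if > 1
    "ambiguous_bases": {"PR": 6, "RT": 18},  # flag if > 6 (PR) or > 18 (RT)
}


def flag_sequence(counts, gene, thresholds=DEFAULT_THRESHOLDS):
    """Return the sorted names of metrics whose count exceeds its threshold."""
    flags = []
    for name, limit in thresholds.items():
        if isinstance(limit, dict):
            limit = limit[gene]  # gene-specific limit (PR or RT)
        if counts.get(name, 0) > limit:
            flags.append(name)
    return sorted(flags)
```

    Passing a modified copy of the table as `thresholds` mirrors the abstract's note that thresholds are user modifiable.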

  18. Putting engineering back into protein engineering: bioinformatic approaches to catalyst design.

    PubMed

    Gustafsson, Claes; Govindarajan, Sridhar; Minshull, Jeremy

    2003-08-01

    Complex multivariate engineering problems are commonplace and not unique to protein engineering. Mathematical and data-mining tools developed in other fields of engineering have now been applied to analyze sequence-activity relationships of peptides and proteins and to assist in the design of proteins and peptides with specified properties. Decreasing costs of DNA sequencing in conjunction with methods to quickly synthesize statistically representative sets of proteins allow modern heuristic statistics to be applied to protein engineering. This provides an alternative approach to expensive assays or unreliable high-throughput surrogate screens.

  19. Accelerated Gene Evolution and Subfunctionalization in the Pseudotetraploid Frog Xenopus Laevis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hellsten, Uffe; Khokha, Mustafa K.; Grammar, Timothy C.

    2007-03-01

    Ancient whole genome duplications have been implicated in the vertebrate and teleost radiations, and in the emergence of diverse angiosperm lineages, but the evolutionary response to such a perturbation is still poorly understood. The African clawed frog Xenopus laevis experienced a relatively recent tetraploidization approximately 40 million years ago. Analysis of the considerable amount of EST sequence available for this species together with the genome sequence of the related diploid Xenopus tropicalis provides a unique opportunity to study the genomic response to whole genome duplication.

  20. Method and apparatus for determining the coordinates of an object

    DOEpatents

    Pedersen, Paul S.

    2002-01-01

    A simplified method and related apparatus are described for determining the location of points on the surface of an object by varying, in accordance with a unique sequence, the intensity of each illuminated pixel directed to the object surface, and detecting at known detector pixel locations the intensity sequence of reflected illumination from the surface of the object whereby the identity and location of the originating illuminated pixel can be determined. The coordinates of points on the surface of the object are then determined by conventional triangulation methods.
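
    One way to picture the "unique sequence" idea is to give every illuminated pixel its own on/off pattern over a series of frames, so that a detector can recover the originating pixel from the intensities it sees. The sketch below assumes plain binary coding of the pixel index and a fixed detection threshold; both are illustrative assumptions, since the abstract does not specify the coding scheme, and the triangulation step is omitted.

```python
def frames_needed(n_pixels):
    """Smallest number of on/off frames that gives each of n_pixels a distinct code."""
    return max(1, (n_pixels - 1).bit_length())


def pixel_code(index, n_frames):
    """On/off intensity sequence for one pixel: its index in binary, most significant bit first."""
    return [(index >> (n_frames - 1 - k)) & 1 for k in range(n_frames)]


def decode_pixel(observed, threshold=0.5):
    """Recover the pixel index from observed intensities by thresholding each frame."""
    index = 0
    for value in observed:
        index = (index << 1) | (1 if value > threshold else 0)
    return index


# Usage: a 640-pixel projector row, observed with attenuation and an ambient offset.
n_frames = frames_needed(640)
observed = [0.9 * bit + 0.05 for bit in pixel_code(421, n_frames)]
recovered = decode_pixel(observed)
```

    Once the originating pixel is identified for a detector pixel, the two known ray directions can be intersected by conventional triangulation, as the abstract states.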
