Science.gov

Sample records for expression sequence tags

  1. Expressed sequence tags analysis of Blattella germanica.

    PubMed

    Chung, Hyang Suk; Yu, Tai Hyun; Kim, Bong Jin; Kim, Sun Mi; Kim, Joo Yeong; Yu, Hak Sun; Jeong, Hae Jin; Ock, Mee Sun

    2005-12-01

    Four hundred and sixty-five randomly selected clones from a cDNA library of Blattella germanica were partially sequenced and searched using BLAST as a means of analyzing the transcribed sequences of its genome. A total of 363 expressed sequence tags (ESTs) were generated from 465 clones after editing and trimming the vector and ambiguous sequences. About 42% (154/363) of these clones showed significant homology with other database-registered genes. These new B. germanica genes constituted a broad range of transcripts distributed among ribosomal proteins, energy metabolism, allergens, proteases, protease inhibitors, enzymes, translation, cell signaling pathways, and proteins of unknown function. Eighty clones were not well-matched by database searches, and these represent new B. germanica-specific ESTs. Some genes which drew our attention are discussed. The information obtained increases our understanding of the B. germanica genome.

  2. Expressed sequence tags analysis of Blattella germanica

    PubMed Central

    Chung, Hyang Suk; Yu, Tai Hyun; Kim, Bong Jin; Kim, Sun Mi; Kim, Joo Yeong; Yu, Hak Sun; Jeong, Hae Jin

    2005-01-01

    Four hundred and sixty-five randomly selected clones from a cDNA library of Blattella germanica were partially sequenced and searched using BLAST as a means of analyzing the transcribed sequences of its genome. A total of 363 expressed sequence tags (ESTs) were generated from 465 clones after editing and trimming the vector and ambiguous sequences. About 42% (154/363) of these clones showed significant homology with other database-registered genes. These new B. germanica genes constituted a broad range of transcripts distributed among ribosomal proteins, energy metabolism, allergens, proteases, protease inhibitors, enzymes, translation, cell signaling pathways, and proteins of unknown function. Eighty clones were not well-matched by database searches, and these represent new B. germanica-specific ESTs. Some genes which drew our attention are discussed. The information obtained increases our understanding of the B. germanica genome. PMID:16340304

  3. Obtaining accurate translations from expressed sequence tags.

    PubMed

    Wasmuth, James; Blaxter, Mark

    2009-01-01

    The genomes of an increasing number of species are being investigated through the generation of expressed sequence tags (ESTs). However, ESTs are prone to sequencing errors and typically define incomplete transcripts, making downstream annotation difficult. Annotation would be greatly improved with robust polypeptide translations. Many current solutions for EST translation require a large number of full-length gene sequences for training purposes, a resource that is not available for the majority of EST projects. As part of our ongoing EST programs investigating these "neglected" genomes, we have developed a polypeptide prediction pipeline, prot4EST. It incorporates freely available software to produce final translations that are more accurate than those derived from any single method. We describe how this integrated approach goes a long way to overcoming the deficit in training data.
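
As a point of comparison for the abstract above, the following is a minimal sketch of the simplest single-method baseline: translate an EST in all six reading frames and keep the longest stop-free stretch. It is not prot4EST and does not reproduce its pipeline; the function names are hypothetical, and only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons not in the table (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def longest_orf_translation(est):
    """Return (frame, peptide) for the longest stop-free stretch over six frames.

    Frames are labelled +1..+3 (forward) and -1..-3 (reverse complement).
    """
    est = est.upper()
    best = (0, "")
    for strand, seq in ((1, est), (-1, reverse_complement(est))):
        for offset in range(3):
            protein = translate(seq[offset:])
            for segment in protein.split("*"):
                if len(segment) > len(best[1]):
                    best = (strand * (offset + 1), segment)
    return best
```

For the 12-base toy input ATGGCCAAATAG, forward frame +1 reads MAK followed by a stop, yet the function returns the reverse-strand peptide LFGH because it is one residue longer. A length-only rule is fragile on short, error-prone reads, which is the kind of weakness that combining several methods is meant to address.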

  4. RED: the analysis, management and dissemination of expressed sequence tags.

    PubMed

    Everitt, R; Minnema, S E; Wride, M A; Koster, C S; Hance, J E; Mansergh, F C; Rancourt, D E

    2002-12-01

    The Rancourt EST Database (RED) is a web-based system for the analysis, management, and dissemination of expressed sequence tags (ESTs). RED represents a flexible template DNA sequence database that can be easily manipulated to suit the needs of other laboratories undertaking mid-size sequencing projects.

  5. Next-generation tag sequencing for cancer gene expression profiling.

    PubMed

    Morrissy, A Sorana; Morin, Ryan D; Delaney, Allen; Zeng, Thomas; McDonald, Helen; Jones, Steven; Zhao, Yongjun; Hirst, Martin; Marra, Marco A

    2009-10-01

    We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.
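
The comparison of tag abundance and GC bias described above rests on simple per-tag statistics. The sketch below assumes the tags have already been extracted as plain strings; it counts each distinct tag, scales the count to tags per million, and reports the GC fraction. The function name and the input format are assumptions for illustration, not part of the Tag-seq software.

```python
from collections import Counter


def tag_profile(tags):
    """Return {tag: (count, tags_per_million, gc_fraction)} for a list of tags."""
    counts = Counter(t.upper() for t in tags)
    total = sum(counts.values())
    profile = {}
    for tag, n in counts.items():
        gc = (tag.count("G") + tag.count("C")) / len(tag)
        profile[tag] = (n, n * 1_000_000 / total, gc)
    return profile
```

Scaling to tags per million makes libraries of different sequencing depth comparable, and the GC fraction per tag is the quantity one would plot against abundance to look for composition bias.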

  6. Gene Discovery through Expressed Sequence Tag Sequencing in Trypanosoma cruzi

    PubMed Central

    Verdun, Ramiro E.; Di Paolo, Nelson; Urmenyi, Turan P.; Rondinelli, Edson; Frasch, Alberto C. C.; Sanchez, Daniel O.

    1998-01-01

    Analysis of expressed sequence tags (ESTs) constitutes a useful approach for gene identification that, in the case of human pathogens, might result in the identification of new targets for chemotherapy and vaccine development. As part of the Trypanosoma cruzi genome project, we have partially sequenced the 5′ ends of 1,949 clones to generate ESTs. The clones were randomly selected from a normalized CL Brener epimastigote cDNA library. A total of 14.6% of the clones were homologous to previously identified T. cruzi genes, while 18.4% had significant matches to genes from other organisms in the database. A total of 67% of the ESTs had no matches in the database, and thus, some of them might be T. cruzi-specific genes. Functional groups of those sequences with matches in the database were constructed according to their putative biological functions. The two largest categories were protein synthesis (23.3%) and cell surface molecules (10.8%). The information reported in this paper should be useful for researchers in the field to analyze genes and proteins of their own interest. PMID:9784549
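
The match and no-match percentages reported above come from sorting ESTs by whether a database search returned a significant hit. A minimal sketch of that bookkeeping follows. It assumes search results in the 12-column tab-separated layout produced by BLAST+ with `-outfmt 6` (E-value in the eleventh column) and an E-value cutoff of 1e-5; both the format and the cutoff are assumptions, since the abstract does not state them.

```python
def classify_queries(query_ids, blast_tabular, max_evalue=1e-5):
    """Split queries into those with and without a hit at or below max_evalue."""
    with_hit = set()
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        qseqid, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            with_hit.add(qseqid)
    matched = [q for q in query_ids if q in with_hit]
    unmatched = [q for q in query_ids if q not in with_hit]
    return matched, unmatched


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)
```

Queries that never appear in the table, or appear only with weak hits, land in the unmatched list; that list corresponds to the "no matches in the database" category of the abstract.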

  7. Comparative analyses of potato expressed sequence tag libraries.

    PubMed

    Ronning, Catherine M; Stegalkina, Svetlana S; Ascenzi, Robert A; Bougri, Oleg; Hart, Amy L; Utterbach, Teresa R; Vanaken, Susan E; Riedmuller, Steve B; White, Joseph A; Cho, Jennifer; Pertea, Geo M; Lee, Yuandan; Karamycheva, Svetlana; Sultana, Razvan; Tsai, Jennifer; Quackenbush, John; Griffiths, Helen M; Restrepo, Silvia; Smart, Christine D; Fry, William E; Van Der Hoeven, Rutger; Tanksley, Steve; Zhang, Peifen; Jin, Hailing; Yamamoto, Miki L; Baker, Barbara J; Buell, C Robin

    2003-02-01

    The cultivated potato (Solanum tuberosum) shares similar biology with other members of the Solanaceae, yet has features unique within the family, such as modified stems (stolons) that develop into edible tubers. To better understand potato biology, we have undertaken a survey of the potato transcriptome using expressed sequence tags (ESTs) from diverse tissues. A total of 61,940 ESTs were generated from aerial tissues, below-ground tissues, and tissues challenged with the late-blight pathogen (Phytophthora infestans). Clustering and assembly of these ESTs resulted in a total of 19,892 unique sequences with 8,741 tentative consensus sequences and 11,151 singleton ESTs. We were able to identify a putative function for 43.7% of these sequences. A number of sequences (48) were expressed throughout the libraries sampled, representing constitutively expressed sequences. Other sequences (13,068, 21%) were uniquely expressed and were detected only in a single library. Using hierarchical and k-means clustering of the EST sequences, we were able to correlate changes in gene expression with major physiological events in potato biology. Using pair-wise comparisons of tuber-related tissues, we were able to associate genes with tuber initiation, dormancy, and sprouting. We also were able to identify a number of characterized as well as novel sequences that were unique to the incompatible interaction of late-blight pathogen, thereby providing a foundation for further understanding the mechanism of resistance.
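
The distinction drawn above between sequences expressed throughout the libraries and sequences detected in a single library reduces to set operations once each library is represented as the set of sequence IDs found in it. The sketch below assumes that representation; the function name and the library names in the usage note are hypothetical.

```python
def library_distribution(libraries):
    """Summarise where sequences occur.

    `libraries` maps a library name to the set of sequence IDs detected in it.
    Returns (all_ids, constitutive, specific), where `constitutive` holds IDs
    present in every library and `specific[name]` holds IDs seen only in `name`.
    """
    all_ids = set().union(*libraries.values())
    constitutive = set.intersection(*libraries.values())
    specific = {
        name: {
            s for s in ids
            if all(s not in other for o, other in libraries.items() if o != name)
        }
        for name, ids in libraries.items()
    }
    return all_ids, constitutive, specific
```

With three toy libraries, for example leaf = {s1, s2, s3}, tuber = {s1, s3, s4} and root = {s1, s5}, only s1 is constitutive, while s2, s4 and s5 are each specific to one library and s3 is neither.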

  8. Searching the expressed sequence tag (EST) databases: panning for genes.

    PubMed

    Jongeneel, C V

    2000-02-01

    The genomes of living organisms contain many elements, including genes coding for proteins. The portions of the genes expressed as mature mRNA, collectively known as the transcriptome, represent only a small part of the genome. The expressed sequence tag (EST) databases contain an increasingly large part of the transcriptome of many species. For this reason, these databases are probably the most abundant source of new coding sequences available today. However, the raw data deposited in the EST databases are to a large extent unorganised, unannotated, redundant and of relatively low quality. This paper reviews some of the characteristics of the EST data, and the methods that can be used to find novel protein sequences within them. It also documents a collection of databases, software and web sites that can be useful to biologists interested in mining the EST databases over the Internet, or in establishing a local environment for such analyses.

  9. Analyses of Expressed Sequence Tags from Apple

    PubMed Central

    Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing

    2006-01-01

    The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485
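
Dinucleotide repeats of the kind counted above can be located with a back-referencing regular expression. The sketch below finds perfect dinucleotide repeats only, and the minimum of six repeat units is an assumed threshold; the abstract does not state the criteria the authors used.

```python
import re


def find_dinucleotide_repeats(seq, min_units=6):
    """Return (start, motif, units) for each perfect dinucleotide repeat.

    The motif is reported in the phase in which it is first matched, so an
    AG-type repeat may appear as either "AG" or "GA".
    """
    # A two-base motif followed by at least (min_units - 1) further copies.
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if motif[0] == motif[1]:
            continue  # skip homopolymer runs such as AAAAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(0)) // 2))
    return hits
```

Grouping the returned motifs by type (AG/GA/CT/TC versus CG/GC, and so on) would give the kind of repeat-type breakdown the abstract reports.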

  10. Expressed sequence tag analysis in tef (Eragrostis tef (Zucc) Trotter).

    PubMed

    Yu, Ju-Kyung; Sun, Qi; Rota, Mauricio La; Edwards, Hugh; Tefera, Hailu; Sorrells, Mark E

    2006-04-01

    Tef (Eragrostis tef (Zucc.) Trotter) is the most important cereal crop in Ethiopia; however, there is very little DNA sequence information available for this species. Expressed sequence tags (ESTs) were generated from 4 cDNA libraries: seedling leaf, seedling root, and inflorescence of E. tef and seedling leaf of Eragrostis pilosa, a wild relative of E. tef. Clustering of 3603 sequences produced 530 clusters and 1890 singletons, resulting in 2420 tef unigenes. Approximately 3/4 of tef unigenes matched protein or nucleotide sequences in public databases. Annotation of unigenes associated 68% of the putative tef genes with gene ontology categories. Identification of the translated unigenes for conserved protein domains revealed 389 protein family domains (Pfam), the most frequent of which was protein kinase. A total of 170 ESTs containing simple sequence repeats (EST-SSRs) were identified and 80 EST-SSR markers were developed. In addition, 19 single-nucleotide polymorphism (SNP) and (or) insertion-deletion (indel) and 34 intron fragment length polymorphism (IFLP) markers were developed. The EST database and molecular markers generated in this study will be valuable resources for further tef genetic research.
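
The unigene count above is the number of multi-member clusters plus the number of singletons (530 + 1890 = 2420). The toy sketch below shows that bookkeeping with a deliberately crude rule: two sequences join the same cluster if they share any exact k-mer. This is an illustrative stand-in, not the clustering software used in the study; it ignores reverse complements and sequencing errors, and the value k = 8 is an arbitrary assumption.

```python
from collections import defaultdict


def cluster_by_shared_kmer(seqs, k=8):
    """Group sequences that share at least one exact k-mer (single linkage).

    Returns (clusters, singletons): clusters is a list of index lists with two
    or more members; singletons is a list of indices of unclustered sequences.
    """
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first sequence containing it
    for idx, seq in enumerate(seqs):
        s = seq.upper()
        for pos in range(len(s) - k + 1):
            kmer = s[pos:pos + k]
            if kmer in first_seen:
                parent[find(idx)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = idx

    groups = defaultdict(list)
    for idx in range(len(seqs)):
        groups[find(idx)].append(idx)
    clusters = [g for g in groups.values() if len(g) > 1]
    singletons = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, singletons
```

The number of unigenes is then `len(clusters) + len(singletons)`.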

  11. A blackberry (Rubus L.) expressed sequence tag library for the development of simple sequence repeat markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A blackberry (Rubus L.) expressed sequence tag (EST) library was produced for developing simple sequence repeat (SSR) markers from the tetraploid blackberry cultivar, Merton Thornless, the source of the thornless trait in commercial cultivars. RNA was extracted from young expanding leaves and used f...

  12. Analysis of the dermatophyte Trichophyton rubrum expressed sequence tags

    PubMed Central

    Wang, Lingling; Ma, Li; Leng, Wenchuan; Liu, Tao; Yu, Lu; Yang, Jian; Yang, Li; Zhang, Wenliang; Zhang, Qian; Dong, Jie; Xue, Ying; Zhu, Yafang; Xu, Xingye; Wan, Zhe; Ding, Guohui; Yu, Fudong; Tu, Kang; Li, Yixue; Li, Ruoyu; Shen, Yan; Jin, Qi

    2006-01-01

    Background: Dermatophytes are the primary causative agent of dermatophytoses, a disease that affects billions of individuals worldwide. Trichophyton rubrum is the most common of the superficial fungi. Although T. rubrum is a recognized pathogen for humans, little is known about how its transcriptional pattern is related to development of the fungus and establishment of disease. It is therefore necessary to identify genes whose expression is relevant to growth, metabolism and virulence of T. rubrum. Results: We generated 10 cDNA libraries covering nearly the entire growth phase and used them to isolate 11,085 unique expressed sequence tags (ESTs), including 3,816 contigs and 7,269 singletons. Comparisons with the GenBank non-redundant (NR) protein database revealed putative functions or matched homologs from other organisms for 7,764 (70%) of the ESTs. The remaining 3,321 (30%) of ESTs were only weakly similar or not similar to known sequences, suggesting that these ESTs represent novel genes. Conclusion: The present data provide a comprehensive view of fungal physiological processes including metabolism, sexual and asexual growth cycles, signal transduction and pathogenic mechanisms. PMID:17032460

  13. Expressed sequence tags (ESTs) analysis of Acanthamoeba healyi

    PubMed Central

    Kong, Hyun-Hee; Hwang, Mee-Yeul; Kim, Hyo-Kyung

    2001-01-01

    A total of 435 randomly selected clones from an Acanthamoeba healyi cDNA library were sequenced, and 387 expressed sequence tags (ESTs) were generated. Based on the results of a BLAST search, 130 clones (34.4%) were identified as genes encoding surface proteins, enzymes for DNA, energy production or other metabolism, kinases and phosphatases, proteases, proteins for signal transduction, structural and cytoskeletal proteins, cell cycle related proteins, transcription factors, transcriptional and translational machinery, and transporter proteins. Most of the genes (88.5%) are newly identified in the genus Acanthamoeba. Although 15 clones matched Acanthamoeba genes in the public databases, twelve clones were the actin gene, which was the most frequently expressed gene in this study. These Acanthamoeba ESTs should provide valuable information for studying the organism as a model system for biological investigations such as cytoskeleton or cell movement, signal transduction, and transcriptional and translational regulation. These results should also provide clues to the factors involved in the pathogenesis of human granulomatous amoebic encephalitis or keratitis caused by Acanthamoeba. PMID:11441502

  14. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    NASA Astrophysics Data System (ADS)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. Together, this evidence indicates that U. prolifera has the material and energy basis for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).

  15. Discovering conserved insect microRNAs from expressed sequence tags.

    PubMed

    Jia, Qidong; Lin, Kejian; Liang, Jingdong; Yu, Lun; Li, Fei

    2010-12-01

    MicroRNAs (miRNA) participate in regulating diverse biological pathways by translational repression in animals. They have attracted increasing attention recently. However, little work has been done on the miRNA genes in agriculturally important pests. Because the transcripts of most miRNA genes are the products of type-II RNA polymerase, pri-miRNA has a poly(A) tail and appears in expressed sequence tags (EST). We developed a computational pipeline to identify miRNA genes from insect ESTs. First, 980,697 ESTs from 63 insects were collected and used to search the nr database. The ESTs which did not share significant similarities with any known protein-coding genes were treated as non-coding ESTs. Next, known mature miRNAs were aligned with the non-coding ESTs. The ESTs which contained the sequence of a mature miRNA were treated as candidate ESTs. Finally, putative precursors were extracted flanking the mature miRNA region in candidate ESTs and evaluated by the Triplet-SVM algorithm. As a result, 86 miRNAs from 30 insect species were found based on a strict criterion while 330 miRNAs from 51 species were found based on a loose criterion. Evolution analysis indicated that mir-467, mir-297 and mir-466 were the most highly conserved miRNA families in insects. To confirm the reliability of putative insect miRNAs, the expression profile of nine predicted miRNAs in Locusta migratoria was investigated. Eight miRNAs were successfully detected by RT-PCR. Most miRNAs were expressed ubiquitously in all examined tissues and developmental stages whereas Lmi-mir-509 was specifically expressed in the thorax of the 2nd, 4th and 5th instars and adult locust. In all, our work reported an efficient computational strategy for predicting miRNA genes from insect ESTs and presented tens of miRNAs in diverse insect species which are expected to participate in many important physiological processes.
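
The middle steps of the pipeline above (locating known mature miRNAs in non-coding ESTs and cutting out the flanking region as a putative precursor) can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: it accepts only exact forward-strand matches, the 70-base flank is an arbitrary default, the identifiers are hypothetical, and the Triplet-SVM evaluation step is not included.

```python
def candidate_precursors(ests, mature_mirnas, flank=70):
    """Return (est_id, mirna_id, start, window) for each exact mature-miRNA match.

    `ests` and `mature_mirnas` map identifiers to sequences. RNA input (U) is
    converted to DNA (T) so that miRNA and EST sequences are comparable.
    `window` is the match plus up to `flank` bases on either side.
    """
    hits = []
    for est_id, est in ests.items():
        seq = est.upper().replace("U", "T")
        for mir_id, mature in mature_mirnas.items():
            query = mature.upper().replace("U", "T")
            start = seq.find(query)
            while start != -1:
                lo = max(0, start - flank)
                hi = min(len(seq), start + len(query) + flank)
                hits.append((est_id, mir_id, start, seq[lo:hi]))
                start = seq.find(query, start + 1)
    return hits
```

Each returned window would then be passed to a precursor classifier; ESTs with no hit drop out of the candidate set.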

  16. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    PubMed Central

    de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by genscan (http://genes.mit.edu/GENSCAN.html). PMID:11070084

  17. Peanut (Arachis hypogaea) expressed sequence tag (EST) project: Progress and application.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Millions of expressed sequence tag (EST) sequences from several hundred plant species have been deposited in public EST databases. Many plant ESTs have been sequenced as an alternative to whole genome sequences, including peanut because of the genome size and complexity. The US peanut research commu...

  18. Identification of expressed resistance gene analogs from peanut (Arachis hypogaea L.) expressed sequence tags.

    PubMed

    Liu, Zhanji; Feng, Suping; Pandey, Manish K; Chen, Xiaoping; Culbreath, Albert K; Varshney, Rajeev K; Guo, Baozhu

    2013-05-01

    Low genetic diversity makes peanut (Arachis hypogaea L.) very vulnerable to plant pathogens, causing severe yield loss and reduced seed quality. Several hundred partial genomic DNA sequences as nucleotide-binding-site leucine-rich repeat (NBS-LRR) resistance genes (R) have been identified, but only a small portion with expressed transcripts has been found. We aimed to identify resistance gene analogs (RGAs) from peanut expressed sequence tags (ESTs) and to develop polymorphic markers. The protein sequences of 54 known R genes were used to identify homologs from peanut ESTs from public databases. A total of 1,053 ESTs corresponding to six different classes of known R genes were recovered and assembled into 156 contigs and 229 singletons as peanut-expressed RGAs. There were 69 that encoded for NBS-LRR proteins, 191 that encoded for protein kinases, 82 that encoded for LRR-PK/transmembrane proteins, 28 that encoded for Toxin reductases, 11 that encoded for LRR-domain containing proteins and four that encoded for TM-domain containing proteins. Twenty-eight simple sequence repeats (SSRs) were identified from 25 peanut expressed RGAs. One SSR polymorphic marker (RGA121) was identified. Two polymerase chain reaction-based markers (Ahsw-1 and Ahsw-2) developed from RGA013 were homologous to the Tomato Spotted Wilt Virus (TSWV) resistance gene. All three markers were mapped on the same linkage group AhIV. These expressed RGAs are the source for RGA-tagged marker development and identification of peanut resistance genes.

  19. Analysis of expressed sequence tags (ESTs) from Agrostis species obtained using sequence related amplified polymorphism.

    PubMed

    Dinler, Gizem; Budak, Hikmet

    2008-10-01

    Bentgrass (Agrostis spp.), a genus of the Poaceae family, consists of more than 200 species and is mainly used in athletic fields and golf courses. Creeping bentgrass (A. stolonifera L.) is the most commonly used species in maintaining golf courses, followed by colonial bentgrass (A. capillaris L.) and velvet bentgrass (A. canina L.). The presence and nature of sequence related amplified polymorphism (SRAP) at the cDNA level were investigated. We isolated 80 unique cDNA fragment bands from these species using 56 SRAP primer combinations. Sequence analysis of cDNA clones and analysis of putative translation products revealed that some encoded amino acid sequences were similar to proteins involved in DNA synthesis, transcription, and signal transduction. The cytosolic glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene (GenBank accession no. EB812822) was also identified from velvet bentgrass, and the corresponding protein sequence is further analyzed due to its critical role in many cellular processes. The partial peptide sequence obtained was 112 amino acids long, presenting a high degree of homology to parts of the N-terminal and C-terminal regions of cytosolic phosphorylating GAPDH (GapC). The existence of common expressed sequence tags (ESTs) revealed by a minimum evolutionary dendrogram among the Agrostis ESTs indicated the usefulness of SRAP for comparative genome analysis of transcribed genes in the grass species.

  20. Quantitative gene expression profiles in real time from expressed sequence tag databases.

    PubMed

    Funari, Vincent A; Voevodski, Konstantin; Leyfer, Dimitry; Yerkes, Laura; Cramer, Donald; Tolan, Dean R

    2010-01-01

    An accumulation of expressed sequence tag (EST) data in the public domain and the availability of bioinformatic programs have made EST gene expression profiling a common practice. However, the utility and validity of using EST databases (e.g., dbEST) has been criticized, particularly for quantitative assessment of gene expression. Problems with EST sequencing errors, library construction, EST annotation, and multiple paralogs make generation of specific and sensitive qualitative and quantitative expression profiles a concern. In addition, most EST-derived expression data exists in previously assembled databases. The Virtual Northern Blot (VNB) (http://tlab.bu.edu/vnb.html) allows generation, evaluation, and optimization of expression profiles in real time, which is especially important for alternatively spliced, novel, or poorly characterized genes. Representative gene families with variable nucleotide sequence identity, tissue specificity, and levels of expression (bcl-xl, aldoA, and cyp2d9) are used to assess the quality of VNB's output. The profiles generated by VNB are more sensitive and specific than those constructed with ESTs listed in preindexed databases at UCSC and NCBI. Moreover, quantitative expression profiles produced by VNB are comparable to quantization obtained from Northern blots and qPCR. The VNB pipeline generates real-time gene expression profiles for single-gene queries that are both qualitatively and quantitatively reliable.
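
The general idea behind an EST-derived expression profile is to count the ESTs from each library that match a gene and to normalise by library size. The sketch below shows only that normalisation step; it is not the VNB algorithm, and the function name, input dictionaries and tissue names in the usage note are hypothetical.

```python
def expression_profile(hits_per_library, library_sizes):
    """Convert raw EST hit counts into transcripts per million (TPM) per library.

    `hits_per_library` maps library name -> number of ESTs matching the gene.
    `library_sizes` maps library name -> total number of ESTs in that library.
    Libraries without hits are reported as 0.0.
    """
    profile = {}
    for lib, size in library_sizes.items():
        hits = hits_per_library.get(lib, 0)
        profile[lib] = hits * 1_000_000 / size if size else 0.0
    return profile
```

For instance, 30 matching ESTs in a 20,000-EST liver library and 2 in a 10,000-EST brain library give 1500 and 200 TPM respectively, which is the kind of per-tissue profile that can then be compared with Northern blot or qPCR measurements.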

  1. Quantitative Gene Expression Profiles in Real Time From Expressed Sequence Tag Databases

    PubMed Central

    FUNARI, VINCENT A.; VOEVODSKI, KONSTANTIN; LEYFER, DIMITRY; YERKES, LAURA; CRAMER, DONALD; TOLAN, DEAN R.

    2010-01-01

    An accumulation of expressed sequence tag (EST) data in the public domain and the availability of bioinformatic programs have made EST gene expression profiling a common practice. However, the utility and validity of using EST databases (e.g., dbEST) has been criticized, particularly for quantitative assessment of gene expression. Problems with EST sequencing errors, library construction, EST annotation, and multiple paralogs make generation of specific and sensitive qualitative and quantitative expression profiles a concern. In addition, most EST-derived expression data exists in previously assembled databases. The Virtual Northern Blot (VNB) (http://tlab.bu.edu/vnb.html) allows generation, evaluation, and optimization of expression profiles in real time, which is especially important for alternatively spliced, novel, or poorly characterized genes. Representative gene families with variable nucleotide sequence identity, tissue specificity, and levels of expression (bcl-xl, aldoA, and cyp2d9) are used to assess the quality of VNB’s output. The profiles generated by VNB are more sensitive and specific than those constructed with ESTs listed in preindexed databases at UCSC and NCBI. Moreover, quantitative expression profiles produced by VNB are comparable to quantization obtained from Northern blots and qPCR. The VNB pipeline generates real-time gene expression profiles for single-gene queries that are both qualitatively and quantitatively reliable. PMID:20635574

  2. Gene expression profile of human bone marrow stromal cells: high-throughput expressed sequence tag sequencing analysis.

    PubMed

    Jia, Libin; Young, Marian F; Powell, John; Yang, Liming; Ho, Nicola C; Hotchkiss, Robert; Robey, Pamela Gehron; Francomano, Clair A

    2002-01-01

    Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from a HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. The EST sequencing data analysis showed 60 novel genes found only in this cDNA library after BLAST analysis against 3.0 million ESTs in NCBI's dbEST database. The BLAST search also showed the identified ESTs that have close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.

  3. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags.

    PubMed

    Xu, Y; Mural, R J; Uberbacher, E C

    1997-01-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.
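
    One step described above, using shared ESTs to decide which predicted exons belong to the same gene, can be sketched as a connected-components problem. The code below is a simplified illustration under that assumption, not the authors' algorithm: it ignores exon order, strand, and the biological rules the abstract mentions, and all exon and EST identifiers are hypothetical.

```python
from collections import defaultdict


def group_exons_by_shared_ests(exon_to_ests):
    """Group exon IDs into candidate genes: exons hit by a common EST are merged."""
    parent = {exon: exon for exon in exon_to_ests}

    def find(x):
        # Follow parent links to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_exon_for_est = {}
    for exon, ests in exon_to_ests.items():
        for est in ests:
            if est in first_exon_for_est:
                union(first_exon_for_est[est], exon)
            else:
                first_exon_for_est[est] = exon

    groups = defaultdict(list)
    for exon in exon_to_ests:
        groups[find(exon)].append(exon)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical predicted exons and the ESTs that align to each of them.
exon_hits = {
    "exon1": {"EST_A"},
    "exon2": {"EST_A", "EST_B"},
    "exon3": {"EST_B"},
    "exon4": {"EST_C"},
    "exon5": set(),
}
genes = group_exons_by_shared_ests(exon_hits)
print(genes)
```

    Here exon1 to exon3 end up in one candidate gene because EST_A and EST_B chain them together, while an exon with no EST support (exon5) stays on its own and would need other evidence.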

  4. Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags

    SciTech Connect

    Xu, Y.; Mural, R.; Uberbacher, E.

    1997-02-01

    Computational methods for gene identification in genomic sequences typically have two phases: coding region prediction and gene parsing. While there are many effective methods for predicting coding regions (exons), parsing the predicted exons into proper gene structures, to a large extent, remains an unsolved problem. This paper presents an algorithm for inferring gene structures from predicted exon candidates, based on Expressed Sequence Tags (ESTs) and biological intuition/rules. The algorithm first finds all the related ESTs in the EST database (dbEST) for each predicted exon, and infers the boundaries of one or a series of genes based on the available EST information and biological rules. Then it constructs gene models within each pair of gene boundaries, that are most consistent with the EST information. By exploiting EST information and biological rules, the algorithm can (1) model complicated multiple gene structures, including embedded genes, (2) identify falsely-predicted exons and locate missed exons, and (3) make more accurate exon boundary predictions. The algorithm has been implemented and tested on long genomic sequences with a number of genes. Test results show that very accurate (predicted) gene models can be expected when related ESTs exist for the predicted exons.

  5. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    PubMed Central

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmaniccoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions This EST collection offers a resource for studying functional genes, including

  6. Insights into a dinoflagellate genome through expressed sequence tag analysis

    PubMed Central

    Hackett, Jeremiah D; Scheetz, Todd E; Yoon, Hwan Su; Soares, Marcelo B; Bonaldo, Maria F; Casavant, Thomas L; Bhattacharya, Debashish

    2005-01-01

    Background Dinoflagellates are important marine primary producers and grazers and cause toxic "red tides". These taxa are characterized by many unique features such as immense genomes, the absence of nucleosomes, and photosynthetic organelles (plastids) that have been gained and lost multiple times. We generated EST sequences from non-normalized and normalized cDNA libraries from a culture of the toxic species Alexandrium tamarense to elucidate dinoflagellate evolution. Previous analyses of these data have clarified plastid origin and here we study the gene content, annotate the ESTs, and analyze the genes that are putatively involved in DNA packaging. Results Approximately 20% of the 6,723 unique (11,171 total 3'-reads) ESTs data could be annotated using Blast searches against GenBank. Several putative dinoflagellate-specific mRNAs were identified, including one novel plastid protein. Dinoflagellate genes, similar to other eukaryotes, have a high GC-content that is reflected in the amino acid codon usage. Highly represented transcripts include histone-like (HLP) and luciferin binding proteins and several genes occur in families that encode nearly identical proteins. We also identified rare transcripts encoding a predicted protein highly similar to histone H2A.X. We speculate this histone may be retained for its role in DNA double-strand break repair. Conclusion This is the most extensive collection to date of ESTs from a toxic dinoflagellate. These data will be instrumental to future research to understand the unique and complex cell biology of these organisms and for potentially identifying the genes involved in toxin production. PMID:15921535

  7. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  8. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

    PubMed Central

    Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

    2003-01-01

    To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
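
    The percentages quoted above can be checked directly from the counts in the abstract. The short calculation below is only an arithmetic check; the variable names are ours.

```python
assembled_transcripts = 43_141   # putative transcripts, from the abstract
estimated_unique_genes = 33_620  # after reassembly, from the abstract
full_length_assemblies = 14_409  # assemblies with a full-length clone, from the abstract

# Fraction of first-pass assemblies that collapsed on reassembly.
redundancy = 1 - estimated_unique_genes / assembled_transcripts
# Fraction of assemblies containing at least one full-length insert.
full_length_fraction = full_length_assemblies / assembled_transcripts

print(f"implied redundancy: {redundancy:.1%}")
print(f"full-length fraction: {full_length_fraction:.1%}")
```

    Both values agree with the rounded figures in the text (about 22% redundancy and about 33% of assemblies with a full-length insert).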

  9. Identification of molecular motors in the Woods Hole squid, Loligo pealei: an expressed sequence tag approach.

    PubMed

    DeGiorgis, Joseph A; Cavaliere, Kimberly R; Burbach, J Peter H

    2011-10-01

    The squid giant axon and synapse are unique systems for studying neuronal function. While a few nucleotide and amino acid sequences have been obtained from squid, large scale genetic and proteomic information is lacking. We have been particularly interested in motors present in axons and their roles in transport processes. Here, to obtain genetic data and to identify motors expressed in squid, we initiated an expressed sequence tag project by single-pass sequencing mRNAs isolated from the stellate ganglia of the Woods Hole Squid, Loligo pealei. A total of 22,689 high quality expressed sequence tag (EST) sequences were obtained and subjected to basic local alignment search tool analysis. Seventy six percent of these sequences matched genes in the National Center for Bioinformatics databases. By CAP3 analysis this library contained 2459 contigs and 7568 singletons. Mining for motors successfully identified six kinesins, six myosins, a single dynein heavy chain, as well as components of the dynactin complex, and motor light chains and accessory proteins. This initiative demonstrates that EST projects represent an effective approach to obtain sequences of interest.

  10. Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data.

    PubMed

    Batley, Jacqueline; Barker, Gary; O'Sullivan, Helen; Edwards, Keith J; Edwards, David

    2003-05-01

    We have developed a computer based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertion/deletions identified using this approach represent true genetic variation in maize.
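
    The redundancy idea described above can be sketched as follows: in a column of aligned reads, an allele counts as supported only if it is seen at least a minimum number of times, and a column is a candidate SNP when two or more alleles are supported. This is a minimal sketch assuming pre-aligned, equal-length reads; it is not the authors' software, it omits the cosegregation score, and the example reads are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_redundancy=2):
    """Return [(column, {base: count})] for columns in which at least two
    alleles each occur `min_redundancy` or more times.
    Gap ('-') and ambiguous ('N') characters are ignored."""
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("aligned reads must all have the same length")
    results = []
    for col in range(length):
        counts = Counter(
            read[col].upper() for read in aligned_reads
            if read[col].upper() in "ACGT"
        )
        supported = {base: n for base, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            results.append((col, supported))
    return results


# Hypothetical alignment of five reads from one locus.
reads = [
    "ACGTAC",
    "ACGTAC",
    "ACATAC",
    "ACATAC",
    "ACGTTC",
]
snps = candidate_snps(reads)
print(snps)
```

    Column 2 is reported because both G and A appear at least twice, whereas the single T in column 4 is treated as a probable sequencing error and not reported.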

  11. 2058 Expressed sequence tags (ESTs) from a human fetal lung cDNA library

    SciTech Connect

    Kazunori, Sudo; Katsuya Chinen; Yusuke Nakamura

    1994-11-15

    ESTs (expressed sequence tags) provide complementary resources for structural and functional analyses of the human genome. The authors have performed single-pass sequencing of 2058 randomly selected, directionally cloned cDNAs isolated from a fetal-lung cDNA library constructed with oligo (dT) primers. Computer analyses of the 5'-end sequences revealed that 60.4% of the clones were considered to be identical to previously reported human genes or ESTs; 9.0% of them showed significant homology to known genes in human, other mammals, or lower organisms; 30.6% showed no homology to any genes or DNA sequences in the public database. These data and reagents will be useful for future investigations of gene expression during prenatal development of human lung. 11 refs., 1 fig., 2 tabs.

  12. Generation of 7137 non-redundant expressed sequence tags from a legume, Lotus japonicus.

    PubMed

    Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

    2000-04-28

    For comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 22,983 5' end expressed sequence tags (ESTs) were accumulated from normalized and size-selected cDNA libraries constructed from young (2 weeks old) plants. The EST sequences were clustered into 7137 non-redundant groups. Similarity search against public non-redundant protein database indicated that 3302 groups showed similarity to genes of known function, 1143 groups to hypothetical genes, and 2692 were novel sequences. Homologues of 5 nodule-specific genes which have been reported in other legume species were contained in the collected ESTs, suggesting that the EST source generated in this study will become a useful tool for identification of genes related to legume-specific biological processes. The sequence data of individual ESTs are available at the web site: http://www.kazusa.or.jp/en/plant/lotus/EST/.

  13. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: Noctuidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  14. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data.

  15. Generation of expressed sequence tags from a normalized porcine skeletal muscle cDNA library.

    PubMed

    Yao, Jianbo; Coussens, Paul M; Saama, Peter; Suchyta, Steven; Ernst, Catherine W

    2002-11-01

    Recent developments in microarray technologies permit scientists to analyze expression of thousands of genes simultaneously in diverse biological systems. In an effort to provide integrated resources for application of microarray technologies to studies of skeletal muscle growth and development in swine, we have constructed a normalized cDNA library from porcine skeletal muscle. The effectiveness of normalization was evaluated by DNA sequencing of clones randomly picked from the library before and after normalization, and also by Southern blot hybridization using probes representing abundant transcripts. Our data suggest that the normalization procedure successfully reduced the highly abundant cDNA species in the normalized library. To date, a total of 782 EST (expressed sequence tag) sequences have been generated from this normalized library (687 ESTs) and the original library (95 ESTs). The sequence information of these ESTs plus their BLAST results has been made available through a web accessible database (http://nbfgc.msu.edu). Cluster analysis of the data indicates that a total of 742 unique sequences are present in this collection. BLASTN search of the 742 EST sequences against the public database (dbEST) revealed that 139 had no significant matches (E-value > 10^-15) to porcine ESTs already entered in the database, suggesting the possibility of their specific expression in porcine skeletal muscle. Generation of non-redundant ESTs from this library will allow us to construct cDNA microarrays for identification of gene expression changes that regulate muscle growth and affect meat quality in swine.

  16. Analysis of expressed sequence tags from a naked foraminiferan Reticulomyxa filosa.

    PubMed

    Burki, Fabien; Nikolaev, Sergey I; Bolivar, Ignacio; Guiard, Jackie; Pawlowski, Jan

    2006-08-01

    Foraminifers are a major component of modern marine ecosystems and one of the most important oceanic producers of calcium carbonate. They are a key phylogenetic group among amoeboid protists, but our knowledge of their genome is still mostly limited to a few conserved genes. Here, we report the first study of expressed genes by means of expressed sequence tag (EST) from the freshwater naked foraminiferan Reticulomyxa filosa. Cluster analysis of 1630 valid ESTs enabled the identification of 178 groups of related sequences and 871 singlets. Approximately 50% of the putative unique 1059 ESTs could be annotated using Blast searches against the protein database SwissProt + TrEMBL. The EST database described here is the first step towards gene discovery in Foraminifera and should provide the basis for new insights into the genomic and transcriptomic characteristics of these interesting but poorly understood protists.

  17. Rapid in silico cloning of genes using expressed sequence tags (ESTs).

    PubMed

    Gill, R W; Sanseau, P

    2000-01-01

    Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs are the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathologies states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we will review the current status of EST databases, the pros and cons of EST-type data and describe possible strategies to retrieve meaningful information.

  18. Generation and analysis of expressed sequence tags from the ciliate protozoan parasite Ichthyophthirius multifiliis

    PubMed Central

    Abernathy, Jason W; Xu, Peng; Li, Ping; Xu, De-Hai; Kucuktas, Huseyin; Klesius, Phillip; Arias, Covadonga; Liu, Zhanjiang

    2007-01-01

    Background The ciliate protozoan Ichthyophthirius multifiliis (Ich) is an important parasite of freshwater fish that causes 'white spot disease' leading to significant losses. A genomic resource for large-scale studies of this parasite has been lacking. To study gene expression involved in Ich pathogenesis and virulence, our goal was to generate expressed sequence tags (ESTs) for the development of a powerful microarray platform for the analysis of global gene expression in this species. Here, we initiated a project to sequence and analyze over 10,000 ESTs. Results We sequenced 10,368 EST clones using a normalized cDNA library made from pooled samples of the trophont, tomont, and theront life-cycle stages, and generated 9,769 sequences (94.2% success rate). Post-sequencing processing led to 8,432 high quality sequences. Clustering analysis of these ESTs allowed identification of 4,706 unique sequences containing 976 contigs and 3,730 singletons. These unique sequences represent over two million base pairs (~10% of Plasmodium falciparum genome, a phylogenetically related protozoan). BLASTX searches produced 2,518 significant (E-value < 10^-5) hits and further Gene Ontology (GO) analysis annotated 1,008 of these genes. The ESTs were analyzed comparatively against the genomes of the related protozoa Tetrahymena thermophila and P. falciparum, allowing putative identification of additional genes. All the EST sequences were deposited by dbEST in GenBank (GenBank: EG957858–EG966289). Gene discovery and annotations are presented and discussed. Conclusion This set of ESTs represents a significant proportion of the Ich transcriptome, and provides a material basis for the development of microarrays useful for gene expression studies concerning Ich development, pathogenesis, and virulence. PMID:17577414
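
    The E-value cutoff mentioned above is typically applied as a simple filter over BLAST hits. The sketch below assumes the hits are available in the standard 12-column tabular format of BLAST+ (`-outfmt 6`, where the 11th column is the E-value); whether the study used that format is an assumption, and the query and subject identifiers are hypothetical.

```python
def significant_queries(blast_tab_lines, max_evalue=1e-5):
    """Return the set of query IDs with at least one hit whose E-value is
    below `max_evalue`. Expects tab-separated lines with columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in blast_tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue < max_evalue:
            hits.add(query_id)
    return hits


# Hypothetical hits, for illustration only.
example = [
    "\t".join(["EST001", "subjA", "85.0", "120", "18", "0", "1", "360", "10", "129", "2e-30", "130"]),
    "\t".join(["EST002", "subjB", "40.0", "50", "30", "1", "5", "154", "3", "52", "0.002", "35.0"]),
    "\t".join(["EST003", "subjC", "55.0", "80", "36", "0", "2", "241", "7", "86", "1e-4", "42.0"]),
    "\t".join(["EST003", "subjD", "70.0", "90", "27", "0", "2", "271", "1", "90", "3e-12", "70.5"]),
]
print(sorted(significant_queries(example)))
```

    A query is kept if any one of its hits passes the threshold, so EST003 is retained on the strength of its second hit even though its first hit is too weak.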

  19. Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus.

    PubMed

    Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

    2012-01-01

    Humulus lupulus is commonly known as hops, a member of the family Moraceae. Currently many projects are underway leading to the accumulation of voluminous genomic and expressed sequence tag sequences in public databases. The genetically characterized domains in these databases are limited due to non-availability of reliable molecular markers. A large amount of EST sequence data is available for hops. In the present study, simple sequence repeat markers extracted from EST data are used as molecular markers for genetic characterization. 25,495 EST sequences were examined and assembled to get full-length sequences. Maximum frequency distribution was shown by mononucleotide SSR motifs, i.e. 60.44% in contigs and 62.16% in singletons, whereas minimum frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). Maximum trinucleotide motifs code for Glutamic acid (GAA) while AT/TA were the most frequent repeat of dinucleotide SSRs. Flanking primer pairs were designed in-silico for the SSR containing sequences. Functional categorization of SSRs containing sequences was done through gene ontology terms like biological process, cellular component and molecular function.
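
    SSR mining of the kind described above is often done by scanning each sequence for perfect tandem repeats of short motifs. The sketch below uses regular expressions for motifs of 1 to 6 nucleotides. It is an illustration only, not the tool used in the study; the minimum repeat counts and the example sequence are assumptions chosen for the demonstration.

```python
import re

# Minimum number of repeats per motif length; illustrative thresholds, not from the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    found = []
    for k, n in sorted(min_repeats.items()):
        # A motif of length k followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):
                found.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(found)


# Hypothetical sequence with a GAA repeat, an AT repeat, and a poly-A run.
seq = "GGC" + "GAA" * 6 + "CTCG" + "AT" * 7 + "G" + "A" * 12 + "C"
ssrs = find_ssrs(seq)
print(ssrs)
```

    The primitive-motif check keeps a poly-A run from being reported again as an "AA" dinucleotide or "AAA" trinucleotide repeat. Counting the resulting tuples by motif length would give the kind of frequency distribution reported in the abstract.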

  20. Contamination of cDNA libraries and expressed sequence-tags databases

    SciTech Connect

    Dean, M.; Allikmets, R.

    1995-11-01

    Partially sequenced cDNAs, or expressed sequence tags (ESTs), are claimed to represent an efficient strategy for characterizing an organism's genes. By necessity, these sequences are incompletely characterized, and examples of contamination of cDNA libraries with sequences from other species have been described. It has been suggested that a Human T-cell cDNA library (Clontech HL1963g) is contaminated by sequences from yeast (Saccharomyces cerevisiae) and an unknown bacterium. We are characterizing human ESTs that represent new members of the ATP-binding cassette transporter super-family. In examining human ESTs generated from the T-cell library, we have encountered one gene that was in fact a yeast sequence (Genbank Z15214 = SSH2 locus) and several genes that do not hybridize to human DNA or RNA. PCR primers from these sequences failed to amplify a product from human, yeast, or Escherichia coli DNA but did produce a product from a Clontech kidney cDNA library (HL1123a). To determine the source of the contamination, we amplified a conserved segment of the 16S rDNA (following a suggestion from Dr. C. Savakis) from the kidney library. The sequence of this product was nearly identical to that of the bacterium Leuconostoc lactis (300 of 304 bp). Leuconostoc species are commonly found in dairy products, fruits, vegetables, and wine and are nonpathogenic to humans. 6 refs., 1 fig.

  1. Expression sequence tag library derived from peripheral blood mononuclear cells of the Chlorocebus sabaeus

    PubMed Central

    2012-01-01

    Background African Green Monkeys (AGM) are amongst the most frequently used nonhuman primate models in clinical and biomedical research, nevertheless only few genomic resources exist for this species. Such information would be essential for the development of dedicated new generation technologies in fundamental and pre-clinical research using this model, and would deliver new insights into primate evolution. Results We have exhaustively sequenced an Expression Sequence Tag (EST) library made from a pool of Peripheral Blood Mononuclear Cells from sixteen Chlorocebus sabaeus monkeys. Twelve of them were infected with the Simian Immunodeficiency Virus. The mononuclear cells were either left unstimulated or stimulated in vitro with Concanavalin A, with lipopolysaccharides, or through mixed lymphocyte reaction in order to generate a representative and broad library of expressed sequences in immune cells. We report here 37,787 sequences, which were assembled into 14,410 contigs representing an estimated 12% of the C. sabaeus transcriptome. Using data from primate genome databases, 9,029 assembled sequences from C. sabaeus could be annotated. Sequences have been systematically aligned with ten cDNA references of primate species including Homo sapiens, Pan troglodytes, and Macaca mulatta to identify ortholog transcripts. For 506 transcripts, sequences were quasi-complete. In addition, 6,576 transcript fragments are potentially specific to C. sabaeus or correspond to not yet described primate genes. Conclusions The EST library we provide here will prove useful in gene annotation efforts for future sequencing of the African Green Monkey genomes. Furthermore, this library, which particularly well represents immunological and hematological gene expression, will be an important resource for the comparative analysis of gene expression in clinically relevant nonhuman primate and human research. PMID:22726727

  2. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful source to scientists engaging in stress-tolerance study.

  3. The construction of Arabidopsis expressed sequence tag assemblies. A new resource to facilitate gene identification.

    PubMed Central

    Rounsley, S D; Glodek, A; Sutton, G; Adams, M D; Somerville, C R; Venter, J C; Kerlavage, A R

    1996-01-01

    The generation of large numbers of partial cDNA sequences, or expressed sequence tags (ESTs), has provided a method with which to sample a large number of genes from an organism. More than 25,000 Arabidopsis thaliana ESTs have been deposited in public databases, producing the largest collection of ESTs for any plant species. We describe here the application of a method of reducing redundancy and increasing information content in this collection by grouping overlapping ESTs representing the same gene into a "contig" or assembly. The increased information content of these assemblies allows more putative identifications to be assigned based on the results of similarity searches with nucleotide and protein databases. The results of this analysis indicate that sequence information is available for approximately 12,600 nonoverlapping ESTs from Arabidopsis. Comparison of the assemblies with 953 Arabidopsis coding sequences indicates that up to 57% of all Arabidopsis genes are represented by an EST. Clustering analysis of these sequences suggests that between 300 and 700 gene families are represented by between 700 and 2000 sequences in the EST database. A database of the assembled sequences, their putative identifications, and cellular roles is available through the World Wide Web. PMID:8938416
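
    The idea of grouping overlapping ESTs into an assembly can be illustrated with a minimal sketch. It is not the method used by the authors: it assumes error-free reads on the same strand, exact suffix-prefix overlaps, and a hypothetical `min_overlap` threshold, whereas real assemblers tolerate mismatches, handle reverse complements, and detect contained reads.

```python
def overlap_len(a, b, min_overlap):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_overlap`.
    """
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def cluster_ests(ests, min_overlap=5):
    """Single-linkage grouping of ESTs that share an exact overlap (union-find)."""
    parent = list(range(len(ests)))

    def find(i):
        # Walk up to the cluster representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(ests)):
        for j in range(len(ests)):
            if i != j and overlap_len(ests[i], ests[j], min_overlap):
                parent[find(i)] = find(j)

    clusters = {}
    for i, est in enumerate(ests):
        clusters.setdefault(find(i), []).append(est)
    # Largest clusters first; clusters of size one correspond to singletons.
    return sorted(clusters.values(), key=len, reverse=True)


if __name__ == "__main__":
    reads = ["ATGGCGTACG", "GTACGTTAGC", "CCCCCCAAAA"]
    for group in cluster_ests(reads):
        print(len(group), group)
```

    Clusters with more than one member play the role of assemblies here, and the number of clusters gives a rough count of nonoverlapping sequences.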

  4. Transcript profiling of expressed sequence tags from semimembranosus muscle of commercial and naturalized pig breeds.

    PubMed

    Nascimento, C S; Peixoto, J O; Verardo, L L; Campos, C F; Weller, M M C; Faria, V R; Botelho, M E; Martins, M F; Machado, M A; Silva, F F; Lopes, P S; Guimarães, S E F

    2012-09-17

    In general, genetic differences across different breeds of pig lead to variation in mature body size and slaughter age. The commercial breeds Duroc and Large White and the local Brazilian breed Piau are ostensibly distinct in terms of growth and muscularity: commercial breeds are much leaner, while local breeds grow much more slowly and are fat-type pigs. However, the genetic factors that underlie such distinctions remain unclear. We used expressed sequence tags (ESTs) to characterize and compare transcript profiles in the semimembranosus muscle of these pig breeds. Our aim was to identify differences in breed-related gene expression that might influence growth performance and meat quality. We constructed three non-normalized cDNA libraries from semimembranosus muscle, using two samples from each of these three breeds; 6902 high-quality ESTs were obtained. Cluster analysis was performed and these sequences were clustered into 3670 unique sequences; 24.7% of the sequences were categorized as contigs and 75.3% of the sequences were singletons. Based on homology searches against the SwissProt protein database, we were able to assign a putative protein identity to only 1050 unique sequences. Among these, 58.5% were full-length protein sequences and 17.2% were pig-specific sequences. Muscle structural and cytoskeletal proteins, such as actin and myosin, were the most abundant transcripts (16.7%), followed by those related to mitochondrial function (12.9%) and ribosomal proteins (12.4%). Furthermore, ESTs generated in this study provide a rich source for identification of novel genes and for the comparative analysis of gene expression patterns in divergent pig breeds.

  5. Gene ontology based characterization of expressed sequence tags (ESTs) of Brassica rapa cv. Osome.

    PubMed

    Arasan, Senthil Kumar Thamil; Park, Jong-In; Ahmed, Nasar Uddin; Jung, Hee-Jeong; Lee, In-Ho; Cho, Yong-Gu; Lim, Yong-Pyo; Kang, Kwon-Kyoo; Nou, Ill-Sup

    2013-07-01

    Chinese cabbage (Brassica rapa) is widely recognized for its economic importance and contribution to human nutrition, but abiotic and biotic stresses are the main obstacles to its quality, nutritional status and production. In this study, 3,429 Expressed Sequence Tag (EST) sequences were generated from a B. rapa cv. Osome cDNA library and the unique transcripts were classified functionally using a gene ontology (GO) hierarchy and the Kyoto encyclopedia of genes and genomes (KEGG). KEGG orthology and the structural domain data were obtained from the biological database for stress related genes (SRG). EST datasets provided a wide outlook of functional characterization of B. rapa cv. Osome. In silico analysis revealed 83% of ESTs to be well annotated towards reeds one dimensional concept. Clustering of ESTs returned 333 contigs and 2,446 singlets, giving a total of 3,284 putative unigene sequences. This dataset contained 1,017 EST sequences functionally annotated to stress responses, from which the expression of randomly selected SRGs was analyzed against cold, salt, drought, ABA, water and PEG stresses. Most of the SRGs showed differential expression against these stresses. Thus, the EST dataset is very important for discovering the potential genes related to stress resistance in Chinese cabbage, and can be a useful resource for genetic engineering of Brassica sp.

  6. Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.

    PubMed

    Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan

    2012-03-01

    Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.

  7. TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

    PubMed Central

    O'Brien, Emmet A.; Koski, Liisa B.; Zhang, Yue; Yang, LiuSong; Wang, Eric; Gray, Michael W.; Burger, Gertraud; Lang, B. Franz

    2007-01-01

    The TBestDB database contains ∼370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequence is masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated by using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact tbestdb@bch.umontreal.ca. The database can be queried at . PMID:17202165
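
    The first two processing steps described above (reading up to a quality cut-off, then masking vector sequence) can be sketched as follows. This is an illustration only, not TBestDB's pipeline: the cut-off of 20, the prefix-trimming rule, the exact-match vector search, and the mask character are all assumptions made for the example.

```python
def trim_by_quality(seq, quals, cutoff=20):
    """Keep the read prefix up to (not including) the first base below `cutoff`.

    `quals` holds one per-base quality score per base in `seq`; the cutoff
    value is a hypothetical choice for this sketch.
    """
    for i, q in enumerate(quals):
        if q < cutoff:
            return seq[:i]
    return seq


def mask_vector(seq, vector, mask_char="X"):
    """Replace exact occurrences of a vector or linker sequence with a mask.

    Masking (rather than deleting) keeps coordinates stable; real tools use
    alignment instead of exact string matching.
    """
    return seq.replace(vector, mask_char * len(vector))


if __name__ == "__main__":
    read = "GAATTCACGTACGT"
    quals = [30, 30, 30, 30, 30, 30, 28, 27, 25, 22, 10, 30, 30, 30]
    trimmed = trim_by_quality(read, quals)
    print(mask_vector(trimmed, "GAATTC"))
```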

  8. Mining expressed sequence tag (EST) libraries for cancer-associated genes.

    PubMed

    Schmitt, Armin O

    2010-01-01

    Originally established in the beginning of the 1990s as a direct route to gene finding, expressed sequence tags (ESTs) still lend themselves as a means to analyze gene expression in almost all human tissues. The type of questions that can be addressed using public EST libraries ranges from tissue-specific gene profiling to the comparison between tissues in diseased and healthy states. Thanks to a multitude of web-based online bioinformatics resources, mining in EST libraries is not restricted to experts in the field of data analysis, but can readily be performed by the medical or life scientist. In this chapter, a couple of case studies are presented that guide the scientist to the most useful online resources so that they can conduct their own research.

  9. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    PubMed Central

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  10. Generation and analysis of expressed sequence tags from the medicinal plant Salvia miltiorrhiza.

    PubMed

    Yan, YaPing; Wang, ZheZhi; Tian, Wei; Dong, ZhongMin; Spencer, David F

    2010-02-01

    Salvia miltiorrhiza Bge. is a well-known traditional Chinese herb. Its roots have been formulated and used clinically for the treatment of various diseases. However, little genetic information has so far been available and this fact has become a major obstacle for molecular studies. To address this lack of genetic information, an Expressed Sequence Tag (EST) library from whole plantlets of S. miltiorrhiza was generated. From the 12959 cDNA clones that were randomly selected and subjected to single-pass sequencing from their 5' ends, 10288 ESTs (with sizes ≥ 100 bp) were selected and assembled into 1288 contigs, leaving 2937 singletons, for a total of 4225 unigenes. These were analyzed using BLASTX (against protein databases), RPS-BLAST (against a conserved domain database) as well as the web-based KEGG Automatic Annotation Server for metabolic enzyme assignment. Based on the metabolic enzyme assignment, expression patterns of 14 secondary metabolic enzyme genes in different organs and under different treatments were verified using real-time PCR analysis. Additionally, a total of 122 microsatellites were identified from the ESTs, with 89 having sufficient flanking sequences for primer design. This set of ESTs represents a significant proportion of the S. miltiorrhiza transcriptome, and gives preliminary insights into the gene complement of S. miltiorrhiza. They will prove useful for uncovering secondary metabolic pathways, analyzing cDNA-array based gene expression, genetic manipulation to improve yield of desirable secondary products, and molecular marker identification.

  11. Generation and analysis of the expressed sequence tags from the mycelium of Ganoderma lucidum.

    PubMed

    Huang, Yen-Hua; Wu, Hung-Yi; Wu, Keh-Ming; Liu, Tze-Tze; Liou, Ruey-Fen; Tsai, Shih-Feng; Shiao, Ming-Shi; Ho, Low-Tone; Tzean, Shean-Shong; Yang, Ueng-Cheng

    2013-01-01

    Ganoderma lucidum (G. lucidum) is a medicinal mushroom renowned in East Asia for its potential biological effects. To enable a systematic exploration of the genes associated with the various phenotypes of the fungus, the genome consortium of G. lucidum has carried out an expressed sequence tag (EST) sequencing project. Using a Sanger sequencing based approach, 47,285 ESTs were obtained from in vitro cultures of G. lucidum mycelium of various durations. These ESTs were further clustered and merged into 7,774 non-redundant expressed loci. The features of these expressed contigs were explored in terms of over-representation, alternative splicing, and natural antisense transcripts. Our results provide an invaluable information resource for exploring the G. lucidum transcriptome and its regulation. Many cases of the genes over-represented in fast-growing dikaryotic mycelium are closely related to growth, such as cell wall and bioactive compound synthesis. In addition, the EST-genome alignments containing putative cassette exons and retained introns were manually curated and then used to make inferences about the predominating splice-site recognition mechanism of G. lucidum. Moreover, a number of putative antisense transcripts have been pinpointed, from which we noticed that two cases are likely to reveal hitherto undiscovered biological pathways. To allow users to access the data and the initial analysis of the results of this project, a dedicated web site has been created at http://csb2.ym.edu.tw/est/.

  12. Analysis of transcripts from intracellular stages of Eimeria acervulina using expressed sequence tags.

    PubMed

    Miska, K B; Fetterer, R H; Rosenberg, G H

    2008-04-01

    Coccidiosis in chickens is caused by 7 species of Eimeria. Even though coccidiosis is a complex disease that can be caused by any combination of these species, most of the molecular research concerning chicken coccidiosis has been limited to Eimeria tenella. The present study describes the first large-scale analysis of expressed sequence tags (ESTs) generated primarily from second-stage merozoites (and schizonts) of E. acervulina. In total, 1,847 ESTs were sequenced; these represent 1,026 unique sequences. Approximately half of the ESTs encode proteins of unknown function, or hypothetical proteins. Twenty-nine percent of the E. acervulina ESTs share significant sequence identity with sequences in the E. tenella genome. Additionally, EST hits seem to be much different compared with those of E. tenella. One of the differences is the very low number of ESTs that encode putative microneme proteins. This study underlines the potential differences in the molecular aspects of 2 Eimeria species that in the past were thought to be highly similar in nature.

  13. Large scale in silico identification of MYB family genes from wheat expressed sequence tags.

    PubMed

    Cai, Hongsheng; Tian, Shan; Dong, Hansong

    2012-10-01

    The MYB proteins constitute one of the largest transcription factor families in plants. Much research has been performed to determine their structures, functions, and evolution, especially in the model plants, Arabidopsis, and rice. However, this transcription factor family has been much less studied in wheat (Triticum aestivum), for which no genome sequence is yet available. Despite this, expressed sequence tags are an important resource that permits opportunities for large scale gene identification. In this study, a total of 218 sequences from wheat were identified and confirmed to be putative MYB proteins, including 1RMYB, R2R3-type MYB, 3RMYB, and 4RMYB types. A total of 36 R2R3-type MYB genes with complete open reading frames were obtained. The putative orthologs were assigned in rice and Arabidopsis based on the phylogenetic tree. Tissue-specific expression pattern analyses confirmed the predicted orthologs, and this meant that gene information could be inferred from the Arabidopsis genes. Moreover, the motifs flanking the MYB domain were analyzed using the MEME web server. The distribution of motifs among wheat MYB proteins was investigated and this facilitated subfamily classification.

  14. An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

    PubMed

    Wittenberger, T; Schaller, H C; Hellebrand, S

    2001-03-30

    We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families.

  15. Micropreparative capillary gel electrophoresis of DNA: rapid expressed sequence tag library construction.

    PubMed

    Shi, Liang; Khandurina, Julia; Ronai, Zsolt; Li, Bi-Yu; Kwan, Wai King; Wang, Xun; Guttman, András

    2003-01-01

    A capillary gel electrophoresis based automated DNA fraction collection technique was developed to support a novel DNA fragment-pooling strategy for expressed sequence tag (EST) library construction. The cDNA population is first cleaved by BsaJ I and EcoR I restriction enzymes, and then subpooled by selective ligation with specific adapters followed by polymerase chain reaction (PCR) amplification and labeling. Combination of this cDNA fingerprinting method with high-resolution capillary gel electrophoresis separation and precise fractionation of individual cDNA transcript representatives avoids redundant fragment selection and concomitant repetitive sequencing of abundant transcripts. Using a computer-controlled capillary electrophoresis device, the transcript representatives were separated by size and fractions were automatically collected every 30 s into 96-well plates. The high resolving power of the sieving matrix ensured sequencing grade separation of the DNA fragments (i.e., single-base resolution) and successful fraction collection. Performance and precision of the fraction collection procedure were validated by PCR amplification of the collected DNA fragments followed by capillary electrophoresis analysis for size and purity verification. The collected and PCR-amplified transcript representatives, ranging up to several hundred base pairs, were then sequenced to create an EST library.

  16. Development of expressed sequence tag-simple sequence repeat markers for Chrysanthemum morifolium and closely related species.

    PubMed

    Liu, H; Zhang, Q X; Sun, M; Pan, H T; Kong, Z X

    2015-07-13

    With the development of chrysanthemum breeding in recent years, an increasing number of wild species in genera related to Chrysanthemum were introduced to extend the genetic resources and facilitate the genetic improvement of chrysanthemums via hybridization. However, few simple sequence repeat (SSR) markers are available for marker-assisted breeding and population genetic studies of chrysanthemum and closely related species. Expressed sequence tags (ESTs) in public databases and cross-species transferable markers are considered to be a cost-effective means for developing sequence-based markers. In this study, 25 EST-SSRs were successfully developed from Chrysanthemum EST sequences for Chrysanthemum morifolium and closely related species. In total, 4164 unigene sequences were assembled from 7180 ESTs of chrysanthemum in GenBank, which were subsequently used to screen for the presence of microsatellites with the SSRIT software. The screening criteria were 8, 5, 4, and 3 repeating units for di-, tri-, tetra-, and penta- and higher-order nucleotides, respectively. Moreover, 310 SSR loci from 296 sequences were identified, and 198 primer pairs for SSR amplification were designed with the Primer Premier 5.0 software, of which 25 SSR loci showed polymorphic amplification in 52 species and varieties belonging to Chrysanthemum, Ajania, and Opisthopappus. The application of EST-SSR markers to the identification of intergeneric hybrids between Chrysanthemum and Ajania was demonstrated. Therefore, EST-SSRs can be developed for species that lack gene sequences or ESTs by utilizing ESTs of closely related species.
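
    The screening criteria quoted above (at least 8, 5, 4, and 3 repeat units for di-, tri-, tetra-, and penta- or higher-order motifs) can be expressed as a short regular-expression scan. This is a sketch under stated assumptions and not the SSRIT implementation: motif lengths are capped at six here, only perfect repeats are reported, and the function names are hypothetical.

```python
import re

# Motif length -> minimum number of repeat units, following the criteria in
# the abstract; capping "higher-order" at hexanucleotides is an assumption.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 3, 6: 3}


def is_primitive(motif):
    """True if `motif` is not itself a repeat of a shorter unit ('AGAG' is not)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    est = "GGC" + "AG" * 8 + "GGG" + "CAT" * 5 + "GG"
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
```

    The primitivity check keeps a run such as (AG)8 from being reported a second time as (AGAG)4.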

  17. Evaluation of cleaved amplified polymorphic sequence markers for Chamaecyparis obtusa based on expressed sequence tag information from Cryptomeria japonica.

    PubMed

    Matsumoto, A; Tsumura, Y

    2004-12-01

    We have developed and evaluated sequence-tagged site (STS) primers based on expressed sequence-tag information derived from sugi (Cryptomeria japonica) for use in hinoki (Chamaecyparis obtusa), a species that belongs to a different family (although it appears to be fairly closely related to sugi). Of the 417 C. japonica STS primer pairs we screened, 120 (approximately 30%) were transferable and provided specific PCR amplification products from 16 C. obtusa plus trees. We used haploid megagametophytes to investigate the homology of 80 STS fragments between C. obtusa and C. japonica and to identify orthologous loci. Nearly 90% of the fragments showed high (>70%) degrees of similarity between the species, and 35 STSs indicated homology to entries with the same putative function in a public DNA database. Of the 120 STS fragments amplified, 72 showed restriction fragment length polymorphisms; in addition, the CC2430 primers detected amplicon length polymorphism. We assessed the inheritance pattern of 27 cleaved amplified polymorphic sequence markers, using 20 individuals from the segregation population. All the markers analyzed were consistent with the marker inheritance patterns obtained from the screening panel, and no markers (except CC2716) showed significant (P<0.01) deviation from the expected segregation ratio. In total, 136 polymorphic markers were developed using C. japonica-based STS primers without any sequence modification. In addition, the applicability of STS-based markers developed in one species to other species was found to closely reflect the evolutionary distance between the species, which is roughly concordant with the difference between their rbcL sequences. We plan to use these markers for genetic studies in C. obtusa. Most of the markers should also provide reliable anchor loci for comparative mapping studies of the C. obtusa and C. japonica genomes.
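
    Testing a marker for deviation from an expected segregation ratio is commonly done with a chi-square goodness-of-fit test. The sketch below assumes two genotype classes and a 1:1 expected ratio, neither of which is stated in the abstract, and uses the closed form of the chi-square tail probability for one degree of freedom; the counts in the usage example are invented for illustration.

```python
import math


def chi_square_two_classes(observed, expected_ratio=(1, 1)):
    """Chi-square goodness-of-fit for two classes (one degree of freedom).

    Returns (chi2, p_value). For 1 d.f. the upper-tail probability equals
    erfc(sqrt(chi2 / 2)), so no statistics library is needed.
    """
    if len(observed) != 2 or len(expected_ratio) != 2:
        raise ValueError("this sketch handles exactly two classes")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts for 20 individuals, as in the segregating population size.
    for counts in [(12, 8), (18, 2)]:
        chi2, p = chi_square_two_classes(counts)
        verdict = "deviates" if p < 0.01 else "consistent"
        print(counts, round(chi2, 2), verdict)
```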

  18. Mining of expressed sequence tag libraries of cacao for microsatellite markers using five computational tools.

    PubMed

    Riju, Aikkal; Rajesh, M K; Sherin, P T P Fasila; Chandrasekar, A; Apshara, S Elain; Arunachalam, Vadivel

    2009-08-01

    Expressed sequence tags (ESTs) provide researchers with a quick and inexpensive route for discovering new genes, data on gene expression and regulation, and also provide genic markers that help in constructing genome maps. Cacao is an important perennial crop of the humid tropics. Cacao EST sequences available in the public domain were downloaded and assembled into contigs. Microsatellites were located in these ESTs and contigs using five software tools (MISA, TRA, TROLL, SSRIT and SSR primer). MISA gave maximum coverage of SSRs in cacao ESTs and contigs, although TRA was able to detect higher order (5-mer) repeats. The frequency of SSRs was one per 26.9 kb in the known set of ESTs. One-third of the repeats in EST-contigs were found to be trimeric. A few rare repeats, such as a 21-mer repeat, were also located. A/T repeats were most abundant among the mononucleotide repeats and the AG/GA/TC/CT type was the most frequent among dimerics. Flanking primers were designed using the Primer3 program and verified experimentally for PCR amplification. The results of the study are freely available in an online database (http://riju.byethost31.com/cocoa/). Seven primer pairs that amplified genomic DNA isolated from leaves were used to screen a representative set of 12 accessions of cacao.

  19. Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm.

    PubMed

    Riju, Aykkal; Chandrasekar, Arumugam; Arunachalam, Vadivel

    2007-01-01

    The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (Expressed Sequence Tag) databases. The development of high-throughput methods for the detection of SNPs (Single Nucleotide Polymorphisms) and small indels (insertions / deletions) has led to a revolution in their use as molecular markers. The available (5452) oil palm EST sequences were mined from dbEST of NCBI. The CAP3 program was used to assemble EST sequences into contigs. Candidate SNPs and indel polymorphisms were detected using the perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had the highest frequency, 2.91 SNPs and indels per 100 bp, whereas the zygotic embryos had the lowest frequency, 0.15 per 100 bp. We also used the Shannon index to analyze the proportion of ten possible types of SNP/indels. ESTs from tissues of normal apex showed the highest value of the Shannon index (0.60) whereas abnormal apex had the lowest value (0.02). The present report deals with the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and also confirms that SNPs occur in oil palm at a frequency that allows their use as markers for genetic studies.
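
    For readers unfamiliar with the two summary statistics used above, the following sketch shows how a per-100-bp polymorphism frequency and a Shannon index over polymorphism types could be computed. The natural logarithm is an assumption (the abstract does not state the base), and the counts in the usage example are invented, not taken from the study.

```python
import math


def per_100_bp(n_sites, total_bp):
    """Polymorphic sites per 100 bp of aligned sequence."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 100.0 * n_sites / total_bp


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over categories with nonzero counts.

    `counts` holds the number of polymorphisms observed for each type;
    the log base (natural log here) is an assumption of this sketch.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical counts for a handful of SNP/indel types in one tissue library.
    type_counts = [40, 25, 10, 5, 0, 0]
    print(round(shannon_index(type_counts), 3))
    print(per_100_bp(3, 200))
```

    A library dominated by a single polymorphism type gives an index near zero, while an even spread over many types gives a higher value.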

  20. Application of Cydia pomonella expressed sequence tags: Identification and expression of three general odorant binding proteins in codling moth.

    PubMed

    Garczynski, Stephen F; Coates, Brad S; Unruh, Thomas R; Schaeffer, Scott; Jiwan, Derick; Koepke, Tyson; Dhingra, Amit

    2013-10-01

    The codling moth, Cydia pomonella, is one of the most important pests of pome fruits in the world, yet the molecular genetics and the physiology of this insect remain poorly understood. A combined assembly of 8 341 expressed sequence tags was generated from Roche 454 GS-FLX sequencing of eight tissue-specific cDNA libraries. Putative chemosensory proteins (12) and odorant binding proteins (OBPs) (18) were annotated, which included three putative general OBP (GOBP), one more than typically reported for other Lepidoptera. To further characterize CpomGOBPs, we cloned cDNA copies of their transcripts and determined their expression patterns in various tissues. Cloning and sequencing of the 698 nt transcript for CpomGOBP1 resulted in the prediction of a 163 amino acid coding region, and subsequent RT-PCR indicated that the transcripts were mainly expressed in antennae and mouthparts. The 1 289 nt (160 amino acid) CpomGOBP2 and the novel 702 nt (169 amino acid) CpomGOBP3 transcripts are mainly expressed in antennae, mouthparts, and female abdomen tips. These results indicate that next generation sequencing is useful for the identification of novel transcripts of interest, and that codling moth expresses a transcript encoding for a new member of the GOBP subfamily.

  1. Sequencing, analysis, and annotation of expressed sequence tags for Camelus dromedarius.

    PubMed

    Al-Swailem, Abdulaziz M; Shehata, Maher M; Abu-Duhier, Faisel M; Al-Yamani, Essam J; Al-Busadah, Khalid A; Al-Arawi, Mohammed S; Al-Khider, Ali Y; Al-Muhaimeed, Abdullah N; Al-Qahtani, Fahad H; Manee, Manee M; Al-Shomrani, Badr M; Al-Qhtani, Saad M; Al-Harthi, Amer S; Akdemir, Kadir C; Inan, Mehmet S; Otu, Hasan H

    2010-05-19

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and approximately 40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism.
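
    The ORF length criterion mentioned above (contigs with an ORF > 300 bp) can be illustrated with a minimal six-frame scan. This is a sketch and not the authors' pipeline: it assumes ORFs run from an ATG to the first in-frame stop codon, counts the stop codon in the length, and uses hypothetical function names.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (non-ACGT characters are left as-is)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Longest ATG-to-stop ORF (stop codon included) over all six reading frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon of a candidate ORF
                elif start is not None and codon in STOPS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for the next ORF in this frame
    return best


def has_long_orf(seq, min_bp=300):
    """True if the longest ORF exceeds `min_bp` base pairs."""
    return len(longest_orf(seq)) > min_bp


if __name__ == "__main__":
    contig = "CC" + "ATG" + "GCT" * 100 + "TAA" + "CC"
    print(len(longest_orf(contig)), has_long_orf(contig))
```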

  2. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665

  3. Expressed sequence tag analysis of guinea pig (Cavia porcellus) eye tissues for NEIBank

    PubMed Central

    Simpanya, Mukoma F.; Wistow, Graeme; Gao, James; David, Larry L.; Giblin, Frank J.

    2008-01-01

    Purpose To characterize gene expression patterns in guinea pig ocular tissues and identify orthologs of human genes from NEIBank expressed sequence tags. Methods RNA was extracted from dissected eye tissues of 2.5-month-old guinea pigs to make three unamplified and unnormalized cDNA libraries in the pCMVSport-6 vector for the lens, retina, and eye minus lens and retina. Over 4,000 clones were sequenced from each library and were analyzed using GRIST for clustering and gene identification. Lens crystallin EST data were validated using two-dimensional electrophoresis (2-DE), matrix assisted laser desorption (MALDI), and electrospray ionization mass spectrometry (ESIMS). Results Combined data from the three libraries generated a total of 6,694 distinctive gene clusters, with each library having between 1,000 and 3,000 clusters. Approximately 60% of the total gene clusters were novel cDNA sequences and had significant homologies to other mammalian sequences in GenBank. Complete cDNA sequences were obtained for many guinea pig lens proteins, including αA/αAinsert-, γN-, and γS-crystallins, lengsin and GRIFIN. The ratio of αA- to αB-crystallin on 2-DE gels was 8: 1 in the lens nucleus and 6.5: 1 in the cortex. Analysis of ESTs, genome sequence, and proteins (by MALDI), did not reveal any evidence for the presence of γD-, γE-, and γF-crystallin in the guinea pig. Predicted masses of many guinea pig lens crystallins were confirmed by ESIMS analysis. For the retina, orthologs of human phototransduction genes were found, such as Rhodopsin, S-antigen (Sag, Arrestin), and Transducin. The guinea-pig ortholog of NRL, a key rod photoreceptor-specific transcription factor, was also represented in EST data. In the ‘rest-of-eye’ library, the most abundant transcripts included decorin and keratin 12, representative of the cornea. Conclusions Genomic analysis of guinea pig eye tissues provides sequence-verified clones for future studies. Guinea pig orthologs of many human

  4. Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii

    PubMed Central

    Weng, Jing-Ke; Tanurdzic, Milos; Chapple, Clint

    2005-01-01

    Background The lycophyte Selaginella moellendorffii is a member of one of the oldest lineages of vascular plants on Earth. Fossil records show that the lycophyte clade arose 400 million years ago, 150–200 million years earlier than angiosperms, a group of plants that includes the well-studied flowering plant Arabidopsis thaliana. S. moellendorffii has a genome size of approximately 100 Mbp, as small or smaller than that of A. thaliana. S. moellendorffii has the potential to provide significant comparative information to better understand the evolution of vascular plants. Results We sequenced 2181 Expressed Sequence Tags (ESTs) from a S. moellendorffii cDNA library. One thousand three hundred and one non-redundant sequences were assembled, containing 291 contigs and 1010 singletons. Approximately 75% of the ESTs matched proteins in the non-redundant protein database. Among 1301 clusters, 343 were categorized according to Gene Ontology (GO) hierarchy and were compared to the GO mapping of A. thaliana tentative consensus sequences. We compared S. moellendorffii ESTs to the A. thaliana and Physcomitrella patens EST databases, using the tBLASTX algorithm. Approximately 60% of the ESTs exhibited similarity with both A. thaliana and P. patens ESTs; whereas, 13% and 1% of the ESTs had exclusive similarity with A. thaliana and P. patens ESTs, respectively. A substantial proportion of the ESTs (26%) had no match with A. thaliana or P. patens ESTs. Conclusion We discovered 1301 putative unigenes in S. moellendorffii. These results give an initial insight into its transcriptome that will aid in the study of the S. moellendorffii genome in the near future. PMID:15938755

  5. Development of polymorphic expressed sequence tag-single sequence repeat markers in the common Chinese cuttlefish, Sepiella maindroni.

    PubMed

    Li, R H; Lu, S K; Zhang, C L; Song, W W; Mu, C K; Wang, C L

    2014-07-25

    The common Chinese cuttlefish (Sepiella maindroni) is one of the popular edible cephalopods consumed across Asia. To facilitate the population genetic investigation of this species, we developed fourteen polymorphic microsatellite markers from expressed sequence tags of S. maindroni. The number of alleles at each locus ranged from 6 to 10 with an average of 7.9 alleles per locus. The ranges of observed and expected heterozygosity were from 0.615 to 0.962 and 0.685 to 0.888, respectively. Four loci were found to deviate significantly from Hardy-Weinberg equilibrium. The polymorphism information content ranged from 0.638 to 0.833. These polymorphic microsatellite loci will be helpful for the population genetic, genetic linkage map, and other genetic studies of S. maindroni.
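
    The heterozygosity and polymorphism information content (PIC) values quoted in this abstract are per-locus summary statistics computed from allele frequencies. The abstract does not say which estimators or software were used, so the following is only a minimal sketch assuming the textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and the widely used PIC formula). The function names and the example frequencies are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), assuming allele frequencies that sum to 1."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    squares = [p * p for p in freqs]
    cross = sum(2.0 * a * b for a, b in combinations(squares, 2))
    return 1.0 - sum(squares) - cross


def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles differ at a locus."""
    hets = sum(1 for a, b in genotypes if a != b)
    return hets / len(genotypes)


# Hypothetical locus with two equally frequent alleles (illustration only).
example_freqs = [0.5, 0.5]
he = expected_heterozygosity(example_freqs)
pic = polymorphism_information_content(example_freqs)
```

    Published studies often apply a sample-size correction to expected heterozygosity, so values from this sketch may differ slightly from those a population genetics package would report.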

  6. Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa

    PubMed Central

    2012-01-01

    Background Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, there are hardly any genetic studies performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results Successfully, we developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice more abundant in UTRs (UnTranslated Regions) compared to coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs: lily and tulip (12,017 groups) and among the three monocot species: lily, tulip, and rice (6,900 groups) were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSR. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology. Conclusions Two transcriptome sets

  7. Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...

  8. Expressed sequence tags reveal genetic diversity and putative virulence factors of the pathogenic oomycete Pythium insidiosum.

    PubMed

    Krajaejun, Theerapong; Khositnithikul, Rommanee; Lerksuthirat, Tassanee; Lowhnoo, Tassanee; Rujirawat, Thidarat; Petchthong, Thanom; Yingyong, Wanta; Suriyaphol, Prapat; Smittipat, Nat; Juthayothin, Tada; Phuntumart, Vipaporn; Sullivan, Thomas D

    2011-07-01

    Oomycetes are unique eukaryotic microorganisms that share a mycelial morphology with fungi. Many oomycetes are pathogenic to plants, and a more limited number are pathogenic to animals. Pythium insidiosum is the only oomycete that is capable of infecting both humans and animals, and causes a life-threatening infectious disease, called "pythiosis". In the majority of pythiosis patients life-long handicaps result from the inevitable radical excision of infected organs, and many die from advanced infection. Better understanding P. insidiosum pathogenesis at molecular levels could lead to new forms of treatment. Genetic and genomic information is lacking for P. insidiosum, so we have undertaken an expressed sequence tag (EST) study, and report on the first dataset of 486 ESTs, assembled into 217 unigenes. Of these, 144 had significant sequence similarity with known genes, including 47 with ribosomal protein homology. Potential virulence factors included genes involved in antioxidation, thermal adaptation, immunomodulation, and iron and sterol binding. Effectors resembling pathogenicity factors of plant-pathogenic oomycetes were also discovered, such as, a CBEL-like protein (possible involvement in host cell adhesion and hemagglutination), a putative RXLR effector (possibly involved in host cell modulation) and elicitin-like (ELL) proteins. Phylogenetic analysis mapped P. insidiosum ELLs to several novel clades of oomycete elicitins (ELIs), and homology modeling predicted that P. insidiosum ELLs should bind sterols. Most of the P. insidiosum ESTs showed homology to sequences in the genome or EST databases of other oomycetes, but one putative gene, with unknown function, was found to be unique to P. insidiosum. The EST dataset reported here represents the first steps in identifying genes of P. insidiosum and beginning transcriptome analysis. This genetic information will facilitate understanding of pathogenic mechanisms of this devastating pathogen.

  9. Single nucleotide polymorphism discovery from expressed sequence tags in the waterflea Daphnia magna

    PubMed Central

    2011-01-01

    Background Daphnia (Crustacea: Cladocera) plays a central role in standing aquatic ecosystems, has a well known ecology and is widely used in population studies and environmental risk assessments. Daphnia magna is, especially in Europe, intensively used to study stress responses of natural populations to pollutants, climate change, and antagonistic interactions with predators and parasites, which have all been demonstrated to induce micro-evolutionary and adaptive responses. Although its ecology and evolutionary biology are intensively studied, little is known about the functional genomics underpinning phenotypic responses to environmental stressors. The aim of the present study was to find genes expressed in the presence of environmental stressors, and target such genes for single nucleotide polymorphic (SNP) marker development. Results We developed three expressed sequence tag (EST) libraries using clonal lineages of D. magna exposed to ecological stressors, namely fish predation, parasite infection and pesticide exposure. We used these newly developed ESTs and other Daphnia ESTs retrieved from NCBI GenBank to mine for SNP markers targeting synonymous as well as non synonymous genetic variation. We validate the developed SNPs in six natural populations of D. magna distributed at regional scale. Conclusions A large proportion (47%) of the produced ESTs are Daphnia lineage specific genes, which are potentially involved in responses to environmental stress rather than to general cellular functions and metabolic activities, or reflect the arthropod's aquatic lifestyle. The characterization of genes expressed under stress and the validation of their SNPs for population genetic study is important for identifying ecologically responsive genes in D. magna. PMID:21668940

  10. Expressed sequence tag analysis in Cycas, the most primitive living seed plant

    PubMed Central

    Brenner, Eric D; Stevenson, Dennis W; McCombie, Richard W; Katari, Manpreet S; Rudd, Stephen A; Mayer, Klaus FX; Palenchar, Peter M; Runko, Suzan J; Twigg, Richard W; Dai, Guangwei; Martienssen, Rob A; Benfey, Phillip N; Coruzzi, Gloria M

    2003-01-01

    Background Cycads are ancient seed plants (living fossils) with origins in the Paleozoic. Cycads are sometimes considered a 'missing link' as they exhibit characteristics intermediate between vascular non-seed plants and the more derived seed plants. Cycads have also been implicated as the source of 'Guam's dementia', possibly due to the production of S(+)-beta-methyl-alpha, beta-diaminopropionic acid (BMAA), which is an agonist of animal glutamate receptors. Results A total of 4,200 expressed sequence tags (ESTs) were created from Cycas rumphii and clustered into 2,458 contigs, of which 1,764 had low-stringency BLAST similarity to other plant genes. Among those cycad contigs with similarity to plant genes, 1,718 cycad 'hits' are to angiosperms, 1,310 match genes in gymnosperms and 734 match lower (non-seed) plants. Forty-six contigs were found that matched only genes in lower plants and gymnosperms. Upon obtaining the complete sequence from the clones of 37/46 contigs, 14 still matched only gymnosperms. Among those cycad contigs common to higher plants, ESTs were discovered that correspond to those involved in development and signaling in present-day flowering plants. We purified a cycad EST for a glutamate receptor (GLR)-like gene, as well as ESTs potentially involved in the synthesis of the GLR agonist BMAA. Conclusions Analysis of cycad ESTs has uncovered conserved and potentially novel genes. Furthermore, the presence of a glutamate receptor agonist, as well as a glutamate receptor-like gene in cycads, supports the hypothesis that such neuroactive plant products are not merely herbivore deterrents but may also serve a role in plant signaling. PMID:14659015

  11. Changes on microsatellites of expressed sequence tag of sugarcane (Saccharum spp) during vegetative propagation.

    PubMed

    Augusto, R; Maranho, R C; Mangolin, C A; Filho, J C Bespalhok; Machado, M F P S

    2017-03-08

    The reduction in sugarcane productivity in subsequent cutting stages may be related to a gradual decrease of the allele number and mean observed heterozygosity (HO) in the sugarcane ratoon. This hypothesis was tested assessing the number of alleles and HO values in 10 expressed sequence tag microsatellites (Est-SSR loci) of the sugarcane varieties RB72454 and RB867515 in different cutting stages. Changes of allele numbers in samples of different cutting stages were observed in seven and six Est-SSR loci of the RB72454 and RB867515 varieties, respectively. Reduction of allele numbers was observed in the samples collected in the fourth and sixth cutting stages of the RB72454 variety. In contrast, an increase of the allele numbers was detected in the samples collected in the fourth, sixth, and seventh cutting stages of the RB867515 variety. Unchanged allele numbers were observed only in EstB41, EstC84, and EstB130 loci of the RB72454 variety, and EstB41, EstC67, EstA68, and EstB130 loci of the RB867515 variety. The variety RB867515 has lower polymorphism and values of HO than the RB72454 variety in different stages of cutting. At molecular level, in Est-SSR loci, the RB72454 variety showed higher changes in subsequent stages of cutting than RB867515. The similarities and divergences at molecular level between varieties RB72454 and RB867515 observed in the 10 Est-SSR loci during subsequent cutting stages cannot explain the reduced productivity frequently observed after subsequent cutting stages but showed that phenotypic and physiological changes after each cutting stage are also accompanied by changes at genomic level.

  12. Analysis of expressed sequence tags from the venom ducts of Conus striatus: focusing on the expression profile of conotoxins.

    PubMed

    Pi, Canhui; Liu, Yun; Peng, Can; Jiang, Xiuhua; Liu, Junliang; Xu, Bin; Yu, Xuesong; Yu, Yanghong; Jiang, Xiaoyu; Wang, Lei; Dong, Meiling; Chen, Shangwu; Xu, An-Long

    2006-02-01

    Cone snails (genus Conus) are predatory marine gastropods that use venom peptides for interacting with prey, predators and competitors. A majority of these peptides, generally known as conotoxins, demonstrate striking selectivity in targeting specific subtypes of ion channels and neurotransmitter receptors. They are therefore not only useful tools in neuroscience to characterize receptors and receptor subtypes, but also offer great potential in new drug research and development. Here, a cDNA library from the venom ducts of a fish-hunting cone snail species, Conus striatus, is described for the generation of expressed sequence tags (ESTs). A total of 429 ESTs were grouped into 137 clusters or singletons. Among these sequences, 221 were toxin sequences, accounting for 52.1% (corresponding to 19 clusters) of all transcripts. A-superfamily (132 ESTs) and O-superfamily conotoxins (80 ESTs) constitute the predominant toxin components. Some non-disulfide-rich Conus peptides were also found. The expression profile of conotoxins also explained to some extent the pharmacological and physiological reactions elicited by this typical piscivorous species. For the first time, a nonstop transcript of conotoxin was identified, which suggests that alternative polyadenylation may be a means of post-transcriptional regulation of conotoxin production. A comparison analysis of these conotoxins reveals the different variation and divergence patterns in these two superfamilies. Our investigations indicate that focal hyper-mutation, block substitution and exon shuffling are three main mechanisms leading to the conotoxin diversity in a species. The comprehensive set of Conus gene sequences allowed the identification of the representative classes of conotoxins and related components, which may lay the foundation for further research and development of conotoxins.

  13. Identification of expressed resistance gene analogs from peanut (Arachis hypogaea L.) expressed sequence tags

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cultivated peanut is an important source of protein and oil. However, low genetic diversity makes peanut vulnerable to many diseases. Several hundred of partial genomic DNA sequences targeting nucleotide-binding-site leucine-rich repeat (NBS-LRR) resistance (R) genes have been reported. Only a small...

  14. Expression profiling of salinity-alkali stress responses by large-scale expressed sequence tag analysis in Tamarix hispid.

    PubMed

    Gao, Caiqiu; Wang, Yucheng; Liu, Guifeng; Yang, Chuanping; Jiang, Jing; Li, Huiyu

    2008-02-01

    Tamarix hispida, a woody halophyte, thrives in saline and saline-alkali soil. To better understand the gene expression profiles that manifest in response to saline-alkali stress, three cDNA libraries were constructed from leaf tissue of T. hispida plants that were well watered or exposed to NaHCO3 for 24 or 52 h. A total of 9,447 high quality expressed sequence tags (ESTs) were obtained from the three libraries. These ESTs represent 3,945 unigenes, including 986 contigs and 2,959 singlets. The numbers of unigenes obtained from the three libraries were 1,752, 1,558 and 1,675, respectively. EST analysis was performed to compare gene expression among the three cDNA libraries, and the differentially expressed transcripts responsive to NaHCO3 were identified. The up-regulated genes were involved in a variety of functional areas, such as stress-related proteins, hormone signal transduction, antioxidative response, transcriptional regulators, protein synthesis and destination, ion homeostasis, photosynthesis and metabolism. The results indicated that the response to NaHCO3 in T. hispida is a complex one, involving multiple physiological and metabolic pathways. Nine gene expression patterns were compared in response to NaHCO3 and NaCl using real time reverse transcription-polymerase chain reaction (RT-PCR). Gene expression trends were similar after a 24-h exposure to either NaCl or NaHCO3; however, great variability was found after a 52-h exposure, indicating that short-term responses to either salt may not be obviously different.

  15. Analyses of Expressed Sequence Tags from the Maize Foliar Pathogen Cercospora Zeae-Maydis Identifying Novel Genes expressed during Vegetative, Infectious, & Reproductive Growth

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial yield losses annually throughout the western hemisphere. To learn more about the molecular regulation of pathogenesis in C. zeae-maydis, we generated a collection of expressed sequence tags (ESTs) and...

  16. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    PubMed

    Whitfield, Charles W; Band, Mark R; Bonaldo, Maria F; Kumar, Charu G; Liu, Lei; Pardinas, Jose R; Robertson, Hugh M; Soares, M Bento; Robinson, Gene E

    2002-04-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BI502708-BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.]

  17. A survey of canine expressed sequence tags and a display of their annotations through a flexible web-based interface.

    PubMed

    Palmer, L E; O'Shaughnessy, A L; Preston, R R; Santos, L; Balija, V S; Nascimento, L U; Zutavern, T L; Henthorn, P S; Hannon, G J; McCombie, W R

    2003-01-01

    We have initially sequenced approximately 8,000 canine expressed sequence tags (ESTs) from several complementary DNA (cDNA) libraries: testes, whole brain, and Madin-Darby canine kidney (MDCK) cells. Analysis of these sequences shows that they provide partial sequence information for about 5%-10% of the canine genes. An analysis pipeline has been created to cluster the ESTs and to map individual ESTs as well as clustered ESTs to both the human genome and the human proteome. Gene ontology (GO) terms have been assigned to the ESTs and clusters based on their top matches to the International Protein Index (IPI) set of human proteins. The data generated is stored in a MySQL relational database for analysis and display. A Web-based Perl script has been written to display the analyzed data to the scientific community.

  18. Expressed sequence tags and molecular cloning and characterization of gene encoding pinoresinol/lariciresinol reductase from Podophyllum hexandrum.

    PubMed

    Wankhede, Dhammaprakash Pandhari; Biswas, Dipul Kumar; Rajkumar, Subramani; Sinha, Alok Krishna

    2013-12-01

    Podophyllotoxin, an aryltetralin lignan, is the source of important anticancer drugs etoposide, teniposide, and etopophos. Roots/rhizome of Podophyllum hexandrum form one of the most important sources of podophyllotoxin. In order to understand genes involved in podophyllotoxin biosynthesis, two suppression subtractive hybridization libraries were synthesized, one each from root/rhizome and leaves using high and low podophyllotoxin-producing plants of P. hexandrum. Sequencing of clones identified a total of 1,141 Expressed Sequence Tags (ESTs) resulting in 354 unique ESTs. Several unique ESTs showed sequence similarity to the genes involved in metabolism, stress/defense responses, and signalling pathways. A few ESTs also showed high sequence similarity with genes which were shown to be involved in podophyllotoxin biosynthesis in other plant species such as pinoresinol/lariciresinol reductase. A full length coding sequence of pinoresinol/lariciresinol reductase (PLR) has been cloned from P. hexandrum which was found to encode protein with 311 amino acids and show sequence similarity with PLR from Forsythia intermedia and Linum spp. Spatial and stress-inducible expression pattern of PhPLR and other known genes of podophyllotoxin biosynthesis, secoisolariciresinol dehydrogenase (PhSDH), and dirigent protein oxidase (PhDPO) have been studied. All the three genes showed wounding and methyl jasmonate-inducible expression pattern. The present work would form a basis for further studies to understand genomics of podophyllotoxin biosynthesis in P. hexandrum.

  19. Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

    PubMed

    Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

    2010-02-01

    Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST sequence. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.
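
    Screening ESTs for Class I SSRs, as this abstract describes, amounts to scanning each sequence for perfect tandem repeats of a short motif and keeping those above a length threshold. The abstract does not name the tool or the exact criteria, so the sketch below is illustrative only: it assumes Class I means a repeat tract of at least 20 bp and motifs of 1 to 6 nt, and the function names are hypothetical.

```python
import re

# Assumed threshold: Class I SSRs are commonly taken to be tracts >= 20 bp.
MIN_TRACT_BP = 20


def find_ssrs(seq, min_len=MIN_TRACT_BP, max_motif=6):
    """Return (start, motif, tract_length) for perfect repeats >= min_len bp.

    The motif group is lazy, so the shortest repeating unit is reported
    (for example 'AT' rather than 'ATAT').
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1+" % max_motif)
    hits = []
    for match in pattern.finditer(seq.upper()):
        tract = match.group(0)
        if len(tract) >= min_len:
            hits.append((match.start(), match.group(1), len(tract)))
    return hits


def kb_per_ssr(n_ssrs, total_bp):
    """Density expressed as kilobases of sequence per SSR found."""
    return total_bp / 1000 / n_ssrs


# Hypothetical toy sequence: an (AT)11 tract flanked by two bases on each side.
toy = "CG" + "AT" * 11 + "CG"
toy_hits = find_ssrs(toy)
```

    This simple scan reports only perfect repeats and resolves overlapping candidates left to right, so dedicated SSR-mining programs, which also handle imperfect and compound repeats, may give different counts.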

  20. Expressed sequence tags in cultivated peanut (Arachis hypogaea): discovery of genes in seed development and response to Ralstonia solanacearum challenge.

    PubMed

    Huang, Jiaquan; Yan, Liying; Lei, Yong; Jiang, Huifang; Ren, Xiaoping; Liao, Boshou

    2012-11-01

    Although an important oil crop, peanut has only 162,030 expressed sequence tags (ESTs) publicly available, 86,943 of which are from cultivated plants. More ESTs from cultivated peanuts are needed for isolation of stress-resistant, tissue-specific and developmentally important genes. Here, we generated 63,234 ESTs from our 5 constructed peanut cDNA libraries of Ralstonia solanacearum challenged roots, R. solanacearum challenged leaves, and unchallenged cultured peanut roots, leaves and developing seeds. Among these ESTs, there were 14,547 unique sequences with 7,961 tentative consensus sequences and 6,586 singletons. Putative functions for 47.8% of the sequences were identified, including transcription factors, tissue-specific genes, genes involved in fatty acid biosynthesis and oil formation regulation, and resistance gene analogue genes. Additionally, differentially expressed genes, including those involved in ethylene and jasmonic acid signal transduction pathways, from both peanut leaves and roots, were identified in R. solanacearum challenged samples. This large expression dataset from different peanut tissues will be a valuable source for marker development and gene expression analysis. It will also be helpful for finding candidate genes for fatty acid synthesis and oil formation regulation as well as for studying mechanisms of interactions between the peanut host and R. solanacearum pathogen.

  1. Cell-free translational screening of an expression sequence tag library of Clonorchis sinensis for novel antigen discovery.

    PubMed

    Kasi, Devi; Catherine, Christy; Lee, Seung-Won; Lee, Kyung-Ho; Kim, Yu Jung; Ro Lee, Myeong; Ju, Jung Won; Kim, Dong-Myung

    2017-01-27

    The rapidly evolving cloning and sequencing technologies have enabled understanding of genomic structure of parasite genomes, opening up new ways of combatting parasite-related diseases. To make the most of the exponentially accumulating genomic data, however, it is crucial to analyze the proteins encoded by these genomic sequences. In this study, we adopted an engineered cell-free protein synthesis system for large-scale expression screening of an expression sequence tag (EST) library of Clonorchis sinensis to identify potential antigens that can be used for diagnosis and treatment of clonorchiasis. To allow high-throughput expression and identification of individual genes comprising the library, a cell-free synthesis reaction was designed such that both the template DNA and the expressed proteins were co-immobilized on the same microbeads, leading to microbead-based linkage of the genotype and phenotype. This reaction configuration allowed streamlined expression, recovery, and analysis of proteins. This approach enabled us to identify 21 antigenic proteins. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 2017.

  2. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. Cluster and assembly analysis yielded 1053 unique sequences, including 329 contigs and 724 singletons. BLAST analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). These 593 unique sequences were classified based on COG and Gene Ontology (GO). Through comparison with previous studies, 153 genes assembled from 367 ESTs were identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732

  3. Generation of expressed sequence tags from low-CO2 and high-CO2 adapted cells of Chlamydomonas reinhardtii.

    PubMed

    Asamizu, E; Miura, K; Kucho, K; Inoue, Y; Fukuzawa, H; Ohyama, K; Nakamura, Y; Tabata, S

    2000-10-31

    To characterize genes whose expression is induced in carbon-stress conditions, 12,969 and 13,450 5'-end expressed sequence tags (ESTs) were generated from cells grown in low-CO2 and high-CO2 conditions of the unicellular green alga, Chlamydomonas reinhardtii. These ESTs were clustered into 4436 and 3566 non-redundant EST groups, respectively. Comparison of their sequences with those of 3433 non-redundant ESTs previously generated from the cells under the standard growth condition indicated that 2665 and 1879 EST groups occurred only in the low-CO2 and high-CO2 populations, respectively. It was also noted that 96.2% and 96.0% of the cDNA species respectively obtained from the low-CO2 and high-CO2 conditions had no similar EST sequence deposited in the public databases. The EST species identified only in the low-CO2 treated cells included genes previously reported to be expressed specifically in low-CO2 acclimatized cells, suggesting that the ESTs generated in this study will be a useful source for analysis of genes related to carbon-stress acclimatization. The sequence information and search results of each clone will appear at the web site: http://www.kazusa.or.jp/en/plant/chlamy/EST/.

  4. Annotated Expressed Sequence Tags and cDNA Microarrays for Studies of Brain and Behavior in the Honey Bee

    PubMed Central

    Whitfield, Charles W.; Band, Mark R.; Bonaldo, Maria F.; Kumar, Charu G.; Liu, Lei; Pardinas, Jose R.; Robertson, Hugh M.; Soares, M. Bento; Robinson, Gene E.

    2002-01-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to Genbank data library under accession nos. BI502708–BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.] PMID:11932240

  5. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  6. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  7. Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    PubMed Central

    Yockteng, Roxana; Marthey, Sylvain; Chiapello, Hélène; Gendrault, Annie; Hood, Michael E; Rodolphe, François; Devier, Benjamin; Wincker, Patrick; Dossat, Carole; Giraud, Tatiana

    2007-01-01

    Background The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics. PMID:17692127

  8. Development of expressed sequence tag resources for Vanda Mimi Palmer and data mining for EST-SSR.

    PubMed

    Teh, Seow-Ling; Chan, Wai-Sun; Abdullah, Janna Ong; Namasivayam, Parameswari

    2011-08-01

    Vanda Mimi Palmer (VMP) is a highly sought-after fragrant orchid hybrid in Malaysia. It is economically important in the cosmetic and beauty industries and is also a famous potted ornamental plant. To date, no work on fragrance-related genes of vandaceous orchids has been reported by other research groups, although floral fragrance and volatiles have been extensively studied. An expressed sequence tag (EST) resource was developed for VMP principally to mine any potential fragrance-related expressed sequence tag-simple sequence repeats (EST-SSRs) for future development as markers in the identification of fragrant vandaceous orchids endemic to Malaysia. Clustering, annotation and assembly of the ESTs identified 1,196 unigenes, comprising 966 singletons and 230 contigs. The VMP dbEST was functionally classified by gene ontology (GO) into three groups: molecular functions (51.2%), cellular components (16.4%) and biological processes (24.6%), while the remaining 7.8% showed no hits with a GO identifier. A total of 112 EST-SSRs (9.4%) were mined, in which at least five units of di-, tri-, tetra-, penta-, or hexa-nucleotide repeats were predicted. Di-nucleotide motif repeats were the most frequent among the detected SSRs, with the AT/TA types the most abundant among the dimerics, while the AAG/TTC and AGA/TCT types were the most frequent trimerics. The mined EST-SSRs are expected to be useful in the development of EST-SSR markers applicable to the screening and characterization of fragrance-related transcripts in closely related species.
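
    The repeat criterion stated in this abstract (at least five units of a di- to hexa-nucleotide motif) can be illustrated with a short Python sketch. It is not the authors' pipeline, and the abstract does not name the tool they used; the function name find_ssrs and the choice to report only perfect, uninterrupted repeats are assumptions made for illustration.

```python
# Illustrative sketch only: scans one sequence for perfect tandem repeats.
# MIN_UNITS follows the "at least five units" criterion quoted in the abstract;
# everything else (names, handling of overlaps) is a hypothetical choice.
MIN_UNITS = 5


def find_ssrs(seq, min_units=MIN_UNITS):
    """Return a list of (start, motif, units) for di- to hexa-nucleotide repeats."""
    seq = seq.upper()
    hits = []
    pos = 0
    while pos < len(seq):
        found = None
        # Try the shortest motif first so "ATAT" repeats are reported as "AT".
        for k in range(2, 7):
            motif = seq[pos:pos + k]
            # Skip truncated motifs and homopolymers such as "AA" or "AAA".
            if len(motif) < k or len(set(motif)) == 1:
                continue
            units = 1
            while seq[pos + units * k:pos + (units + 1) * k] == motif:
                units += 1
            if units >= min_units:
                found = (pos, motif, units)
                break
        if found:
            hits.append(found)
            # Jump past the repeat so it is not reported again in another phase.
            pos += len(found[1]) * found[2]
        else:
            pos += 1
    return hits


if __name__ == "__main__":
    example = "GGC" + "AT" * 6 + "CCT" + "AAG" * 5 + "T"
    for start, motif, units in find_ssrs(example):
        print(start, motif, units)
```

    Trying the shortest motif first is a design choice that keeps a run such as ATATATATAT classified as a dinucleotide repeat, which matches how the abstract tallies motifs by unit length.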

  9. Identification of anhydrobiosis-related genes from an expressed sequence tag database in the cryptobiotic midge Polypedilum vanderplanki (Diptera; Chironomidae).

    PubMed

    Cornette, Richard; Kanamori, Yasushi; Watanabe, Masahiko; Nakahara, Yuichi; Gusev, Oleg; Mitsumasu, Kanako; Kadono-Okuda, Keiko; Shimomura, Michihiko; Mita, Kazuei; Kikawada, Takahiro; Okuda, Takashi

    2010-11-12

    Some organisms are able to survive the loss of almost all their body water content, entering a latent state known as anhydrobiosis. The sleeping chironomid (Polypedilum vanderplanki) lives in the semi-arid regions of Africa, and its larvae can survive desiccation in an anhydrobiotic form during the dry season. To unveil the molecular mechanisms of this resistance to desiccation, an anhydrobiosis-related Expressed Sequence Tag (EST) database was obtained from the sequences of three cDNA libraries constructed from P. vanderplanki larvae after 0, 12, and 36 h of desiccation. The database contained 15,056 ESTs distributed into 4,807 UniGene clusters. ESTs were classified according to gene ontology categories, and putative expression patterns were deduced for all clusters on the basis of the number of clones in each library; expression patterns were confirmed by real-time PCR for selected genes. Among up-regulated genes, antioxidants, late embryogenesis abundant (LEA) proteins, and heat shock proteins (Hsps) were identified as important groups for anhydrobiosis. Genes related to trehalose metabolism and various transporters were also strongly induced by desiccation. Those results suggest that the oxidative stress response plays a central role in successful anhydrobiosis. Similarly, protein denaturation and aggregation may be prevented by marked up-regulation of Hsps and the anhydrobiosis-specific LEA proteins. A third major feature is the predicted increase in trehalose synthesis and in the expression of various transporter proteins allowing the distribution of trehalose and other solutes to all tissues.

  10. Pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of Plasmodium vivax in human patients

    PubMed Central

    Merino, Emilio F; Fernandez-Becerra, Carmen; Madeira, Alda MBN; Machado, Ariane L; Durham, Alan; Gruber, Arthur; Hall, Neil; del Portillo, Hernando A

    2003-01-01

    Background Plasmodium vivax is the most widely distributed human malaria, responsible for 70–80 million clinical cases each year and large socio-economical burdens for countries such as Brazil where it is the most prevalent species. Unfortunately, due to the impossibility of growing this parasite in continuous in vitro culture, research on P. vivax remains largely neglected. Methods A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low quality reads. A total of 806 sequences with an average length of 586 bp met such criteria and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10^-30 was used to define a significant database match. ESTs were manually assigned a gene ontology (GO) terminology. Results A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and a GO terminology was assigned to 164 of them. Conclusion These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC-content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and to create a gene index of this malaria parasite. PMID:12914668

  11. Expressed sequence tag analysis of Antarctic hairgrass Deschampsia antarctica from King George Island, Antarctica.

    PubMed

    Lee, Hyoungseok; Cho, Hyun Hee; Kim, Il-Chan; Yim, Joung Han; Lee, Hong Kum; Lee, Yoo Kyung

    2008-04-30

    Deschampsia antarctica is the only monocot that thrives in the tough conditions of the Antarctic region. It is an invaluable resource for the identification of genes associated with tolerance to various environmental pressures. In order to identify genes that are differentially regulated between greenhouse-grown and Antarctic field-grown plants, we initiated a detailed gene expression analysis. Antarctic plants were collected and greenhouse plants served as controls. Two different cDNA libraries were constructed with these plants. A total of 2,112 cDNA clones was sequenced and grouped into 1,199 unigene clusters consisting of 243 consensus and 956 singleton sequences. Using similarity searches against several public databases, we constructed a functional classification of the ESTs into categories such as genes related to responses to stimuli, as well as photosynthesis and metabolism. Real-time PCR analysis of various stress responsive genes revealed different patterns of regulation in the different environments, suggesting that these genes are involved in responses to specific environmental factors.

  12. Digital cloning: identification of human cDNAs homologous to novel kinases through expressed sequence tag database searching.

    PubMed

    Chen, H C; Kung, H J; Robinson, D

    1998-01-01

    Identification of novel kinases based on their sequence conservation within the kinase catalytic domain has relied so far on two major approaches: low-stringency hybridization of cDNA libraries and PCR using degenerate primers. Both of these approaches are at times technically difficult and time-consuming. We have developed a procedure that can significantly reduce the time and effort involved in searching for novel kinases and increase the sensitivity of the analysis. This procedure exploits the computer analysis of a vast resource of human cDNA sequences represented in the expressed sequence tag (EST) database. Seventeen novel human cDNA clones showing significant homology to serine/threonine kinases, including STE-20, CDK- and YAK-related family kinases, were identified by searching the EST database. Further sequence analysis of these novel kinases obtained either directly from EST clones or from PCR-RACE products confirmed their identity as protein kinases. Given the rapid accumulation of the EST database and the advent of powerful computer analysis software, this approach provides a fast, sensitive, and economical way to identify novel kinases as well as other genes from the EST database.

  13. Construction of a Lotus japonicus late nodulin expressed sequence tag library and identification of novel nodule-specific genes.

    PubMed Central

    Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J

    1997-01-01

    A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951

  14. Development of Expressed Sequence Tag (EST)-based Cleaved Amplified Polymorphic Sequence (CAPS) markers of tea plant and their application to cultivar identification.

    PubMed

    Ujihara, Tomomi; Taniguchi, Fumiya; Tanaka, Jun-Ichi; Hayashi, Nobuyuki

    2011-03-09

    To develop cleaved amplified polymorphic sequence (CAPS) markers for cultivar identification of the tea leaf, 5 primer pairs designed on the basis of genes that encode proteins related to nitrogen assimilation and 26 primer pairs based on expressed sequence tag (EST) sequences of the root of tea plant were screened. From combinations of primer pair and restriction enzyme that showed polymorphism among tea plants, 16 markers were selected and applied to DNA fingerprinting of Japanese tea cultivars. Sixty-three cultivars, except for a bud sport (Kiraka) and its original cultivar (Yabukita) and a pair that was the progeny of the same crossing parent (Harumoegi and Sakimidori), were distinguished from one another. By combining the 16 markers with previously developed CAPS markers and observing the physical appearance, 67 cultivars were distinguishable. The cultivars involve approximately 95% of total tea cultivating area in Japan; therefore, about 95% of tea leaves produced in Japan can be authenticated by labeling their cultivars.

  15. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat

    PubMed Central

    Goswami, Suneha; Kumar, Ranjeet R.; Dubey, Kavita; Singh, Jyoti P.; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C.; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C.; Kala, Yugal K.; Singh, Gyanendra P.; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D.

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger's sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. Identified ESTs were searched by BLAST against the Ensemble, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helixes (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of

  16. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat.

    PubMed

    Goswami, Suneha; Kumar, Ranjeet R; Dubey, Kavita; Singh, Jyoti P; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C; Kala, Yugal K; Singh, Gyanendra P; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger's sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. Identified ESTs were searched by BLAST against the Ensemble, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helixes (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of "climate-smart" wheat.

  17. Identification and Validation of Expressed Sequence Tags from Pigeonpea (Cajanus cajan L.) Root

    PubMed Central

    Kumar, Ravi Ranjan; Yadav, Shailesh; Joshi, Shourabh; Bhandare, Prithviraj P.; Patil, Vinod Kumar; Kulkarni, Pramod B.; Sonkawade, Swati; Naik, G. R.

    2014-01-01

    Pigeonpea (Cajanus cajan (L) Millsp.) is an important food legume crop of rain fed agriculture in the arid and semiarid tropics of the world. It has a deep and extensive root system, which serves a number of important physiological and metabolic functions in plant development and growth. In order to identify genes associated with pigeonpea root, ESTs were generated from the root tissues of pigeonpea (GRG-295 genotype) using a normalized cDNA library. A total of 105 high quality ESTs were generated by sequencing of 250 random clones, which resulted in 72 unigenes comprising 25 contigs and 47 singlets. The ESTs were assigned to 9 functional categories on the basis of their putative function. In order to validate the possible expression of transcripts, four genes, namely, S-adenosylmethionine synthetase, phosphoglycerate kinase, serine carboxypeptidase, and methionine aminopeptidase, were further analyzed by reverse transcriptase PCR. The identified transcripts and their possible root-associated functions will also be a valuable resource for functional genomics studies in legume crops. PMID:24895494

  18. Generation of 10,154 expressed sequence tags from a leafy gametophyte of a marine red alga, Porphyra yezoensis.

    PubMed

    Nikaido, I; Asamizu, E; Nakajima, M; Nakamura, Y; Saga, N; Tabata, S

    2000-06-30

    A total of 10,154 5'-end expressed sequence tags (EST) were established from the normalized and size-selected cDNA libraries of a marine red alga, Porphyra yezoensis. Among the ESTs, 2140 were unique species, and the remaining 8014 were grouped into 1127 species. Database search of the 3267 non-redundant ESTs by BLAST algorithm showed that the sequences of 1080 species (33.1%) have similarity to those of registered genes from various organisms including higher plants, mammals, yeasts, and cyanobacteria, while 2187 (66.9%) are novel. Codon usage analysis in the coding regions of 101 non-redundant EST groups showing significant similarity to known genes indicated the higher GC contents at the third position of codons (79.4%) than the first (62.2%) and the second position (45.0%), suggesting that the genome has been exposed to high GC pressure during evolution. The sequence data of individual ESTs are available at the web site http://www.kazusa.or.jp/en/plant/porphyra/EST/.
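
    The codon-position GC comparison reported in this abstract can be illustrated with a short Python sketch. It is not the authors' method; the function name gc_by_codon_position is hypothetical, and the sketch assumes the input is an in-frame coding sequence that starts at codon position 1.

```python
# Illustrative sketch only: percentage of G or C at each of the three codon
# positions of a single in-frame coding sequence (an assumption about the input).
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as percentages for an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence must contain at least one complete codon")
    counts = [0, 0, 0]
    for i in range(usable):
        if cds[i] in "GC":
            counts[i % 3] += 1  # i % 3 is the zero-based codon position
    return tuple(100.0 * c / n_codons for c in counts)


if __name__ == "__main__":
    gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAG")
    print(f"GC1={gc1:.1f}% GC2={gc2:.1f}% GC3={gc3:.1f}%")
```

    Pooling the counts over many coding regions before dividing, rather than averaging per-sequence percentages, would weight long and short sequences by their codon numbers; the abstract does not say which convention was used.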

  19. In silico identification of conserved microRNAs and their target transcripts from expressed sequence tags of three earthworm species.

    PubMed

    Gong, Ping; Xie, Fuliang; Zhang, Baohong; Perkins, Edward J

    2010-12-01

    MicroRNAs are a recently identified class of small regulatory RNAs that target more than 30% of protein-coding genes. Accumulating evidence shows that miRNAs play a critical role in many biological processes, including developmental timing, tissue differentiation, and response to chemical exposure. In this study, we applied a computational approach to analyze expressed sequence tags, and identified 32 miRNAs belonging to 22 miRNA families, in three earthworm species Eisenia fetida, Eisenia andrei, and Lumbricus rubellus. These newly identified earthworm miRNAs possess a difference of 2-4 nucleotides from their homologous counterparts in Caenorhabditis elegans. They also share similar features with other known animal miRNAs, for instance, the nucleotide U being dominant in both mature and pre-miRNA sequences, particularly in the first position of mature miRNA sequences at the 5' end. The newly identified earthworm miRNAs putatively regulate mRNA genes that are involved in many important biological processes and pathways related to development, growth, locomotion, and reproduction as well as response to stresses, particularly oxidative stress. Future efforts will focus on experimental validation of their presence and target mRNA genes to further elucidate their biological functions in earthworms.

  20. Generation of expressed sequence tags of random root cDNA clones of Brassica napus by single-run partial sequencing.

    PubMed Central

    Park, Y S; Kwak, J M; Kwon, O Y; Kim, Y S; Lee, D S; Cho, M J; Lee, H H; Nam, H G

    1993-01-01

    Two hundred thirty-seven expressed sequence tags (ESTs) of Brassica napus were generated by single-run partial sequencing of 197 random root cDNA clones. A computer search of these root ESTs revealed that 21 ESTs show significant similarity to the protein-coding sequences in the existing data bases, including five stress- or defense-related genes and four clones related to the genes from other kingdoms. Northern blot analysis of the 10 data base-matched cDNA clones revealed that many of the clones are expressed most abundantly in root but less abundantly in other organs. However, two clones were highly root specific. The results show that generation of the root ESTs by partial sequencing of random cDNA clones along with the expression analysis is an efficient approach to isolate genes that are functional in plant root in a large scale. We also discuss the results of the examination of cDNA libraries and sequencing methods suitable for this approach. PMID:8029332

  1. Identification and analysis of safener-inducible expressed sequence tags in Populus using a cDNA microarray.

    PubMed

    Rishi, A S; Munir, Shirin; Kapur, Vivek; Nelson, Neil D; Goyal, Arun

    2004-12-01

    Safeners are the chemicals used to protect plants from detrimental effects of herbicides, but their mode of action at the molecular level is not well understood. As an initial step towards understanding the molecular mechanism of safener action in trees, homologous genes in hybrid poplar (Populus nigra x Populus maximowiczii) that were induced by a safener were identified. We here describe the identification of differentially expressed genes in Populus that are induced by Concep-III, a herbicide safener. Expressed sequence tags (ESTs) enriched for transcriptionally induced genes were isolated by suppressive subtractive hybridization (SSH). The SSH library cDNA inserts were used to construct a cDNA microarray for high-throughput validation of the up-regulated expression of safener-induced genes. Single-pass and partial sequences of 1,344 safener-induced ESTs were assembled into 418 singletons and 328 clusters, but the putative functions of almost 53% of the ESTs are not known. Genes encoding proteins involved in all three different phases of safener action, viz., oxidation, conjugation, and sequestration, were found in the SSH library. Almost 75% of genes that showed greater than 2-fold expression upon safener treatment were redundant in the SSH library. The expression pattern for selected genes was validated by reverse transcription-polymerase chain reaction. A few safener-induced genes that were not previously reported to be induced by safeners, but which may have a role in herbicide metabolism, were identified. The newly identified genes could have potential for application in genetic engineering of plants for herbicide detoxification and tolerance.

  2. Comprehensive analysis of expressed sequence tags from the pulp of the red mutant 'Cara Cara' navel orange (Citrus sinensis Osbeck).

    PubMed

    Ye, Jun-Li; Zhu, An-Dan; Tao, Neng-Guo; Xu, Qiang; Xu, Juan; Deng, Xiu-Xin

    2010-10-01

    Expressed sequence tag (EST) analysis of the pulp of the red-fleshed mutant 'Cara Cara' navel orange provided a starting point for gene discovery and transcriptome survey during citrus fruit maturation. Interpretation of the EST datasets revealed that the mutant pulp transcriptome held a high proportion of stress response-related genes, such as the type III metallothionein-like gene (6.0%), heat shock protein (2.8%), Cu/Zn superoxide dismutase (0.8%), late embryogenesis abundant protein 5 (0.8%), etc. A total of 133 transcripts were found to be differentially expressed between the red mutant and its orange-color wild genotype 'Washington' via digital expression analysis. Among them, genes involved in metabolism, defense/stress and signal transduction were statistically overrepresented. Fifteen transcription factors, composed of NAM, ATAF, and CUC transcription factor (NAC); myeloblastosis (MYB); myelocytomatosis (MYC); basic helix-loop-helix (bHLH); basic leucine zipper (bZIP) domain members, were also included. The data reflected the distinct expression profile and the unique regulatory module associated with these two genotypes. Eight differentially expressed genes identified by digital expression analysis were validated by quantitative real-time polymerase chain reaction. For structural polymorphism, both simple sequence repeats and single nucleotide polymorphism (SNP) loci were surveyed; dinucleotide presentation revealed a bias toward AG/GA/TC/CT repeats (52.5%), against GC/CG repeats (0%). SNP analysis found that transitions (73%) outnumbered transversions (27%). Seventeen potential cultivar-specific and 387 heterozygous SNP loci were detected from the 'Cara Cara' and 'Washington' EST pool.
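
    The transition/transversion split reported in this abstract can be illustrated with a minimal sketch. It assumes SNPs are given as (reference base, alternate base) pairs; the function names and the toy SNP list are hypothetical and are not taken from the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # A transition stays within one chemical class (purine<->purine or
    # pyrimidine<->pyrimidine); a transversion crosses classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_fraction(snps) -> float:
    """Fraction of SNPs in an iterable of (ref, alt) pairs that are transitions."""
    counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no SNPs supplied")
    return counts["transition"] / total


# Hypothetical toy input: three transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("A", "T"), ("G", "A")]
print(transition_fraction(toy_snps))
```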

  3. Analysis and functional annotation of expressed sequence tags (ESTs) from multiple tissues of oil palm (Elaeis guineensis Jacq.)

    PubMed Central

    Ho, Chai-Ling; Kwan, Yen-Yen; Choi, Mei-Chooi; Tee, Sue-Sean; Ng, Wai-Har; Lim, Kok-Ang; Lee, Yang-Ping; Ooi, Siew-Eng; Lee, Weng-Wah; Tee, Jin-Ming; Tan, Siang-Hee; Kulaveerasingam, Harikrishna; Alwee, Sharifah Shahrul Rabiah Syed; Abdullah, Meilina Ong

    2007-01-01

    Background Oil palm is the second largest source of edible oil which contributes to approximately 20% of the world's production of oils and fats. In order to understand the molecular biology involved in in vitro propagation, flowering, efficient utilization of nitrogen sources and root diseases, we have initiated an expressed sequence tag (EST) analysis on oil palm. Results In this study, six cDNA libraries from oil palm zygotic embryos, suspension cells, shoot apical meristems, young flowers, mature flowers and roots, were constructed. We have generated a total of 14537 expressed sequence tags (ESTs) from these libraries, from which 6464 tentative unique contigs (TUCs) and 2129 singletons were obtained. Approximately 6008 of these tentative unique genes (TUGs) have significant matches to the non-redundant protein database, from which 2361 were assigned to one or more Gene Ontology categories. Predominant transcripts and differentially expressed genes were identified in multiple oil palm tissues. Homologues of genes involved in many aspects of flower development were also identified among the EST collection, such as CONSTANS-like, AGAMOUS-like (AGL)2, AGL20, LFY-like, SQUAMOSA, SQUAMOSA binding protein (SBP) etc. The majority of them are the first representatives in oil palm, providing opportunities to explore the cause of epigenetic homeotic flowering abnormality in oil palm, given the importance of flowering in fruit production. The transcript levels of two flowering-related genes, EgSBP and EgSEP, were analysed in the flower tissues of various developmental stages. Gene homologues for enzymes involved in oil biosynthesis, utilization of nitrogen sources, and scavenging of oxygen radicals, were also uncovered among the oil palm ESTs. Conclusion The EST sequences generated will allow comparative genomic studies between oil palm and other monocotyledonous and dicotyledonous plants, development of gene-targeted markers for the reference genetic map, design and

  4. A high-throughput data mining of single nucleotide polymorphisms in Coffea species expressed sequence tags suggests differential homeologous gene expression in the allotetraploid Coffea arabica.

    PubMed

    Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

    2010-11-01

    Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed.

  5. Protein identities from 'Graphocephala atropunctata' expressed sequence tags: Expanding leafhopper vector biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Heat shock proteins and 44 protein sequences from the blue-green sharpshooter, BGSS, were produced and identified. The sequences were submitted and published under accession numbers: DQ445499-DQ445542, in the National Center for Biotechnology Information, NCBI, Public Database. The blue-green sharps...

  6. Water stress-responsive genes in loblolly pine (Pinus taeda) roots identified by analyses of expressed sequence tag libraries.

    PubMed

    Lorenz, W Walter; Sun, Feng; Liang, Chun; Kolychev, Dmitri; Wang, Haiming; Zhao, Xin; Cordonnier-Pratt, Marie-Michele; Pratt, Lee H; Dean, Jeffrey F D

    2006-01-01

    Drought stress is the principal cause of seedling mortality in pine forests of the southeastern United States and in many other forested regions around the globe. As part of a larger effort to discover loblolly pine genes, this study subjected rooted cuttings of three unrelated pine genotypes to three watering regimens. Expressed sequence tags (ESTs) were obtained from both the 3' and 5' ends of 12,918 randomly selected cDNAs generated from root tissues. These ESTs were clustered to identify 6,765 unique transcripts (UniScripts) derived from 6,202 putative unique genes (UniGenes-S). Tentative annotations were assigned on the basis of BLASTX comparisons to the Protein Information Resource Nonredundant Reference (PIR-NREF) database. Expression levels of 42 UniScripts varied with high statistical significance with respect to treatment. Many of them resembled gene products shown to be important for drought tolerance in other species, including dehydrins, endochitinases, cytochrome P450 enzymes, pathogenesis-related proteins and various late-embryogenesis abundant (LEA) gene products. Similarly, expression levels of 110 UniScripts varied with high statistical significance among genotypes, indicating that gene expression patterns in this species are much more dependent on genotype than on treatment. Most of the water stress-induced pine UniScripts that appeared to encode products resembling drought tolerance factors in other species were most highly induced in a single genotype, suggesting that particularly useful adaptive alleles for drought tolerance might exist within the collection of cDNAs characterized from this genotype. Mining and visualizing the complete data set, as well as downloading of both EST and UniScript contig sequences, are possible using MAGIC Gene Discovery at http://fungen.org/genediscovery/.

  7. Transcriptome analysis of the phytopathogenic fungus Rhizoctonia solani AG1-IB 7/3/14 applying high-throughput sequencing of expressed sequence tags (ESTs).

    PubMed

    Wibberg, Daniel; Jelonek, Lukas; Rupp, Oliver; Kröber, Magdalena; Goesmann, Alexander; Grosch, Rita; Pühler, Alfred; Schlüter, Andreas

    2014-01-01

    Rhizoctonia solani is a soil-borne plant pathogenic fungus of the phylum Basidiomycota. It affects a wide range of agriculturally important crops and hence is responsible for economically relevant crop losses. Transcriptome analysis of the bottom rot pathogen R. solani AG1-IB (isolate 7/3/14) by applying high-throughput sequencing and bioinformatics methods addressing Expressed Sequence Tag (EST) data interpretation provided new insights into the expressed genes of this fungus. Two normalized cDNA libraries representing different cultivation conditions of the fungus were sequenced on the 454 FLX (Roche) system. Subsequent to cDNA sequence assembly and quality control, ESTs were analysed applying advanced bioinformatics methods. More than 14 000 transcript isoforms originating from approximately 10 000 predictable R. solani AG1-IB 7/3/14 genes are represented in each dataset. Comparative analyses revealed several differentially expressed genes depending on the growth conditions applied. Determinants with predicted functions in recognition processes between the fungus and the host plant were identified. Moreover, many R. solani AG1-IB ESTs were predicted to encode putative cellulose, pectin, and lignin degrading enzymes. Furthermore, genes playing a possible role in mitogen-activated protein (MAP) kinase cascades, 4-aminobutyric acid (GABA) metabolism, melanin synthesis, plant defence antagonism, phytotoxin, and mycotoxin synthesis were detected.

  8. The venom composition of the parasitic wasp Chelonus inanitus resolved by combined expressed sequence tags analysis and proteomic approach

    PubMed Central

    2010-01-01

    Background Parasitic wasps constitute one of the largest groups of venomous animals. Although some physiological effects of their venoms are well documented, relatively little is known at the molecular level on the protein composition of these secretions. To identify the majority of the venom proteins of the endoparasitoid wasp Chelonus inanitus (Hymenoptera: Braconidae), we have randomly sequenced 2111 expressed sequence tags (ESTs) from a cDNA library of venom gland. In parallel, proteins from pure venom were separated by gel electrophoresis and individually submitted to a nano-LC-MS/MS analysis allowing comparison of peptide and EST sequences. Results About 60% of sequenced ESTs encoded proteins whose presence in venom was attested by mass spectrometry. Most of the remaining ESTs corresponded to gene products likely involved in the transcriptional and translational machinery of venom gland cells. In addition, a small number of transcripts were found to encode proteins that share sequence similarity with well-known venom constituents of social hymenopteran species, such as hyaluronidase-like proteins and an Allergen-5 protein. An overall number of 29 venom proteins could be identified through the combination of EST sequencing and proteomic analyses. The most highly redundant set of ESTs encoded a protein that shared sequence similarity with a venom protein of unknown function potentially specific to the Chelonus lineage. Venom components specific to C. inanitus included a C-type lectin domain containing protein, a chemosensory protein-like protein, a protein related to yellow-e3 and ten new proteins which shared no significant sequence similarity with known sequences. In addition, several venom proteins potentially able to interact with chitin were also identified including a chitinase, an imaginal disc growth factor-like protein and two putative mucin-like peritrophins. Conclusions The use of the combined approaches has made it possible to discriminate between cellular

  9. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens; these will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999

  10. DNA sequence-based "bar codes" for tracking the origins of expressed sequence tags from a maize cDNA library constructed using multiple mRNA sources.

    PubMed

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A; Schnable, Patrick S

    2003-10-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence "bar codes" were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects.
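
    The abstract does not describe its decoding algorithm, so the following is only a minimal sketch of one common way to decode bar codes with error correction: assign each observed tag to the nearest known bar code by Hamming distance and reject reads that are too distant or ambiguous. The pool names, the 6-nt bar codes and the `max_mismatches` threshold are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def decode_barcode(observed: str, barcodes: dict, max_mismatches: int = 1):
    """Return the pool whose bar code is closest to `observed`.

    Returns None when the best match exceeds `max_mismatches` or when two
    pools tie for the best match (the read cannot be assigned safely).
    """
    scored = sorted((hamming(observed, seq), pool) for pool, seq in barcodes.items())
    best_dist, best_pool = scored[0]
    if best_dist > max_mismatches:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_pool


# Hypothetical bar codes; every pair differs at all six positions,
# so a single sequencing error can be corrected unambiguously.
pools = {
    "kernel": "ACGTAC",
    "anaerobic_seedling": "TGCATG",
    "root": "GATCCA",
}
print(decode_barcode("ACGTAA", pools))  # one mismatch from the kernel code
print(decode_barcode("TTTTTT", pools))  # too far from every code
```

    Counting the decoded pool labels per EST contig would then give the per-pool tag frequencies that the abstract relates to mRNA abundance.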

  11. Exploitation of a turbot (Scophthalmus maximus L.) immune-related expressed sequence tag (EST) database for microsatellite screening and validation.

    PubMed

    Navajas-Pérez, R; Robles, F; Molina-Luzón, M J; De La Herrán, R; Alvarez-Dios, J A; Pardo, B G; Vera, M; Bouza, C; Martínez, P

    2012-07-01

    In this study, we identified and characterized 160 microsatellite loci from an expressed sequence tag (EST) database generated from immune-related organs of turbot (Scophthalmus maximus). A final set of 83 new polymorphic microsatellites was validated after the analysis of 40 individuals of Atlantic origin including both wild and farmed individuals. The allele number and the expected heterozygosity ranged from 2 to 18 and from 0.021 to 0.951, respectively. Evidence of null alleles at moderate-high frequencies was detected at six loci using population data. None of the analysed loci showed deviations from Mendelian segregation after the analysis of five full-sib families including approximately 92 individuals/family. The markers are used to consolidate the turbot genetic map, and because they are mostly EST-derived, they will be very useful for comparative genomic studies within flatfishes and with model fish species. Using an in silico approach, we detected significant homologies of microsatellite sequences with the EST databases of the flatfish species with highest genomic resources (Senegalese sole, Atlantic halibut, bastard halibut) in 31% of these turbot markers. The conservation of these microsatellites within Pleuronectiformes will pave the way for anchoring genetic maps of different species and identifying genomic regions related to productive traits.
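
    The per-locus statistics quoted here (allele number, expected heterozygosity) can be sketched as follows. This assumes diploid genotypes given as pairs of allele sizes and uses the simple estimator He = 1 - sum(p_i^2) without a small-sample correction; the abstract does not say which estimator the authors used, and the genotypes below are invented.

```python
from collections import Counter


def locus_summary(genotypes):
    """Summarise one microsatellite locus.

    `genotypes` is a list of (allele_1, allele_2) pairs, one per individual.
    Returns (number_of_alleles, observed_heterozygosity, expected_heterozygosity).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Observed heterozygosity: share of individuals carrying two different alleles.
    observed = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity from allele frequencies: 1 - sum(p_i^2).
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    return len(allele_counts), observed, expected


# Hypothetical allele sizes (bp) for four individuals at one locus.
toy_locus = [(100, 102), (100, 100), (102, 102), (100, 102)]
print(locus_summary(toy_locus))
```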

  12. Mining of assembled expressed sequence tag (EST) data for protein families: application to the G protein-coupled receptor superfamily.

    PubMed

    Conklin, D; Yee, D P; Millar, R; Engelbrecht, J; Vissing, H

    2000-02-01

    The availability of large expressed sequence tag (EST) databases has led to a revolution in the way new genes are identified. Mining of these databases using known protein sequences as queries is a powerful technique for discovering orthologous and paralogous genes. The scientist is often confronted, however, by an enormous amount of search output owing to the inherent redundancy of EST data. In addition, high search sensitivity often cannot be achieved using only a single member of a protein superfamily as a query. In this paper a technique for addressing both of these issues is described. Assembled EST databases are queried with every member of a protein superfamily, the results are integrated and false positives are pruned from the set. The result is a set of assemblies enriched in members of the protein superfamily under consideration. The technique is applied to the G protein-coupled receptor (GPCR) superfamily in the construction of a GPCR Resource. A novel full-length human GPCR identified from the GPCR Resource is presented, illustrating the utility of the method.

  13. A semiautomated approach to gene discovery through expressed sequence tag data mining: discovery of new human transporter genes.

    PubMed

    Brown, Shoshana; Chang, Jean L; Sadée, Wolfgang; Babbitt, Patricia C

    2003-01-01

    Identification and functional characterization of the genes in the human genome remain a major challenge. A principal source of publicly available information used for this purpose is the National Center for Biotechnology Information database of expressed sequence tags (dbEST), which contains over 4 million human ESTs. To extract the information buried in this data more effectively, we have developed a semiautomated method to mine dbEST for uncharacterized human genes. Starting with a single protein input sequence, a family of related proteins from all species is compiled. This entire family is then used to mine the human EST database for new gene candidates. Evaluation of putative new gene candidates in the context of a family of characterized proteins provides a framework for inference of the structure and function of the new genes. When applied to a test data set of 28 families within the major facilitator superfamily (MFS) of membrane transporters, our protocol found 73 previously characterized human MFS genes and 43 new MFS gene candidates. Development of this approach provided insights into the problems and pitfalls of automated data mining using public databases.

  14. Identification and characterization of 43 microsatellite markers derived from expressed sequence tags of the sea cucumber ( Apostichopus japonicus)

    NASA Astrophysics Data System (ADS)

    Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

    2011-06-01

    The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
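
    Screening ESTs for SSRs is typically a pattern search for short motifs repeated in tandem. The sketch below shows the idea with a regular expression; the minimum-repeat thresholds are assumptions chosen for illustration (the abstract does not state the criteria used), and the test sequence is invented.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq: str, min_repeats=None):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. 'AA' or 'AGAG'), which belong to a shorter motif class.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical EST fragment containing (AG)8 and (CTT)5.
est = "GGC" + "AG" * 8 + "TTCA" + "CTT" * 5 + "GA"
print(find_ssrs(est))  # [(3, 'AG', 8), (23, 'CTT', 5)]
```

    Dividing the total length of the non-redundant ESTs (in kb) by the number of hits gives an SSR density of the kind quoted in the abstract.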

  15. Analysis of expressed sequence tags (ESTs) from cocoa (Theobroma cacao L) upon infection with Phytophthora megakarya.

    PubMed

    Naganeeswaran, Sudalaimuthu Asari; Subbian, Elain Apshara; Ramaswamy, Manimekalai

    2012-01-01

    Phytophthora megakarya, the causative agent of cacao black pod disease in West African countries, causes an extensive loss of yield. In this study we have analyzed 4 libraries of ESTs derived from Phytophthora megakarya infected cocoa leaf and pod tissues. A total of 6379 redundant sequences were retrieved from the ESTtik database and EST processing was performed using the seqclean tool. Clustering and assembling using CAP3 generated 3333 non-redundant (907 contigs and 2426 singletons) sequences. The primary sequence analysis of the 3333 non-redundant sequences showed that the GC percentage was 42.7 and the sequence length ranged from 101 to 2576 nucleotides. Further, functional analyses (BLAST, InterProScan, Gene Ontology and KEGG search) were executed and 1230 orthologous genes were annotated. A total of 272 enzymes corresponding to 114 metabolic pathways were identified. Functional annotation revealed that most of the sequences are related to molecular function, stress response and biological processes. The annotated enzymes are aldehyde dehydrogenase (E.C: 1.2.1.3), catalase (E.C: 1.11.1.6), acetyl-CoA C-acetyltransferase (E.C: 2.3.1.9), threonine ammonia-lyase (E.C: 4.3.1.19), acetolactate synthase (E.C: 2.2.1.6), O-methyltransferase (E.C: 2.1.1.68) which play an important role in amino acid biosynthesis and phenyl propanoid biosynthesis. All this information was stored in a MySQL database management system to be used in future for reconstruction of the biotic stress response pathway in cocoa.
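
    The primary sequence statistics mentioned in this abstract (overall GC percentage and sequence length range) are straightforward to compute. The sketch below assumes the non-redundant sequences are already loaded as plain strings; the function names and toy sequences are hypothetical.

```python
def gc_percent(sequences) -> float:
    """Overall GC percentage across all sequences (every base counts in the total)."""
    sequences = [s.upper() for s in sequences]
    total = sum(len(s) for s in sequences)
    if total == 0:
        raise ValueError("no bases supplied")
    gc = sum(s.count("G") + s.count("C") for s in sequences)
    return 100.0 * gc / total


def length_range(sequences):
    """Return (shortest, longest) sequence length."""
    lengths = [len(s) for s in sequences]
    if not lengths:
        raise ValueError("no sequences supplied")
    return min(lengths), max(lengths)


# Hypothetical toy input standing in for contigs and singletons.
toy = ["GGCC", "ATAT", "ACGTACGTAC"]
print(gc_percent(toy), length_range(toy))
```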

  16. Comparative analysis of expressed sequence tags (ESTs) between drought-tolerant and -susceptible genotypes of chickpea under terminal drought stress

    PubMed Central

    2011-01-01

    Background Chickpea (Cicer arietinum L.) is an important grain-legume crop that is mainly grown in rainfed areas, where terminal drought is a major constraint to its productivity. We generated expressed sequence tags (ESTs) by suppression subtraction hybridization (SSH) to identify differentially expressed genes in drought-tolerant and -susceptible genotypes in chickpea. Results EST libraries were generated by SSH from root and shoot tissues of ICC 4958 (drought tolerant) and ICC 1882 (drought susceptible) exposed to terminal drought conditions by the dry down method. SSH libraries were also constructed by using 2 sets of bulks prepared from the RNA of root tissues from selected recombinant inbred lines (RILs) (10 each) for the extreme high and low root biomass phenotype. A total of 3062 unigenes (638 contigs and 2424 singletons), 51.4% of which were novel in chickpea, were derived by cluster assembly and sequence alignment of 5949 ESTs. Only 2185 (71%) unigenes showed significant BLASTX similarity (<1E-06) in the NCBI non-redundant (nr) database. Gene ontology functional classification terms (BLASTX results and GO term) were retrieved for 2006 (92.0%) sequences, and 656 sequences were further annotated with 812 Enzyme Commission (EC) codes and were mapped to 108 different KEGG pathways. In addition, expression status of 830 unigenes in response to terminal drought stress was evaluated using macro-array (dot blots). The expression of a few selected genes was validated by northern blotting and quantitative real-time PCR assay. Conclusion Our study compares not only genes that are up- and down-regulated in a drought-tolerant genotype under terminal drought stress and a drought susceptible genotype but also between the bulks of the selected RILs exhibiting extreme phenotypes. More than 50% of the genes identified have been shown to be associated with drought stress in chickpea for the first time. This study not only serves as a resource for marker discovery, but can provide

  17. Comparative analysis and functional annotation of a large expressed sequence tag collection of apple

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A total of 34 apple cDNA libraries were constructed from root, leaf, bud, shoot, flower, and fruit tissues, at varying developmental stages and/or under biotic or abiotic stress conditions, and of several genotypes. From these libraries, 190,425 clones were partially sequenced from the 5’ end and 4...

  18. Protein identities - Graphocephala atropunctata expressed sequenced tags: expanding leafhopper vector biology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A small heat shock protein was isolated and sequenced from the Blue-green sharpshooter, BGSS, Graphocephala atropunctata (Signoret) (Hemiptera: Cicadellidae). The BGSS has been the native vector of Pierce’s disease in vineyards in California for nearly a century. The importance of this vector spec...

  19. Expressed sequence tags (ESTs) from immune tissues of turbot (Scophthalmus maximus) challenged with pathogens

    PubMed Central

    Pardo, Belén G; Fernández, Carlos; Millán, Adrián; Bouza, Carmen; Vázquez-López, Araceli; Vera, Manuel; Alvarez-Dios, José A; Calaza, Manuel; Gómez-Tato, Antonio; Vázquez, María; Cabaleiro, Santiago; Magariños, Beatriz; Lemos, Manuel L; Leiro, José M; Martínez, Paulino

    2008-01-01

    Background The turbot (Scophthalmus maximus; Scophthalmidae; Pleuronectiformes) is a flatfish species of great relevance for marine aquaculture in Europe. In contrast to other cultured flatfish, very few genomic resources are available in this species. Aeromonas salmonicida and Philasterides dicentrarchi are two pathogens that affect turbot culture causing serious economic losses to the turbot industry. Little is known about the molecular mechanisms for disease resistance and host-pathogen interactions in this species. In this work, thousands of ESTs for functional genomic studies and potential markers linked to ESTs for mapping (microsatellites and single nucleotide polymorphisms (SNPs)) are provided. This information enabled us to obtain a preliminary view of regulated genes in response to these pathogens and it constitutes the basis for subsequent and more accurate microarray analysis. Results A total of 12584 cDNAs partially sequenced from three different cDNA libraries of turbot (Scophthalmus maximus) infected with Aeromonas salmonicida, Philasterides dicentrarchi and from healthy fish were analyzed. Three immune-relevant tissues (liver, spleen and head kidney) were sampled at several time points in the infection process for library construction. The sequences were processed into 9256 high-quality sequences, which constituted the source for the turbot EST database. Clustering and assembly of these sequences revealed 3482 different putative transcripts (1073 contigs and 2409 singletons). BLAST searches with public databases detected significant similarity (e-value ≤ 1e-5) in 1766 (50.7%) sequences and 816 of them (23.4%) could be functionally annotated. Two hundred three of these genes (24.9%), encoding for defence/immune-related proteins, were mostly identified for the first time in turbot. Some ESTs showed significant differences in the number of transcripts when comparing the three libraries, suggesting regulation in response to these pathogens. A total of

  20. Not all sequence tags are created equal: designing and validating sequence identification tags robust to indels.

    PubMed

    Faircloth, Brant C; Glenn, Travis C

    2012-01-01

    Ligating adapters with unique synthetic oligonucleotide sequences (sequence tags) onto individual DNA samples before massively parallel sequencing is a popular and efficient way to obtain sequence data from many individual samples. Tag sequences should be numerous and sufficiently different to ensure sequencing, replication, and oligonucleotide synthesis errors do not cause tags to be unrecoverable or confused. However, many design approaches only protect against substitution errors during sequencing and extant tag sets contain too few tag sequences. We developed an open-source software package to validate sequence tags for conformance to two distance metrics and design sequence tags robust to indel and substitution errors. We use this software package to evaluate several commercial and non-commercial sequence tag sets, design several large sets (max(count) = 7,198) of edit metric sequence tags having different lengths and degrees of error correction, and integrate a subset of these edit metric tags to polymerase chain reaction (PCR) primers and sequencing adapters. We validate a subset of these edit metric tagged PCR primers and sequencing adapters by sequencing on several platforms and subsequent comparison to commercially available alternatives. We find that several commonly used sets of sequence tags or design methodologies used to produce sequence tags do not meet the minimum expectations of their underlying distance metric, and we find that PCR primers and sequencing adapters incorporating edit metric sequence tags designed by our software package perform as well as their commercial counterparts. We suggest that researchers evaluate sequence tags prior to use or evaluate tags that they have been using. The sequence tag sets we design improve on extant sets because they are large, valid across the set, and robust to the suite of substitution, insertion, and deletion errors affecting massively parallel sequencing workflows on all currently used platforms.
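
    To make the distinction between substitution-only and indel-aware distance metrics concrete, here is a minimal sketch that checks a tag set against a minimum edit (Levenshtein) distance. It is an independent illustration, not the authors' software package; the function names, the example tags and the threshold are hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def violating_pairs(tags, min_distance: int):
    """Return (tag_a, tag_b, distance) for every pair closer than min_distance."""
    bad = []
    for a, b in combinations(tags, 2):
        d = edit_distance(a, b)
        if d < min_distance:
            bad.append((a, b, d))
    return bad


# Hypothetical tags: the first two differ at three positions (Hamming distance 3)
# but are only two edits apart once insertions and deletions are allowed.
tags = ["AACCGG", "ACCGGA", "TTTTTT"]
print(violating_pairs(tags, min_distance=3))
```

    The example pair shows why a set that looks safe under a substitution-only metric can still fail once indels are considered, which is the concern the abstract raises.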

  1. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immunity reactions. To gain insight into the expression profiles of coelomic cells of the earthworm, Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from the cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed a significant similarity to known genes and represented 164 unique genes, of which 93 ESTs were singletons and 71 ESTs manifested as two or more ESTs. From the 164 unique genes sequenced, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that levels of lysenin-related protein mRNA in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set would provide a valuable resource for future research on the earthworm immune system.

  2. Expressed sequence tag analysis of adult human optic nerve for NEIBank: Identification of cell type and tissue markers

    PubMed Central

    Bernstein, Steven L; Guo, Yan; Peterson, Katherine; Wistow, Graeme

    2009-01-01

    Background The optic nerve is a pure white matter central nervous system (CNS) tract with an isolated blood supply, and is widely used in physiological studies of white matter response to various insults. We examined the gene expression profile of human optic nerve (ON) to provide, through the NEIBank online resource, a collection of sequence-verified cDNA clones. An un-normalized cDNA library was constructed from pooled human ON tissues and was used in expressed sequence tag (EST) analysis. Location of an abundant oligodendrocyte marker was examined by immunofluorescence. Quantitative real time polymerase chain reaction (qRT-PCR) and Western analysis were used to compare levels of expression for key calcium channel protein genes and protein product in primate and rodent ON. Results Our analyses revealed a profile similar in many respects to other white matter related tissues, but significantly different from previously available ON cDNA libraries. The previous libraries were found to include specific markers for other eye tissues, suggesting contamination. Immune/inflammatory markers were abundant in the new ON library. The oligodendrocyte marker QKI was abundant at the EST level. Immunofluorescence revealed that this protein is a useful oligodendrocyte cell-type marker in rodent and primate ONs. L-type calcium channel EST abundance was found to be particularly low. A qRT-PCR-based comparative mammalian species analysis reveals that L-type calcium channel expression levels are significantly lower in primate than in rodent ON, which may help account for the class-specific difference in responsiveness to calcium channel blocking agents. Several known eye disease genes are abundantly expressed in ON. Many genes associated with normal axonal function, as well as mRNAs associated with axonal transport, inflammation and neuroprotection, are observed.
    Conclusion We conclude that the new cDNA library is a faithful representation of human ON and EST data provide an initial overview

  3. Rediscovering Medicinal Plants' Potential with OMICS: Microsatellite Survey in Expressed Sequence Tags of Eleven Traditional Plants with Potent Antidiabetic Properties

    PubMed Central

    Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar

    2014-01-01

    Abstract Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs, has not been pursued on a large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to

  4. Thirty-four Musa (Musaceae) expressed sequence tag-derived microsatellite markers transferred to Musella lasiocarpa.

    PubMed

    Li, W J; Ma, H; Li, Z H; Wan, Y M; Liu, X X; Zhou, C L

    2012-08-06

    We assembled 31,308 publicly available Musa EST sequences into 21,129 unigenes; 4944 of them contained 5416 SSR motifs. In all, 238 unigenes flanking SSRs were randomly selected for primer design and then tested for amplification in Musella lasiocarpa. Seventy-eight primer pairs were found to be transferable to this species, and 49 displayed polymorphism. A set of 34 polymorphic SSR markers was analyzed in 24 individuals from four wild M. lasiocarpa populations. The mean number of alleles per locus was 3.0, ranging from 2 to 7. The observed and expected heterozygosities per marker ranged from 0.087 to 0.875 (mean 0.503) and from 0.294 to 0.788 (mean 0.544), respectively. These markers will be of practical use for genetic diversity and quantitative trait loci analysis of M. lasiocarpa.
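    The observed and expected heterozygosities reported per marker can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses the plain (biased) estimator He = 1 - sum(p_i^2) for a single diploid locus, which may differ from the estimator the authors used, and the genotype data below are made up.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """1 minus the sum of squared allele frequencies (biased estimator)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical genotypes for one SSR locus in four diploid individuals,
# with alleles labelled by fragment size in base pairs.
locus = [(180, 180), (180, 184), (184, 184), (180, 184)]
print(observed_heterozygosity(locus), expected_heterozygosity(locus))
```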

  5. Comparative analysis of expressed sequence tag (EST) libraries in the seagrass Zostera marina subjected to temperature stress.

    PubMed

    Reusch, Thorsten B H; Veron, Amelie S; Preuss, Christoph; Weiner, January; Wissler, Lothar; Beck, Alfred; Klages, Sven; Kube, Michael; Reinhardt, Richard; Bornberg-Bauer, Erich

    2008-01-01

    Global warming is associated with increasing stress and mortality on temperate seagrass beds, in particular during periods of high sea surface temperatures during summer months, adding to existing anthropogenic impacts, such as eutrophication and habitat destruction. We compare several expressed sequence tag (EST) libraries in the ecologically important seagrass Zostera marina (eelgrass) to elucidate the molecular genetic basis of adaptation to environmental extremes. We compared the tentative unigene (TUG) frequencies of libraries derived from leaf and meristematic tissue from a control situation with two experimentally imposed temperature stress conditions and found that TUG composition is markedly different among these conditions (all P < 0.0001). Under heat stress, we find that 63 TUGs are differentially expressed (d.e.) at 25 °C compared with lower, no-stress condition temperatures (4 °C and 17 °C). Approximately one-third of d.e. eelgrass genes were characteristic for the stress response of the terrestrial plant model Arabidopsis thaliana. The changes in gene expression suggest complex photosynthetic adjustments among light-harvesting complexes, reaction center subunits of photosystem I and II, and components of the dark reaction. Genes encoding heat shock proteins and reactive oxygen scavengers also were identified, but their overall frequency was too low to perform statistical tests. In all conditions, the most abundant transcript (3-15%) was a putative metallothionein gene with unknown function. We also find evidence that heat stress may translate to enhanced infection by protists. A total of 210 TUGs contain one or more microsatellites as potential candidates for gene-linked genetic markers. Data are publicly available in a user-friendly database at http://www.uni-muenster.de/Evolution/ebb/Services/zostera .

  6. Exploiting expressed sequence tag databases for the development and characterization of gene-derived simple sequence repeat markers in the opium poppy (Papaver somniferum L.) for forensic applications.

    PubMed

    Lee, Eun Jung; Jin, Gang Nam; Lee, Kyung Lyong; Han, Myun Soo; Lee, Yang Han; Yang, Moon Sik

    2011-09-01

    Simple sequence repeat (SSR) markers in the opium poppy (Papaver somniferum L.) were identified from an expressed sequence tag (EST) database comprising 20,340 sequences. In total, 2780 SSR-containing sequences were identified. The most frequent microsatellite had an AT/TA motif (37%). Twenty-two opium poppy EST-SSR markers were developed, and polymorphisms of six markers (psom 2, 4, 12, 13, 17, and 22) were examined in 135 individuals under narcotic control investigation. An average of three alleles per locus (range: 2-5 alleles) with a mean heterozygosity of 0.167 was detected. Six loci identified 29 unique profiles in 135 individuals. The EST-SSR markers exhibited small degrees of genetic differentiation (fixation index = 0.727, p < 0.001). Other variable markers will be needed to facilitate the forensic identification of the opium poppy for future cases. To determine the potential for cross-species amplification, six markers were tested in five Papaver species and two Eschscholzia species. The psom 4 and psom 17 primer pairs were transferable. This is the first study to report SSR markers of the opium poppy.

  7. Sequence analysis of expressed sequence tags from an ABA-treated cDNA library identifies stress response genes in the moss Physcomitrella patens.

    PubMed

    Machuka, J; Bashiardes, S; Ruben, E; Spooner, K; Cuming, A; Knight, C; Cove, D

    1999-04-01

    Partial cDNA sequencing was used to obtain 169 expressed sequence tags (ESTs) in the moss, Physcomitrella patens. The source of ESTs was a random cDNA library constructed from 7-day-old protonemata following treatment with 10^-4 M abscisic acid (ABA). Analysis of the ESTs identified 69% with homology to known sequences, 61% of which had significant homology to sequences of plant origin. More importantly, at least 11 ESTs had significant similarities to genes which are implicated in plant stress-responses, including responses which may involve ABA. These included a cDNA associated with desiccation tolerance, two heat shock protein genes, one cold acclimation protein cDNA and five others that may be involved in either oxidative or chemical stress or both, i.e., Zn/Cu-superoxide dismutase, NADPH protochlorophyllide oxidoreductase (PorB), selenium binding protein, glutathione peroxidase and glutathione S transferase. Analysis of codon usage between P. patens and seed plants indicated that although mosses and higher plants are to a large extent similar, minor variations also exist that may represent the distinctiveness of each group.

  8. A new view of insect-crustacean relationships II. Inferences from expressed sequence tags and comparisons with neural cladistics.

    PubMed

    Andrew, David R

    2011-05-01

    The enormous diversity of Arthropoda has complicated attempts by systematists to deduce the history of this group in terms of phylogenetic relationships and phenotypic change. Traditional hypotheses regarding the relationships of the major arthropod groups (Chelicerata, Myriapoda, Crustacea, and Hexapoda) focus on suites of morphological characters, whereas phylogenomics relies on large amounts of molecular sequence data to infer evolutionary relationships. The present discussion is based on expressed sequence tags (ESTs) that provide large numbers of short molecular sequences and so provide an abundant source of sequence data for phylogenetic inference. This study presents well-supported phylogenies of diverse arthropod and metazoan outgroup taxa obtained from publicly-available databases. An in-house bioinformatics pipeline has been used to compile and align conserved orthologs from each taxon for maximum likelihood inferences. This approach resolves many currently accepted hypotheses regarding internal relationships between the major groups of Arthropoda, including monophyletic Hexapoda, Tetraconata (Crustacea + Hexapoda), Myriapoda, and Chelicerata sensu lato (Pycnogonida + Euchelicerata). "Crustacea" is a paraphyletic group with some taxa more closely related to the monophyletic Hexapoda. These results support studies that have utilized more restricted EST data for phylogenetic inference, yet they differ in important regards from recently published phylogenies employing nuclear protein-coding sequences. The present results do not, however, depart from other phylogenies that resolve Branchiopoda as the crustacean sister group of Hexapoda. Like other molecular phylogenies, EST-derived phylogenies alone are unable to resolve morphological convergences or evolved reversals and thus omit what may be crucial events in the history of life. For example, molecular data are unable to resolve whether a Hexapod-Branchiopod sister relationship infers a branchiopod

  9. Analysis of expressed sequence tags from Uromyces appendiculatus hyphae and haustoria and their comparison to sequences from other rust fungi

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two separate cDNA libraries were prepared for RNA extracted from bean rust (Uromyces appendiculatus) hyphae and haustoria isolated from infected bean leaves (Phaseolus vulgaris cv Pint 111) between 2 and 8 dpi. Approximately 13,000 clones were sequenced from both ends and the sequences assem...

  10. Characterizing the Grape Transcriptome. Analysis of Expressed Sequence Tags from Multiple Vitis Species and Development of a Compendium of Gene Expression during Berry Development

    PubMed Central

    Silva, Francisco Goes da; Iandolino, Alberto; Al-Kayal, Fadi; Bohlmann, Marlene C.; Cushman, Mary Ann; Lim, Hyunju; Ergul, Ali; Figueroa, Rubi; Kabuloglu, Elif K.; Osborne, Craig; Rowe, Joan; Tattersall, Elizabeth; Leslie, Anna; Xu, Jane; Baek, JongMin; Cramer, Grant R.; Cushman, John C.; Cook, Douglas R.

    2005-01-01

    We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation

  11. Exploring the Host Parasitism of the Migratory Plant-Parasitic Nematode Ditylenchus destuctor by Expressed Sequence Tags Analysis

    PubMed Central

    Peng, Huan; Gao, Bing-li; Kong, Ling-an; Yu, Qing; Huang, Wen-kun; He, Xu-feng; Long, Hai-bo; Peng, De-liang

    2013-01-01

    The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest on many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in nematode parasitism of plants for several sedentary endo-parasitic nematodes such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found that 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary

  12. Chasing migration genes: a brain expressed sequence tag resource for summer and migratory monarch butterflies (Danaus plexippus).

    PubMed

    Zhu, Haisun; Casselman, Amy; Reppert, Steven M

    2008-01-09

    North American monarch butterflies (Danaus plexippus) undergo a spectacular fall migration. In contrast to summer butterflies, migrants are juvenile hormone (JH) deficient, which leads to reproductive diapause and increased longevity. Migrants also utilize time-compensated sun compass orientation to help them navigate to their overwintering grounds. Here, we describe a brain expressed sequence tag (EST) resource to identify genes involved in migratory behaviors. A brain EST library was constructed from summer and migrating butterflies. Of 9,484 unique sequences, 6068 had positive hits with the non-redundant protein database; the EST database likely represents approximately 52% of the gene-encoding potential of the monarch genome. The brain transcriptome was cataloged using Gene Ontology and compared to Drosophila. Monarch genes were well represented, including those implicated in behavior. Three genes involved in increased JH activity (allatotropin, juvenile hormone acid methyltransferase, and takeout) were upregulated in summer butterflies, compared to migrants. The locomotion-relevant turtle gene was marginally upregulated in migrants, while the foraging and single-minded genes were not differentially regulated. Many of the genes important for the monarch circadian clock mechanism (involved in sun compass orientation) were in the EST resource, including the newly identified cryptochrome 2. The EST database also revealed a novel Na+/K+ ATPase allele predicted to be more resistant to the toxic effects of milkweed than that reported previously. Potential genetic markers were identified from 3,486 EST contigs and included 1599 double-hit single nucleotide polymorphisms (SNPs) and 98 microsatellite polymorphisms. These data provide a template of the brain transcriptome for the monarch butterfly. Our "snap-shot" analysis of the differential regulation of candidate genes between summer and migratory butterflies suggests that unbiased, comprehensive transcriptional

  13. Chasing Migration Genes: A Brain Expressed Sequence Tag Resource for Summer and Migratory Monarch Butterflies (Danaus plexippus)

    PubMed Central

    Zhu, Haisun; Casselman, Amy; Reppert, Steven M.

    2008-01-01

    North American monarch butterflies (Danaus plexippus) undergo a spectacular fall migration. In contrast to summer butterflies, migrants are juvenile hormone (JH) deficient, which leads to reproductive diapause and increased longevity. Migrants also utilize time-compensated sun compass orientation to help them navigate to their overwintering grounds. Here, we describe a brain expressed sequence tag (EST) resource to identify genes involved in migratory behaviors. A brain EST library was constructed from summer and migrating butterflies. Of 9,484 unique sequences, 6068 had positive hits with the non-redundant protein database; the EST database likely represents ∼52% of the gene-encoding potential of the monarch genome. The brain transcriptome was cataloged using Gene Ontology and compared to Drosophila. Monarch genes were well represented, including those implicated in behavior. Three genes involved in increased JH activity (allatotropin, juvenile hormone acid methyltransferase, and takeout) were upregulated in summer butterflies, compared to migrants. The locomotion-relevant turtle gene was marginally upregulated in migrants, while the foraging and single-minded genes were not differentially regulated. Many of the genes important for the monarch circadian clock mechanism (involved in sun compass orientation) were in the EST resource, including the newly identified cryptochrome 2. The EST database also revealed a novel Na+/K+ ATPase allele predicted to be more resistant to the toxic effects of milkweed than that reported previously. Potential genetic markers were identified from 3,486 EST contigs and included 1599 double-hit single nucleotide polymorphisms (SNPs) and 98 microsatellite polymorphisms. These data provide a template of the brain transcriptome for the monarch butterfly. Our “snap-shot” analysis of the differential regulation of candidate genes between summer and migratory butterflies suggests that unbiased, comprehensive transcriptional profiling

  14. In silico mining for simple sequence repeat loci in a pineapple expressed sequence tag database and cross-species amplification of EST-SSR markers across Bromeliaceae.

    PubMed

    Wöhrmann, Tina; Weising, Kurt

    2011-08-01

    A collection of 5,659 expressed sequence tags (ESTs) from pineapple [Ananas comosus (L.) Merr.] was screened for simple sequence repeats (EST-SSRs) with motif lengths between 1 and 6 bp. Lower thresholds of 15, 7 and 5 repeat units were used to define microsatellites of the mono-, di-, and tri- to hexanucleotide repeat type, respectively. Based on these criteria, 696 SSRs were identified among 3,389 EST unigenes, together representing 2,840 kb. This corresponds to an average density of one SSR every 4.1 kb of non-redundant EST sequences. Dinucleotide repeats were most abundant (38.4% of all SSRs) followed by trinucleotide repeats (38.1%). Flanking primer pairs were designed for 537 EST-SSR loci, and 49 of these were screened for their functionality in 12 accessions of A. comosus, 14 accessions of 5 additional Ananas species and 1 species of Pseudananas. Distinct PCR products of the expected size range were obtained with 36 primer pairs. Eighteen loci analyzed in more detail were all polymorphic in pineapple, and primer pairs flanking these loci also generated PCR products from a wide range of genera and species from six subfamilies of the Bromeliaceae. The potential to reveal polymorphism in a heterologous target species was demonstrated in Deuterocohnia brevifolia (subfamily Pitcairnioideae).
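    The repeat-unit thresholds quoted in this abstract (15 for mono-, 7 for di-, and 5 for tri- to hexanucleotide motifs) can be turned into a small SSR scanner. This is a hedged sketch using regular expressions, not the tool the authors used; the function names are hypothetical, and the scan ignores compound or imperfect repeats.

```python
import re

# Minimum number of repeat units per motif length, as stated in the abstract.
MIN_REPEATS = {1: 15, 2: 7, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k)
                   for k in range(1, n))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # e.g. skip "AA" reported as a dinucleotide
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def ssr_density_kb(total_kb: float, n_ssrs: int) -> float:
    """Average sequence length (kb) per SSR."""
    return total_kb / n_ssrs


print(find_ssrs("TTT" + "AG" * 7 + "CCC"))
print(round(ssr_density_kb(2840, 696), 1))  # figures from the abstract
```

    With the abstract's own numbers (696 SSRs in 2,840 kb), the density helper reproduces the reported value of roughly one SSR every 4.1 kb.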

  15. Census of genes expressed in porcine embryos and reproductive tissues by mining an expressed sequence tag database based on human genes.

    PubMed

    Jiang, Zhihua; Zhang, Ming; Wasem, Vaughn D; Michal, Jennifer J; Zhang, Hao; Wright, Raymond W

    2003-10-01

    A total of 98,898 expressed sequence tags (ESTs) derived from embryos and reproductive tissues in pigs were identified in the GenBank "est_others" database. Pig embryos were collected at 11, 12, 13, 14, 15, 20, 30, and 45 days after gestation. The reproductive tissues were sampled from testis, ovary, endometrium, hypothalamus, anterior pituitary, uterus, and placenta. A gene-oriented approach was developed to annotate these porcine EST sequences to census the genes expressed from these sources. Of the 33 308 mRNA sequences from the human genes used as references (data accessed on 1 November 2002), 9410 had the porcine EST homologs expressed in embryos and 11 795 had the EST homologs expressed in reproductive tissues. The entire genome contributes at least 28.3% of its genes to embryo development and 35.4% of its genes to reproduction. Using the EST entry numbers as indicators of gene expression, we determined that the gene expression patterns differ significantly between embryos and reproductive tissues in pigs. The basic active genes were identified for each source, but most of them are not coexpressed abundantly. Few genes were expressed on the Y chromosome (P < 0.01), but they may represent counterparts of the double-dose genes that remain active in an inactivated X chromosome in females but are needed for proper development and growth. The census provides a panel of transcripts in a broad sense that can be used as targets to study the mechanisms involved in embryo development and reproduction in pigs and other mammals, including humans.

  16. Tissue expression map of a large number of expressed sequence tags and its application to in silico screening of stress response genes in common wheat.

    PubMed

    Mochida, Keiichi; Kawaura, Kanako; Shimosaka, Etsuo; Kawakami, Naoto; Shin-I, Tadasu; Kohara, Yuji; Yamazaki, Yukiko; Ogihara, Yasunari

    2006-09-01

    In order to assess global changes in gene expression patterns in stress-induced tissues, we conducted large-scale analysis of expressed sequence tags (ESTs) in common wheat. Twenty-one cDNA libraries derived from stress-induced tissues, such as callus, as well as liquid cultures and abiotic stress conditions (temperature treatment, desiccation, photoperiod, moisture and ABA) were constructed. Several thousand colonies were randomly selected from each of these 21 cDNA libraries and sequenced from both the 5' and 3' ends. By computing abundantly expressed ESTs, correlated expression patterns of genes across the tissues were monitored. Furthermore, the relationships between gene expression profiles among the stress-induced tissues were inferred from the gene expression patterns. Multi-dimensional analysis of EST data is analogous to microarray experiments. As an example, genes specifically induced and/or suppressed by cold acclimation and heat-shock treatments were selected in silico. A total of 490 genes showing fivefold induction and 218 genes showing fivefold suppression relative to the control expression level were selected. These selected genes were annotated with the BLAST search. Furthermore, gene ontology annotation was conducted for these genes with the InterPro search. Because genes regulated in response to temperature treatment were successfully selected, this method can be applied to other stress-treated tissues. The method was then applied to screen genes responding to abiotic stresses such as drought and ABA treatments. In silico selection of screened genes from virtual display should provide a powerful tool for functional plant genomics.
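    The in silico selection of genes by fivefold induction or suppression can be sketched as a comparison of EST frequencies between a treated and a control library. This is an illustrative assumption about how such a screen might be computed, not the authors' procedure: the normalization by library size, the handling of zero counts, and all names below are hypothetical.

```python
from fractions import Fraction


def fold_change(treated_count, treated_size, control_count, control_size):
    """Ratio of EST frequencies (count / library size), treated over control.

    Returns None when either count is zero, since the ratio is then
    undefined or zero and needs a separate decision rule.
    """
    if treated_count == 0 or control_count == 0:
        return None
    return Fraction(treated_count * control_size, control_count * treated_size)


def classify(fc, threshold=5):
    """Label a gene by the fivefold rule mentioned in the abstract."""
    if fc is None:
        return "undetermined"
    if fc >= threshold:
        return "induced"
    if fc <= Fraction(1, threshold):
        return "suppressed"
    return "unchanged"


# Hypothetical counts: 50 ESTs in a 1000-clone treated library
# versus 10 ESTs in a 1000-clone control library.
print(classify(fold_change(50, 1000, 10, 1000)))
```

    Exact fractions are used so that a gene sitting precisely on the fivefold boundary is not misclassified by floating-point rounding.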

  17. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  18. Expressed sequence-tag analysis of ovaries of Brachiaria brizantha reveals genes associated with the early steps of embryo sac differentiation of apomictic plants.

    PubMed

    Silveira, Erica Duarte; Guimarães, Larissa Arrais; de Alencar Dusi, Diva Maria; da Silva, Felipe Rodrigues; Martins, Natália Florencio; do Carmo Costa, Marcos Mota; Alves-Ferreira, Márcio; de Campos Carneiro, Vera Tavares

    2012-02-01

    In apomixis, an asexual mode of plant reproduction through seeds, an unreduced megagametophyte is formed due to circumvented or altered meiosis. The embryo develops autonomously from the unreduced egg cell, independently of fertilization. Brachiaria is a genus of tropical forage grasses that reproduces sexually or by apomixis. A limited number of studies have reported the sequencing of apomixis-related genes and a few Brachiaria sequences have been deposited at genebank databases. This work shows sequencing and expression analyses of expressed sequence-tags (ESTs) of the Brachiaria genus and points to transcripts from ovaries with preferential expression at megasporogenesis in apomictic plants. Of the 11 differentially expressed sequences from immature ovaries of sexual and apomictic Brachiaria brizantha obtained from macroarray analysis, 9 were preferentially detected in ovaries of apomicts, as confirmed by RT-qPCR. A putative involvement in early steps of Panicum-type embryo sac differentiation is suggested for four sequences from B. brizantha ovaries: BbrizHelic, BbrizRan, BbrizSec13 and BbrizSti1. Two of these, BbrizSti1 and BbrizHelic, with similarity to a gene coding for a stress-induced protein and a helicase, respectively, are preferentially expressed in the early stages of apomictic ovary development, especially in the nucellus, at a stage preceding the differentiation of aposporous initials, as verified by in situ hybridization.

  19. Computational exploration of microRNAs from expressed sequence tags of Humulus lupulus, target predictions and expression analysis.

    PubMed

    Mishra, Ajay Kumar; Duraisamy, Ganesh Selvaraj; Týcová, Anna; Matoušek, Jaroslav

    2015-12-01

    Among computationally predicted and experimentally validated plant miRNAs, several are conserved across species boundaries in the plant kingdom. In this study, a combined experimental and in silico approach was adopted for the identification and characterization of miRNAs in Humulus lupulus (hop), which is widely cultivated for use by the brewing industry and is also used as a medicinal herb. A total of 22 miRNAs belonging to 17 miRNA families were identified in hop following a comparative computational approach and an EST-based homology search according to a series of filtering criteria. Selected miRNAs were validated by end-point PCR and quantitative reverse transcription-polymerase chain reaction (qRT-PCR), which confirmed the existence of conserved miRNAs in hop. Based on the characteristic that miRNAs exhibit perfect or nearly perfect complementarity with their targeted mRNA sequences, a total of 47 potential miRNA targets were identified in hop. Strikingly, the majority of predicted targets were transcription factors that could regulate hop growth and development, including leaf, root and even cone development. Moreover, the identified miRNAs may also be involved in other cellular and metabolic processes, such as stress response, signal transduction, and other physiological processes. Cis-regulatory elements relevant to biotic and abiotic stress, plant hormone response, and flavonoid biosynthesis were identified in the promoter regions of those miRNA genes. Overall, findings from this study will pave the way for further research on miRNAs and their functions in hop, and show a path for the prediction and analysis of miRNAs in species whose genomes are not available.
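
    The target-prediction principle stated in this abstract (perfect or nearly perfect complementarity between a miRNA and its target mRNA) can be sketched in Python as a sliding-window scan. This is an illustrative simplification, not the tool used in the study: `find_target_sites` and the `max_mismatches` default are hypothetical, and the sketch ignores G:U wobble pairing and any position-specific scoring that real predictors apply.

```python
# Translation table for RNA base complements.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA (or DNA) string, returned as RNA."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def find_target_sites(mirna, mrna, max_mismatches=3):
    """Return (position, mismatches) for windows of mrna near-complementary to mirna.

    max_mismatches is an assumed cutoff chosen for illustration only.
    """
    site = reverse_complement(mirna)  # what a perfectly matched target carries
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for i in range(len(mrna) - len(site) + 1):
        window = mrna[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Toy example: embed a perfect target site in a short artificial transcript.
mirna = "UGACAGAAGAGAGUGAGCAC"
transcript = "AAAA" + reverse_complement(mirna) + "CCCC"
print(find_target_sites(mirna, transcript))
```

    Applied to an EST collection, a scan of this kind would be run over every unigene, with the surviving hits passed on to annotation.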

  20. Analyses of expressed sequence tags from the maize foliar pathogen Cercospora zeae-maydis identify novel genes expressed during vegetative, infectious, and reproductive growth

    PubMed Central

    Bluhm, Burton H; Dhillon, Braham; Lindquist, Erika A; Kema, Gert HJ; Goodwin, Stephen B; Dunkle, Larry D

    2008-01-01

    Background The ascomycete fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial losses annually throughout the Western Hemisphere. Despite its impact on maize production, little is known about the regulation of pathogenesis in C. zeae-maydis at the molecular level. The objectives of this study were to generate a collection of expressed sequence tags (ESTs) from C. zeae-maydis and evaluate their expression during vegetative, infectious, and reproductive growth. Results A total of 27,551 ESTs was obtained from five cDNA libraries constructed from vegetative and sporulating cultures of C. zeae-maydis. The ESTs, grouped into 4088 clusters and 531 singlets, represented 4619 putative unique genes. Of these, 36% encoded proteins similar (E value ≤ 10^-5) to characterized or annotated proteins from the NCBI non-redundant database representing diverse molecular functions and biological processes based on Gene Ontology (GO) classification. We identified numerous, previously undescribed genes with potential roles in photoreception, pathogenesis, and the regulation of development as well as Zephyr, a novel, actively transcribed transposable element. Differential expression of selected genes was demonstrated by real-time PCR, supporting their proposed roles in vegetative, infectious, and reproductive growth. Conclusion Novel genes that are potentially involved in regulating growth, development, and pathogenesis were identified in C. zeae-maydis, providing specific targets for characterization by molecular genetics and functional genomics. The EST data establish a foundation for future studies in evolutionary and comparative genomics among species of Cercospora and other groups of plant pathogenic fungi. PMID:18983654

  1. A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

    PubMed Central

    2012-01-01

    Background Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis. Results To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. The expression of five of the 19 putative stress response

  2. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    PubMed Central

    Lu, Chaofu; Wallis, James G; Browse, John

    2007-01-01

    Background Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at the Institute for Genome

  3. Preparing and Analyzing Expressed Sequence Tags (ESTs) Library for the Mammary Tissue of Local Turkish Kivircik Sheep.

    PubMed

    Ozdemir Ozgenturk, Nehir; Omeroglu Ulu, Zehra; Ulu, Salih; Un, Cemal; Ozdem Oztabak, Kemal; Altunatmaz, Kemal

    2017-01-01

    Kivircik sheep is an important local Turkish breed owing to its meat quality and milk productivity. The aim of this study was to analyze gene expression profiles of both prenatal and postnatal stages for the Kivircik sheep. Therefore, two different cDNA libraries, which were taken from the same Kivircik sheep mammary gland tissue at prenatal and postnatal stages, were constructed. A total of 3072 colonies, randomly selected from the two libraries, were sequenced to develop a sheep EST collection. We used the Phred/Phrap computer programs for analysis of the raw ESTs, and readable EST sequences were assembled with the CAP3 software. Putative functions of all unique sequences and statistical analysis were determined by Geneious software. A total of 422 ESTs have over 80% similarity to known sequences of other organisms in NCBI, classified by the Panther database for the Gene Ontology (GO) category. By comparing gene expression profiles, we observed some putative genes that may be related to reproductive performance or play important roles in milk synthesis and secretion. A total of 2414 ESTs have been deposited to the NCBI GenBank database (GW996847-GW999260). EST data in this study have provided a new source of information to functional genome studies of sheep.

  4. Preparing and Analyzing Expressed Sequence Tags (ESTs) Library for the Mammary Tissue of Local Turkish Kivircik Sheep

    PubMed Central

    Omeroglu Ulu, Zehra; Ulu, Salih; Un, Cemal; Ozdem Oztabak, Kemal; Altunatmaz, Kemal

    2017-01-01

    Kivircik sheep is an important local Turkish breed owing to its meat quality and milk productivity. The aim of this study was to analyze gene expression profiles of both prenatal and postnatal stages for the Kivircik sheep. Therefore, two different cDNA libraries, which were taken from the same Kivircik sheep mammary gland tissue at prenatal and postnatal stages, were constructed. A total of 3072 colonies, randomly selected from the two libraries, were sequenced to develop a sheep EST collection. We used the Phred/Phrap computer programs for analysis of the raw ESTs, and readable EST sequences were assembled with the CAP3 software. Putative functions of all unique sequences and statistical analysis were determined by Geneious software. A total of 422 ESTs have over 80% similarity to known sequences of other organisms in NCBI, classified by the Panther database for the Gene Ontology (GO) category. By comparing gene expression profiles, we observed some putative genes that may be related to reproductive performance or play important roles in milk synthesis and secretion. A total of 2414 ESTs have been deposited to the NCBI GenBank database (GW996847–GW999260). EST data in this study have provided a new source of information to functional genome studies of sheep. PMID:28239610

  5. Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource

    PubMed Central

    Adzhubei, Alexei A; Vlasova, Anna V; Hagen-Larsen, Heidi; Ruden, Torgeir A; Laerdahl, Jon K; Høyheim, Bjørn

    2007-01-01

    Background To identify as many different transcripts/genes in the Atlantic salmon genome as possible, it is crucial to acquire good cDNA libraries from different tissues and developmental stages, their relevant sequences (ESTs or full length sequences) and attempt to predict function. Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage. This data is important in constructing a microarray chip, identifying SNPs in coding regions, and for future identification of genes in the whole genome sequence. An important factor that determines the usefulness of generated data for biologists is efficient data access. Public searchable databases play a crucial role in providing such service. Description Twenty-three Atlantic salmon cDNA libraries were constructed from 15 tissues, yielding nearly 155,000 clones. From these libraries 58,109 ESTs were generated, of which 57,212 were used for contig assembly. Following deletion of mitochondrial sequences 55,118 EST sequences were submitted to GenBank. In all, 20,019 unique sequences, consisting of 6,424 contigs and 13,595 singlets, were generated. The Norwegian Salmon Genome Project Database has been constructed and annotation performed by the annotation transfer approach. Annotation was successful for 50.3% (10,075) of the sequences and 6,113 sequences (30.5%) were annotated with Gene Ontology terms for molecular function, biological process and cellular component. Conclusion We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts. These sequences represent 97% of all sequences submitted to GenBank from the pre-smoltification stage. The data has been grouped into datasets according to its source and type of annotation. Various data query options are offered

  6. Use of expressed sequence tag analysis and cDNA microarrays of the filamentous fungus Aspergillus nidulans.

    PubMed

    Sims, Andrew H; Robson, Geoffrey D; Hoyle, David C; Oliver, Stephen G; Turner, Geoffrey; Prade, Rolf A; Russell, Hugh H; Dunn-Coleman, Nigel S; Gent, Manda E

    2004-02-01

    The use of microarrays in the analysis of gene expression is becoming widespread for many organisms, including yeast. However, although the genomes of a number of filamentous fungi have been fully or partially sequenced, microarray analysis is still in its infancy in these organisms. Here, we describe the construction and validation of microarrays for the fungus Aspergillus nidulans using PCR products from a 4092 EST conidial germination library. An experiment was designed to validate these arrays by monitoring the expression profiles of known genes following the addition of 1% (w/v) glucose to wild-type A. nidulans cultures grown to mid-exponential phase in Vogel's minimal medium with ethanol as the sole carbon source. The profiles of genes showing statistically significant differential expression following the glucose up-shift are presented and an assessment of the quality and reproducibility of the A. nidulans arrays discussed.

  7. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-06-21

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility.

  8. Identification of differentially expressed transcripts from maturing stem of sugarcane by in silico analysis of stem expressed sequence tags and gene expression profiling.

    PubMed

    Casu, Rosanne E; Dimmock, Christine M; Chapman, Scott C; Grof, Christopher P L; McIntyre, C Lynne; Bonnett, Graham D; Manners, John M

    2004-03-01

    Sugarcane accumulates high concentrations of sucrose in the mature stem and a number of physiological processes on-going in maturing stem tissue both directly and indirectly allow this process. To identify transcripts that are associated with stem maturation, we compared patterns of gene expression in maturing and immature stem tissue by expression profiling and bioinformatic analysis of sets of stem ESTs. This study complements a previous study of gene expression associated directly with sugar metabolism in sugarcane. A survey of sequences derived from stem tissue identified an abundance of several classes of sequence that are associated with fibre biosynthesis in the maturing stem. A combination of EST analyses and microarray hybridization revealed that genes encoding homologues of the dirigent protein, a protein that assists in the stereospecificity of lignin assembly, were the most abundant and most strongly differentially expressed transcripts in maturing stem tissue. There was also evidence of coordinated expression of other categories of fibre biosynthesis and putative defence- and stress-related transcripts in the maturing stem. This study has demonstrated the utility of genomic approaches using large-scale EST acquisition and microarray hybridization techniques to highlight the very significant transcriptional investment the maturing stem of sugarcane has placed in fibre biosynthesis and stress tolerance, in addition to its already well-documented role in sugar accumulation.

  9. Generation and Analysis of Expressed Sequence Tags (ESTs) from Halophyte Atriplex canescens to Explore Salt-Responsive Related Genes

    PubMed Central

    Li, Jingtao; Sun, Xinhua; Yu, Gang; Jia, Chengguo; Liu, Jinliang; Pan, Hongyu

    2014-01-01

    Little information is available on gene expression profiling of halophyte A. canescens. To elucidate the molecular mechanism for stress tolerance in A. canescens, a full-length complementary DNA library was generated from A. canescens exposed to 400 mM NaCl, and provided 343 high-quality ESTs. In an evaluation of 343 valid EST sequences in the cDNA library, 197 unigenes were assembled, among which 190 unigenes (83.1% ESTs) were identified according to their significant similarities with proteins of known functions. All the 343 EST sequences have been deposited in the dbEST GenBank under accession numbers JZ535802 to JZ536144. According to Arabidopsis MIPS functional category and GO classifications, we identified 193 unigenes of the 311 annotations EST, representing 72 non-redundant unigenes sharing similarities with genes related to the defense response. The sets of ESTs obtained provide a rich genetic resource and 17 up-regulated genes related to salt stress resistance were identified by qRT-PCR. Six of these genes may contribute crucially to earlier and later stage salt stress resistance. Additionally, among the 343 unigenes sequences, 22 simple sequence repeats (SSRs) were also identified contributing to the study of A. canescens resources. PMID:24960361

  10. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Jingjie, Hu; Xiaolong, Wang; Xiaoli, Hu; Zhenmin, Bao

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  11. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Hu, Jingjie; Wang, Xiaolong; Hu, Xiaoli; Bao, Zhenmin

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  12. Sequence tagging reveals unexpected modifications in toxicoproteomics.

    PubMed

    Dasari, Surendra; Chambers, Matthew C; Codreanu, Simona G; Liebler, Daniel C; Collins, Ben C; Pennington, Stephen R; Gallagher, William M; Tabb, David L

    2011-02-18

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here, we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty-five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications.

  13. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

  14. Expressed sequence tags from cephalic chemosensory organs of the northern walnut husk fly, Rhagoletis suavis, including a putative canonical odorant receptor.

    PubMed

    Ramsdell, Karlene M M; Lyons-Sobaski, Sheila A; Robertson, Hugh M; Walden, Kimberly K O; Feder, Jeffrey L; Wanner, Kevin; Berlocher, Stewart H

    2010-01-01

    Rhagoletis fruit flies are important both as major agricultural pests and as model organisms for the study of adaptation to new host plants and host race formation. Response to fruit odor plays a critical role in such adaptation. To better understand olfaction in Rhagoletis, an expressed sequence tag (EST) study was carried out on the antennae and maxillary palps of Rhagoletis suavis (Loew) (Diptera: Tephritidae), a common pest of walnuts in eastern United States. After cDNA cloning and sequencing, 544 ESTs were annotated. Of these, 66% had an open reading frame and could be matched to a previously sequenced gene. Based on BLAST sequence homology, 9% (49 of 544 sequences) were nuclear genes potentially involved in olfaction. The most significant finding is a putative odorant receptor (OR), RSOr1, that is homologous to Drosophila melanogaster Or49a and Or85f. This is the first tephritid OR discovered that might recognize a specific odorant. Other olfactory genes recovered included odorant binding proteins, chemosensory proteins, and putative odorant degrading enzymes.

  15. Expressed sequence tags from normalized cDNA libraries prepared from gill and hypodermal tissues of the blue crab, Callinectes sapidus.

    PubMed

    Coblentz, Francie E; Towle, David W; Shafer, Thomas H

    2006-06-01

    Expressed sequence tags (ESTs) were produced from two normalized cDNA libraries from the blue crab, Callinectes sapidus. The gill library represented pooled RNA from respiratory and transporting gills after acclimation to either high or low salinity. The hypodermis library was from arthrodial and dorsal tissue from both pre- and post-molt crabs. Random clones were single-pass sequenced from the 5'-ends, resulting in 11,761 high quality ESTs averaging 652 bases. All the ESTs were assembled using Paracel Transcript Assembler software, producing 2176 potential transcripts (883 contigs and 1293 singlets). Of these, 1235 (56.7%) were sequenced only from the gill library, while 578 (26.6%) were exclusively hypodermal. There were 363 contigs containing ESTs from both tissues (16.7% of the putative transcripts). All contigs and singlets were compared to the public protein database using BLASTx, and descriptions of the three most similar proteins for each were recorded. Additional annotations included an Interpro analysis of protein domains and a listing of Gene Ontology (GO) categories inferred from similar proteins in GO-annotated databases. All sequences are available on a web page (http://firedev.bear.uncw.edu:8080/shaferlab/). The annotations can be searched, and BLAST alignment of user-inputted sequences against the putative transcripts is possible. In addition, the ESTs have been submitted to GenBank.
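
    The tissue breakdown in this abstract (gill-only, hypodermis-only, and shared transcripts, which sum to 1235 + 578 + 363 = 2176) amounts to classifying each assembled transcript by the libraries its member ESTs came from. A minimal Python sketch of that bookkeeping follows; the function names and the toy input are hypothetical and are not taken from the Paracel Transcript Assembler output format.

```python
from collections import Counter


def classify_by_library(transcripts):
    """Count transcripts by library of origin.

    transcripts maps a transcript id to the library labels of its member ESTs
    (each transcript is assumed to contain at least one EST).
    """
    counts = Counter()
    for libraries in transcripts.values():
        libraries = set(libraries)
        # More than one distinct library means the transcript is shared.
        label = "both" if len(libraries) > 1 else next(iter(libraries))
        counts[label] += 1
    return counts


def shares(counts):
    """Convert counts to percentages of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Toy assembly: two contigs and two singlets.
toy = {
    "contig1": ["gill", "gill"],
    "contig2": ["gill", "hypodermis"],
    "singlet1": ["hypodermis"],
    "singlet2": ["gill"],
}
counts = classify_by_library(toy)
print(counts, shares(counts))
```

    By construction a singlet can never fall in the shared class, which is consistent with the abstract reporting the 363 shared transcripts as contigs.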

  16. Construction of cDNA library and preliminary analysis of expressed sequence tags from tea plant [Camellia sinensis (L) O. Kuntze].

    PubMed

    Phukon, Munmi; Namdev, Richa; Deka, Diganta; Modi, Mahendra K; Sen, Priyabrata

    2012-09-10

    Tea is the most popular non-alcoholic and healthy beverage across the world. An understanding of the genetic organization and molecular biology of the tea plant, which is very poorly understood at present, is required for a quantum increase in productivity and efficient use of germplasm for either cultivation or breeding programs. Single-pass sequencing of randomly selected cDNA clones is the most widely accepted technique for gene identification and cloning. In the present study, a good quality cDNA library was constructed and preliminary analysis of ESTs was carried out. The titers of unamplified and amplified libraries were 1.4 × 10^6 pfu/ml and 5.27 × 10^8 pfu/ml, respectively. A total of 210 cDNA clones from the constructed cDNA library were sequenced and analyzed. A total of 84 high quality Expressed Sequence Tags (ESTs) were generated, among which 71 ESTs had significant homology with sequences in the NCBI non-redundant protein database by BLASTX analysis. About 80% of ESTs had a poly (A) tail at the 3' end, indicating that the cDNAs were full length. The database-matched ESTs were classified into putative cellular roles, viz. energy-related category (corresponding to 20% of total BLASTX matched ESTs), transcription (14.2%), protein synthesis (14.2%), cell growth and division (8.6%), cell structure (5.7%), signal transduction (5.7%), transporters (2.9%), disease and defenses (2.9%), secondary metabolism (2.9%) and gene regulation (2.9%). This study provides an overview of the mRNA expression profile and first-hand information on gene sequences expressed in tender leaves and apical buds of the tea plant.

  17. A White Campion (Silene latifolia) floral expressed sequence tag (EST) library: annotation, EST-SSR characterization, transferability, and utility for comparative mapping

    PubMed Central

    Moccia, Maria Domenica; Oger-Desfeux, Christine; Marais, Gabriel AB; Widmer, Alex

    2009-01-01

    Background Expressed sequence tag (EST) databases represent a valuable resource for the identification of genes in organisms with uncharacterized genomes and for development of molecular markers. One class of markers derived from EST sequences are simple sequence repeat (SSR) markers, also known as EST-SSRs. These are useful in plant genetic and evolutionary studies because they are located in transcribed genes and a putative function can often be inferred from homology searches. Another important feature of EST-SSR markers is their expected high level of transferability to related species that makes them very promising for comparative mapping. In the present study we constructed a normalized EST library from floral tissue of Silene latifolia with the aim to identify expressed genes and to develop polymorphic molecular markers. Results We obtained a total of 3662 high quality sequences from a normalized Silene cDNA library. These represent 3105 unigenes, with 73% of unigenes matching genes in other species. We found 255 sequences containing one or more SSR motifs. More than 60% of these SSRs were trinucleotides. A total of 30 microsatellite loci were identified from 106 ESTs having sufficient flanking sequences for primer design. The inheritance of these loci was tested via segregation analyses and their usefulness for linkage mapping was assessed in an interspecific cross. Tests for cross-amplification of the EST-SSR loci in other Silene species established their applicability to related species. Conclusion The newly characterized genes and gene-derived markers from our Silene EST library represent a valuable genetic resource for future studies on Silene latifolia and related species. The polymorphism and transferability of EST-SSR markers facilitate comparative linkage mapping and analyses of genetic diversity in the genus Silene. PMID:19467153

  18. Transcriptional Regulations on the Low-Temperature-Induced Floral Transition in an Orchidaceae Species, Dendrobium nobile: An Expressed Sequence Tags Analysis

    PubMed Central

    Liang, Shan; Ye, Qing-Sheng; Li, Rui-Hong; Leng, Jia-Yi; Li, Mei-Ru; Wang, Xiao-Jing; Li, Hong-Qing

    2012-01-01

    Vernalization-induced flowering is a cold-relevant adaptation in many species, but little is known about its genetic basis in Orchidaceae species. Here, we report a collection of 15017 expressed sequence tags (ESTs) from the vernalized axillary buds of an Orchidaceae species, Dendrobium nobile, which were assembled into 9616 unique gene clusters. Functional enrichment analysis showed that genes related to stress responses, especially responses to low temperatures, and those involved in protein biosynthesis and chromatin assembly were significantly overrepresented during 40 days of vernalization. Additionally, a total of 59 putative flowering-relevant genes were recognized, including those homologous to known key players in vernalization pathways in temperate cereals or Arabidopsis, such as cereal VRN1, FT/VRN3, and Arabidopsis AGL19. Results from this study suggest that the networks regulating vernalization-induced floral transition are only partially conserved among D. nobile, temperate cereals, and Arabidopsis. PMID:22550428

  19. Development of expressed sequence tag-based microsatellite markers for the critically endangered Isoëtes sinensis (Isoetaceae) based on transcriptome analysis.

    PubMed

    Gichira, A W; Long, Z C; Wang, Q F; Chen, J M; Liao, K

    2016-07-15

    Isoëtes sinensis is a critically endangered quillwort. To facilitate studies on the conservation genetics of this species, we developed expressed sequence tag-simple sequence repeat (EST-SSR) markers. A total of 50,063 unigenes were predicted by transcriptome sequencing, 5294 (10.6%) of which significantly matched 3011 Gene Ontology annotations and 2363 were assigned to Kyoto Encyclopedia of Genes and Genomes metabolic pathways. Most of these (2297) were involved in metabolism. A total of 1982 SSR motifs were identified, with trinucleotides being the dominant repeat motif, and 1438 (72.6%) SSR primers were designed. Eighteen randomly selected primer pairs were used to genotype 24 I. sinensis accessions, which confirmed the suitability of these novel markers for molecular studies of I. sinensis. The heterozygosity index value ranged between 0.0799 and 0.9106, while the Shannon-Wiener diversity index value ranged between 0.1732 and 2.5589. The EST-SSRs reported in this study are linked to genic sequences, and are therefore ideal for investigating the evolutionary history of I. sinensis. These markers, together with the large EST dataset generated in this study, will greatly facilitate conservation genetic studies of I. sinensis.

  20. Generation and analysis of expressed sequence tags (ESTs) of Camelina sativa to mine drought stress-responsive genes.

    PubMed

    Kanth, Bashistha Kumar; Kumari, Shipra; Choi, Seo Hee; Ha, Hye-Jeong; Lee, Geung-Joo

    2015-11-06

    Camelina sativa is an oil-producing crop belonging to the family Brassicaceae. Due to its exceptionally high content of omega fatty acids, it is commercially grown around the world for edible oil, biofuel, and animal feed. Commonly referred to as 'false flax' or gold-of-pleasure, Camelina sativa has attracted interest as a biofuel feedstock. The species can grow on marginal land due to its superior drought tolerance and low requirement for agricultural inputs. This crop has been underexploited due to very limited transcriptomic and genomic data. Use of gene-specific molecular markers is an important strategy for new cultivar development in breeding programs. In this study, Illumina paired-end sequencing technology and bioinformatics tools were used to obtain expression profiles of genes responding to drought stress in Camelina sativa BN14. A total of more than 60,000 loci were assembled, corresponding to approximately 275 K transcripts. When the species was exposed to 10 kPa drought stress, 100 kPa drought stress, and rehydrated conditions, a total of 107, 2,989, and 982 genes, respectively, were up-regulated, while 146, 3,659, and 1,189 genes, respectively, were down-regulated compared to the control condition. Some unknown genes were found to be highly expressed under drought conditions, together with some already reported gene families such as senescence-associated genes, CAP160, and LEA under the 100 kPa soil water condition, and cysteine protease, 2OG, Fe(II)-dependent oxygenase, and RAD-like 1 under the rehydrated condition. These genes will be further validated and mapped to determine their function and loci. This EST library can be applied to develop gene-specific molecular markers and to discover genes responsible for drought tolerance in Camelina species.

  1. Transcript identification by analysis of short sequence tags--influence of tag length, restriction site and transcript database.

    PubMed

    Unneberg, Per; Wennborg, Anders; Larsson, Magnus

    2003-04-15

    There exist a number of gene expression profiling techniques that utilize restriction enzymes for generation of short expressed sequence tags. We have studied how the choice of restriction enzyme influences various characteristics of tags generated in an experiment. We have also investigated various aspects of in silico transcript identification that these profiling methods rely on. First, analysis of 14 248 mRNA sequences derived from the RefSeq transcript database showed that 1-30% of the sequences lack a given restriction enzyme recognition site. Moreover, 1-5% of the transcripts have recognition sites located less than 10 bases from the poly(A) tail. The uniqueness of 10 bp tags lies in the range 90-95%, which increases only slightly with longer tags, due to the existence of closely related transcripts. Furthermore, 3-30% of upstream 10 bp tags are identical to 3' tags, introducing a risk of misclassification if upstream tags are present in a sample. Second, we found that a sequence length of 16-17 bp, including the recognition site, is sufficient for unique transcript identification by BLAST based sequence alignment to the UniGene Human non-redundant database. Third, we constructed a tag-to-gene mapping for UniGene and compared it to an existing mapping database. The mappings agreed to 79-83%, where the selection of representative sequences in the UniGene clusters is the main cause of the disagreement. The results of this study may serve to improve the interpretation of sequence-based expression studies and the design of hybridization arrays, by identifying short tags that have a high reliability and separating them from tags that carry an inherent ambiguity in their capacity to discriminate between genes. To this end, supplementary information in the form of a web companion to this paper is located at http://biobase.biotech.kth.se/tagseq.
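
    The in silico tag extraction discussed in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the recognition site "CATG", the function names, and the definition of uniqueness (the fraction of tagged transcripts whose tag is shared with no other transcript) are all choices made here for the example.

```python
from collections import defaultdict

def three_prime_tag(transcript, site="CATG", tag_len=10):
    """Return the tag_len bases immediately downstream of the 3'-most site.

    Returns None when the site is absent or lies too close to the 3' end
    to yield a full-length tag. The default site is an assumption.
    """
    seq = transcript.upper()
    pos = seq.rfind(site)
    if pos == -1:
        return None
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None

def tag_uniqueness(transcripts, site="CATG", tag_len=10):
    """Return (fraction of tagged transcripts with an unshared tag, tag -> names)."""
    tag_to_names = defaultdict(list)
    for name, seq in transcripts.items():
        tag = three_prime_tag(seq, site=site, tag_len=tag_len)
        if tag is not None:
            tag_to_names[tag].append(name)
    tagged = sum(len(names) for names in tag_to_names.values())
    unique = sum(1 for names in tag_to_names.values() if len(names) == 1)
    fraction = unique / tagged if tagged else 0.0
    return fraction, dict(tag_to_names)
```

    Rerunning tag_uniqueness with a larger tag_len on the same transcript set is one way to explore how tag length affects ambiguity, which is the kind of comparison the abstract reports.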

  2. Construction of a full-length cDNA library and preliminary analysis of expressed sequence tags from lymphocytes of half-pipe snowboarding athletes.

    PubMed

    Zhao, Y H; Zhang, Z B; Zhao, C Q; Zhang, Y; Wang, Y F; Guan, W J; Zhu, Z Q

    2015-10-21

    The genes of top athletes are a valuable genetic resource for the human race, and could be exploited to identify novel genes related to sports ability, as well as other functions. We analyzed the expressed sequence tags from top half-pipe snowboarding athletes using the SMART complementary DNA (cDNA) library construction method to elucidate the characteristics of the athlete genome and the differential expression of the genes it contains. Overall, we established a full-length cDNA library from the lymphocytes of half-pipe snowboarding athletes and analyzed the inserted gene fragments. We also classified those genes according to molecular function, biological characteristics, cellular composition, protein types, and signal paths. A total of 201 functional genes were noted, which were distributed in 27 pathways. TXN, MDH1, ARL1, ARPC3, ACTG1, and other genes measured in sequence may be associated with physical ability. This suggests that the SMART cDNA library constructed from the genetic material from top athletes is an effective tool for preserving genetic sports resources and providing genetic markers of physical ability for athlete selection.

  3. Identification of stress-induced genes from the drought-tolerant plant Prosopis juliflora (Swartz) DC. through analysis of expressed sequence tags.

    PubMed

    George, Suja; Venkataraman, Gayatri; Parida, Ajay

    2007-05-01

    Abiotic stresses such as cold, salinity, drought, wounding, and heavy metal contamination adversely affect crop productivity throughout the world. Prosopis juliflora is a phreatophyte that can tolerate severe adverse environmental conditions such as drought, salinity, and heavy metal contamination. As a first step towards the characterization of genes that contribute to combating abiotic stress, construction and analysis of a cDNA library of P. juliflora genes is reported here. Random expressed sequence tag (EST) sequencing of 1750 clones produced 1467 high-quality reads. These clones were classified into functional categories, and BLAST comparisons revealed that 114 clones were homologous to genes implicated in stress response(s) and included heat shock proteins, metallothioneins, lipid transfer proteins, and late embryogenesis abundant proteins. Of the ESTs analyzed, 26% showed homology to previously uncharacterized genes in the databases. Fifty-two clones from this category were selected for reverse Northern analysis: 21 were shown to be upregulated and 16 downregulated. The results obtained by reverse Northern analysis were confirmed by Northern analysis. Clustering of the 1467 ESTs produced a total of 295 contigs encompassing 790 ESTs, resulting in a 54.2% redundancy. Two of the abundant genes coding for a nonspecific lipid transfer protein and late embryogenesis abundant protein were sequenced completely. Northern analysis (after polyethylene glycol stress) of the 2 genes was carried out. The implications of the analyzed genes in abiotic stress tolerance are also discussed.

  4. Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

    PubMed Central

    2009-01-01

    Background Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly-available Website http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico, single-nucleotide polymorphisms using existing and newly-generated EST resources for the Pacific oyster. Conclusion A publicly-available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as

  5. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum).

    PubMed

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs, approximately 21 nt in length, which regulate the expression of target genes primarily at the post-transcriptional level. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, an economically important small fruit crop, remain unknown. In this study, we report a computational identification of miRNAs and their targets in blueberry. Using an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic processes. These findings will greatly contribute to future research on the functions and regulatory mechanisms of blueberry miRNAs.
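
    Target prediction "based on sequence complementarity" can be illustrated with a simple sliding-window scan. This is only a sketch under assumptions: the function names and the mismatch cutoff of 3 are hypothetical, G:U wobble pairs and positional weighting are ignored, and the abstract does not state which tool or scoring scheme the authors used.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_target_sites(mirna, transcript, max_mismatches=3):
    """Return (position, mismatches) for windows nearly complementary to mirna.

    Both inputs may be given as DNA or RNA; T is converted to U.
    max_mismatches is an illustrative cutoff, not a published criterion.
    """
    mirna = mirna.upper().replace("T", "U")
    transcript = transcript.upper().replace("T", "U")
    # The perfectly complementary site, read 5'->3' on the transcript.
    site = reverse_complement_rna(mirna)
    n = len(site)
    hits = []
    for i in range(len(transcript) - n + 1):
        window = transcript[i:i + n]
        mismatches = sum(a != b for a, b in zip(window, site))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits
```

    Each candidate miRNA would be scanned against every EST in this way, and ESTs with at least one hit would be retained as putative targets for annotation.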

  6. An expressed sequence tag analysis of the intertidal brown seaweeds Fucus serratus (L.) and F. vesiculosus (L.) (Heterokontophyta, Phaeophyceae) in response to abiotic stressors.

    PubMed

    Pearson, Gareth A; Hoarau, Galice; Lago-Leston, Asuncion; Coyer, James A; Kube, Michael; Reinhardt, Richard; Henckel, Kolja; Serrão, Ester T A; Corre, Erwan; Olsen, Jeanine L

    2010-04-01

    In order to aid gene discovery and uncover genes responding to abiotic stressors in stress-tolerant brown algae of the genus Fucus, expressed sequence tags (ESTs) were studied in two species, Fucus serratus and Fucus vesiculosus. Clustering of over 12,000 ESTs from three libraries for heat shock/recovery and desiccation/rehydration resulted in identification of 2,503, 1,290, and 2,409 unigenes from heat-shocked F. serratus, desiccated F. serratus, and desiccated F. vesiculosus, respectively. Low overall annotation rates (18-31%) were strongly associated with the presence of long 3' untranslated regions in Fucus transcripts, as shown by analyses of predicted protein-coding sequence in annotated and nonannotated tentative consensus sequences. Posttranslational modification genes were overrepresented in the heat shock/recovery library, including many chaperones, the most abundant of which were a family of small heat shock protein transcripts, Hsp90 and Hsp70 members. Transcripts of LI818-like light-harvesting genes implicated in photoprotection were also expressed during heat shock in high light. The expression of several heat-shock-responsive genes was confirmed by quantitative reverse transcription polymerase chain reaction. However, candidate genes were notably absent from both desiccation/rehydration libraries, while the responses of the two species to desiccation were divergent, perhaps reflecting the species-specific physiological differences in stress tolerance previously established. Desiccation-tolerant F. vesiculosus overexpressed at least 17 ribosomal protein genes and two ubiquitin-ribosomal protein fusion genes, suggesting that ribosome function and/or biogenesis are important during cycles of rapid desiccation and rehydration in the intertidal zone and possibly indicate parallels with other poikilohydric organisms such as desiccation-tolerant bryophytes.

  7. Expressed sequence tags from larval gut of the European corn borer (Ostrinia nubilalis): Exploring candidate genes potentially involved in Bacillus thuringiensis toxicity and resistance

    PubMed Central

    Khajuria, Chitvan; Zhu, Yu Cheng; Chen, Ming-Shun; Buschman, Lawrent L; Higgins, Randall A; Yao, Jianxiu; Crespo, Andre LB; Siegfried, Blair D; Muthukrishnan, Subbaratnam; Zhu, Kun Yan

    2009-01-01

    Background Lepidoptera represents more than 160,000 insect species which include some of the most devastating pests of crops, forests, and stored products. However, the genomic information on lepidopteran insects is very limited. Only a few studies have focused on developing expressed sequence tag (EST) libraries from the guts of lepidopteran larvae. Knowledge of the genes that are expressed in the insect gut is crucial for understanding the basic physiology of food digestion and interactions with Bacillus thuringiensis (Bt) toxins, and for discovering new targets for novel toxins for use in pest management. This study analyzed the ESTs generated from the larval gut of the European corn borer (ECB, Ostrinia nubilalis), one of the most destructive pests of corn in North America and the western world. Our goals were to establish an ECB larval gut-specific EST database as a genomic resource for future research and to explore candidate genes potentially involved in insect-Bt interactions and Bt resistance in ECB. Results We constructed two cDNA libraries from the guts of the fifth-instar larvae of ECB and sequenced a total of 15,000 ESTs from these libraries. A total of 12,519 ESTs (83.4%) appeared to be high quality with an average length of 656 bp. These ESTs represented 2,895 unique sequences, including 1,738 singletons and 1,157 contigs. Among the unique sequences, 62.7% encoded putative proteins that shared significant sequence similarities (E-value ≤ 10^-3) with the sequences available in GenBank. Our EST analysis revealed 52 candidate genes that potentially have roles in Bt toxicity and resistance. These genes encode 18 trypsin-like proteases, 18 chymotrypsin-like proteases, 13 aminopeptidases, 2 alkaline phosphatases and 1 cadherin-like protein. Comparisons of expression profiles of 41 selected candidate genes between Cry1Ab-susceptible and resistant strains of ECB by RT-PCR showed apparently decreased expressions in 2 trypsin-like and 2 chymotrypsin

  8. Computational identification and characterization of conserved miRNAs and their target genes in garlic (Allium sativum L.) expressed sequence tags.

    PubMed

    Panda, Debashis; Dehury, Budheswar; Sahu, Jagajjit; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra K

    2014-03-10

    Endogenous microRNAs (miRNAs) are small non-coding functional RNAs, ~21 to 24 nucleotides in length, that play a pivotal role in gene expression in plants and animals by silencing genes, either by degrading homologous mRNA or by blocking its translation. Although various high-throughput but time-consuming and expensive techniques such as forward genetics and direct cloning are employed to detect miRNAs in plants, comparative genomics complemented with novel bioinformatic tools paves the way for efficient and cost-effective identification of miRNAs through homologous sequence searches against previously known miRNAs. In this study, an attempt was made to identify and characterize conserved miRNAs in garlic expressed sequence tags (ESTs) through computational means. For identification of novel miRNAs in garlic, a total of 3227 known mature miRNAs of the plant kingdom (Viridiplantae) were searched for homology against 21,637 EST sequences, resulting in identification of 6 potential miRNA candidates belonging to 6 different miRNA families. The psRNATarget server predicted 33 potential target genes and their probable functions for the six identified miRNA families in garlic. Most of the garlic miRNA target genes seem to encode transcription factors as well as genes involved in stress response, metabolism, plant growth and development. The results from the present study will shed more light on the molecular mechanisms of miRNA in garlic, which may aid in the development of novel and precise techniques to understand post-transcriptional gene silencing mechanisms related to stress tolerance.

  9. Bioinformatic analysis of fruit-specific expressed sequence tag libraries of Diospyros kaki Thunb.: view at the transcriptome at different developmental stages.

    PubMed

    Sablok, Gaurav; Luo, Chun; Lee, Wan Sin; Rahman, Farzana; Tatarinova, Tatiana V; Harikrishna, Jennifer Ann; Luo, Zhengrong

    2011-07-01

    We present here a systematic analysis of the Diospyros kaki expressed sequence tags (ESTs) generated from development stage-specific libraries. A total of 2,529 putative tentative unigenes were identified in the MF library whereas the OYF library displayed 3,775 tentative unigenes. Of the two cDNA libraries, the MF library contained 325 EST-simple sequence repeats (SSRs) in 296 putative unigenes, an occurrence of 11.7% with a frequency of 1 SSR/3.16 kb, whereas the OYF library had an EST-SSR occurrence of 10.8%, with 407 EST-SSRs in 352 putative unigenes and a frequency of 1 SSR/2.92 kb. We observed a higher frequency of SNPs and indels in the OYF library (20.94 SNPs/indels per 100 bp), whereas the MF library showed a relatively lower frequency (0.74 SNPs/indels per 100 bp). A combined homology and secondary structure analysis approach identified a potential miRNA precursor, an ortholog of miR159, and potential miR159 targets, in the development-specific ESTs of D. kaki. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s13205-011-0005-9) contains supplementary material, which is available to authorized users.

  10. Analysis and functional annotation of expressed sequence tags from in vitro cell lines of elasmobranchs: Spiny dogfish shark (Squalus acanthias) and little skate (Leucoraja erinacea).

    PubMed

    Parton, Angela; Bayne, Christopher J; Barnes, David W

    2010-09-01

    Elasmobranchs are the most commonly used experimental models among the jawed, cartilaginous fish (Chondrichthyes). Previously we developed cell lines from embryos of two elasmobranchs, Squalus acanthias the spiny dogfish shark (SAE line), and Leucoraja erinacea the little skate (LEE-1 line). From these lines cDNA libraries were derived and expressed sequence tags (ESTs) generated. From the SAE cell line 4303 unique transcripts were identified, with 1848 of these representing unknown sequences (showing no BLASTX identification). From the LEE-1 cell line, 3660 unique transcripts were identified, and unknown, unique sequences totaled 1333. Gene Ontology (GO) annotation showed that GO assignments for the two cell lines were in general similar. These results suggest that the procedures used to derive the cell lines led to isolation of cell types of the same general embryonic origin from both species. The LEE-1 transcripts included GO categories "envelope" and "oxidoreductase activity" but the SAE transcripts did not. GO analysis of SAE transcripts identified the category "anatomical structure formation" that was not present in LEE-1 cells. Increased organelle compartments may exist within LEE-1 cells compared to SAE cells, and the higher oxidoreductase activity in LEE-1 cells may indicate a role for these cells in responses associated with innate immunity or in steroidogenesis. These EST libraries from elasmobranch cell lines provide information for assembly of genomic sequences and are useful in revealing gene diversity, new genes and molecular markers, as well as in providing means for elucidation of full-length cDNAs and probes for gene array analyses. This is the first study of this type with members of the Chondrichthyes.

  11. Internal epitope tagging informed by relative lack of sequence conservation

    PubMed Central

    Burg, Leonard; Zhang, Karen; Bonawitz, Tristan; Grajevskaja, Viktorija; Bellipanni, Gianfranco; Waring, Richard; Balciunas, Darius

    2016-01-01

    Many experimental techniques rely on specific recognition and stringent binding of proteins by antibodies. This can readily be achieved by introducing an epitope tag. We employed an approach that uses a relative lack of evolutionary conservation to inform epitope tag site selection, followed by integration of the tag-coding sequence into the endogenous locus in zebrafish. We demonstrate that an internal epitope tag is accessible for antibody binding, and that tagged proteins retain wild type function. PMID:27892520

  12. Generation and analysis of expressed sequence tags (ESTs) for marker development in yam (Dioscorea alata L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A total of 44,757 EST sequences, 1705 EST-SSR and 104 SNP markers were generated from the cDNA libraries of the resistant and susceptible genotypes. We have developed a comprehensive annotated transcriptome data set in yam to enrich the EST information in public databases. These EST resources prov...

  13. Analysis of expressed sequence tags and identification of genes encoding cell-wall-degrading enzymes from the fungivorous nematode Aphelenchus avenae

    PubMed Central

    2009-01-01

    Background The fungivorous nematode Aphelenchus avenae is widespread in soil and is found in association with decaying plant material. This nematode is also found in association with plants but its ability to cause plant disease remains largely undetermined. The taxonomic position and intermediate lifestyle of A. avenae make it an important model for studying the evolution of plant parasitism within the Nematoda. In addition, the exceptional capacity of this nematode to survive desiccation makes it an important system for the study of anhydrobiosis. Expressed sequence tag (EST) analysis may therefore be useful in providing an initial insight into the poorly understood genetic background of A. avenae. Results We present the generation, analysis and annotation of over 5,000 ESTs from a mixed-stage A. avenae cDNA library. Clustering of 5,076 high-quality ESTs resulted in a set of 2,700 non-redundant sequences comprising 695 contigs and 2,005 singletons. Comparative analyses indicated that 1,567 (58.0%) of the cluster sequences had homologues in Caenorhabditis elegans, 1,750 (64.8%) in other nematodes, 1,321 (48.9%) in organisms other than nematodes, and 862 (31.9%) had no significant match to any sequence in current protein or nucleotide databases. In addition, 1,100 (40.7%) of the sequences were functionally classified using Gene Ontology (GO) hierarchy. Similarity searches of the cluster sequences identified a set of genes with significant homology to genes encoding enzymes that degrade plant or fungal cell walls. The full length sequences of two genes encoding glycosyl hydrolase family 5 (GHF5) cellulases and two pectate lyase genes encoding polysaccharide lyase family 3 (PL3) proteins were identified and characterized. Conclusion We have described at least 2,214 putative genes from A. avenae and identified a set of genes encoding a range of cell-wall-degrading enzymes. This EST dataset represents a starting point for studies in a number of different fundamental and

  14. A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence tags.

    PubMed

    Tani, Naoki; Takahashi, Tomokazu; Iwata, Hiroyoshi; Mukai, Yuzuru; Ujino-Ihara, Tokuko; Matsumoto, Asako; Yoshimura, Kensuke; Yoshimaru, Hiroshi; Murai, Masafumi; Nagasaka, Kazutoshi; Tsumura, Yoshihiko

    2003-11-01

    A consensus map for sugi (Cryptomeria japonica) was constructed by integrating linkage data from two unrelated third-generation pedigrees, one derived from a full-sib cross and the other by self-pollination of F1 individuals. The progeny segregation data of the first pedigree were derived from cleaved amplified polymorphic sequences, microsatellites, restriction fragment length polymorphisms, and single nucleotide polymorphisms. The data of the second pedigree were derived from cleaved amplified polymorphic sequences, isozyme markers, morphological traits, random amplified polymorphic DNA markers, and restriction fragment length polymorphisms. Linkage analyses were done for the first pedigree with JoinMap 3.0, using its parameter set for progeny derived by cross-pollination, and for the second pedigree with the parameter set for progeny derived from selfing of F1 individuals. The 11 chromosomes of C. japonica are represented in the consensus map. A total of 438 markers were assigned to 11 large linkage groups, 1 small linkage group, and 1 nonintegrated linkage group from the second pedigree; their total length was 1372.2 cM. On average, the consensus map showed 1 marker every 3.0 cM. PCR-based codominant DNA markers such as cleaved amplified polymorphic sequences and microsatellite markers were distributed in all linkage groups and occupied about half of mapped loci. These markers are very useful for integration of different linkage maps, QTL mapping, and comparative mapping for evolutional study, especially for species with a large genome size such as conifers.

  15. Development of Microsatellite Markers Derived from Expressed Sequence Tags of Polyporales for Genetic Diversity Analysis of Endangered Polyporus umbellatus

    PubMed Central

    Zhang, Yuejin; Chen, Yuanyuan; Wang, Ruihong; Zeng, Ailin; Deyholos, Michael K.; Shu, Jia; Guo, Hongbo

    2015-01-01

    A large set of EST sequences of Polyporales was screened in this investigation in order to identify EST-SSR markers for various applications. The distribution of EST sequences and SSRs in five families of Polyporales was analyzed. Mononucleotide repeats were the most abundant type, followed by trinucleotide repeats. Among the five families, Ganodermataceae contained the most SSR markers, followed by Coriolaceae. Functional prediction of SSR marker-containing EST sequences in Ganoderma lucidum yielded three main groups, namely, cellular component, biological process, and molecular function. Thirty EST-SSR primers were designed to evaluate the genetic diversity of 13 natural Polyporus umbellatus accessions. Twenty-one EST-SSRs were polymorphic, with an average PIC value of 0.33 and a transferability rate of 71%. These 13 P. umbellatus accessions showed relatively high genetic diversity. The expected heterozygosity, Nei's gene diversity, and Shannon information index were 0.41, 0.39, and 0.57, respectively. Both the UPGMA dendrogram and principal coordinate analysis (PCA) gave the same clustering result, dividing the 13 accessions into three or four groups. PMID:26146636
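
    The per-locus statistics quoted in this abstract (PIC, expected heterozygosity, Shannon information index) are commonly computed from allele frequencies. The sketch below uses textbook definitions as an assumption, since the abstract does not state which formulas or software were used: He = 1 - sum(p_i^2) without small-sample correction, PIC as defined by Botstein et al. (1980), and the Shannon index with natural logarithms. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def allele_frequencies(alleles):
    """Return {allele: frequency} from a list of observed allele calls."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: c / total for allele, c in counts.items()}

def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content: He minus sum over i<j of 2*p_i^2*p_j^2."""
    freqs = list(freqs)
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross

def shannon_index(freqs):
    """Shannon information index I = -sum(p_i * ln p_i), skipping zero frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)
```

    For a locus scored across accessions, allele_frequencies(calls).values() can be passed to each of the three functions; averaging over loci gives summary values of the kind reported above.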

  16. Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

    PubMed Central

    Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

    2015-01-01

    Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular

  17. Unraveling new genes associated with seed development and metabolism in Bixa orellana L. by expressed sequence tag (EST) analysis.

    PubMed

    Soares, Virgínia L F; Rodrigues, Simone M; de Oliveira, Tahise M; de Queiroz, Talisson O; Lima, Lívia S; Hora-Júnior, Braz T; Gramacho, Karina P; Micheli, Fabienne; Cascardo, Júlio C M; Otoni, Wagner C; Gesteira, Abelmon S; Costa, Marcio G C

    2011-02-01

    The tropical tree Bixa orellana L. produces a range of secondary metabolites whose biochemical and molecular biosynthetic basis is not well understood. In this work we have characterized a set of ESTs from a non-normalized cDNA library of B. orellana seeds to obtain information about the main developmental and metabolic processes taking place in developing seeds and their associated genes. After sequencing a set of randomly selected clones, most of the sequences were assigned putative functions based on similarity, GO annotations and protein domains. The most abundant transcripts encoded proteins associated with cell wall (prolyl 4-hydroxylase), fatty acid (acyl carrier protein), and hormone/flavonoid (2OG-Fe oxygenase) synthesis, germination (MADS FLC-like protein) and embryo development (AP2/ERF transcription factor) regulation, photosynthesis (chlorophyll a-b binding protein), cell elongation (MAP65-1a), and stress responses (metallothionein- and thaumatin-like proteins). Enzymes were assigned to 16 different metabolic pathways related to both primary and secondary metabolisms. Characterization of two candidate genes of the bixin biosynthetic pathway, BoCCD and BoOMT, showed that they belong, respectively, to the carotenoid-cleavage dioxygenase 4 (CCD4) and caffeic acid O-methyltransferase (COMT) families, and are up-regulated during seed development. This indicates their involvement in the synthesis of this commercially important carotenoid pigment in seeds of B. orellana. Most of the genes identified here are the first representatives of their gene families in B. orellana.

  18. Expressed sequence tag analysis and development of gene associated markers in a near-isogenic plant system of Eragrostis curvula.

    PubMed

    Cervigni, Gerardo D L; Paniego, Norma; Díaz, Marina; Selva, Juan P; Zappacosta, Diego; Zanazzi, Darío; Landerreche, Iñaki; Martelotto, Luciano; Felitti, Silvina; Pessino, Silvina; Spangenberg, Germán; Echenique, Viviana

    2008-05-01

    Eragrostis curvula (Schrad.) Nees is a forage grass native to the semiarid regions of Southern Africa, which reproduces mainly by pseudogamous diplosporous apomixis. A collection of ESTs was generated from four cDNA libraries, three of them obtained from panicles of near-isogenic lines with different ploidy levels and reproductive modes, and one obtained from leaves of 12-day-old plants. A total of 12,295 high-quality ESTs were clustered and assembled, rendering 8,884 unigenes, including 1,490 contigs and 7,394 singletons, with a genome coverage of 22%. A total of 7,029 (79.11%) unigenes were functionally categorized by BLASTX analysis against sequences deposited in public databases, but only 37.80% could be classified according to Gene Ontology. Sequence comparison against the cereal gene indices (GI) revealed 50% significant hits. A total of 254 EST-SSRs were detected, 219 from singletons and 35 from contigs. Di- and trinucleotide motifs were similarly represented with percentages of 38.95 and 40.16%, respectively. In addition, 190 SNPs and Indels were detected in 18 contigs generated from 3 to 4 libraries. The ESTs and the molecular markers obtained in this study will provide valuable resources for a wide range of applications including gene identification, genetic mapping, cultivar identification, analysis of genetic diversity, phenotype mapping and marker assisted selection.
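
    The abstract above reports counts of di- and trinucleotide EST-SSRs but does not say which tool or thresholds were used to find them. As a rough illustration only, perfect SSRs can be located with a back-referencing regular expression. The function name `find_ssrs` and the minimum repeat counts (6 for dinucleotides, 5 for trinucleotides) are assumptions made for this sketch, not the study's criteria.

```python
import re

# Minimum number of motif copies per motif length.
# Illustrative thresholds only; the study's actual criteria are not stated.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect di-/trinucleotide repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands n or more copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # a run such as AAAAAA is a homopolymer, not a di/tri SSR
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

    Dedicated SSR-mining programs also handle imperfect and compound repeats, which this sketch deliberately ignores.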

  19. Identification of salt-induced genes from Salicornia brachiata, an extreme halophyte through expressed sequence tags analysis.

    PubMed

    Jha, Bhavanath; Agarwal, Pradeep K; Reddy, Palakolanu Sudhakar; Lal, Sanjay; Sopory, Sudhir K; Reddy, Malireddy K

    2009-04-01

    Salinity severely affects plant growth and development, causing crop loss worldwide. We have isolated a large number of salt-induced genes as well as unknown and hypothetical genes from Salicornia brachiata Roxb. (Amaranthaceae). This is the first description of identification of genes in response to salinity stress in this extreme halophyte plant. Salicornia accumulates salt in its pith and survives even at 2 M NaCl under field conditions. For isolating salt responsive genes, cDNA subtractive hybridization was performed between control and 500 mM NaCl treated plants. Out of the 1200 recombinant clones, 930 sequences were submitted to the NCBI database (GenBank accession: EB484528 to EB485289 and EC906125 to EC906292). A total of 789 ESTs matched different genes in the NCBI database. Of the ESTs, 4.8% belonged to the stress-tolerant gene category and approximately 29% showed no homology with known functional gene sequences and were thus classified as unknown or hypothetical. The detection of a large number of ESTs with unknown putative function in this species makes it an interesting contribution. The 90 unknown and hypothetical genes were selected to study their differential regulation by reverse Northern analysis for identifying their role in salinity tolerance. Interestingly, both up and down regulation at 500 mM NaCl were observed (21 and 10 genes, respectively). Northern analysis of two important salt tolerant genes, ASR1 (Abscisic acid stress ripening gene) and plasma membrane H+ATPase, showed the basal level of transcripts in control condition and an increase with NaCl treatment. The full-length ASR1 gene was obtained using 5' RACE, and its potential role in imparting salt tolerance is being studied.

  20. Comparative analysis of secreted protein evolution using expressed sequence tags from four poplar leaf rusts (Melampsora spp.)

    PubMed Central

    2010-01-01

    Background Obligate biotrophs such as rust fungi are believed to establish long-term relationships by modulating plant defenses through a plethora of effector proteins, whose most recognizable feature is the presence of a signal peptide for secretion. Since the phenotypes of these effectors extend to host cells, their genes are expected to be under accelerated evolution stimulated by host-pathogen coevolutionary arms races. Recently, whole genome sequence data has allowed the prediction of secretomes, facilitating the identification of putative effectors. Results We generated cDNA libraries from four poplar leaf rust pathogens (Melampsora spp.) and used computational approaches to identify and annotate putative secreted proteins with the aim of uncovering new knowledge about the nature and evolution of the rust secretome. While more than half of the predicted secretome members encoded lineage-specific proteins, similarities with experimentally characterized fungal effectors were also identified. A SAGE analysis indicated a strong stage-specific regulation of transcripts encoding secreted proteins. The average sequence identity of putative secreted proteins to their closest orthologs in the wheat stem rust Puccinia graminis f. sp. tritici was dramatically reduced compared with non-secreted ones. A comparative genomics approach based on homologous gene groups unravelled positive selection in putative members of the secretome. Conclusion We uncovered robust evidence that different evolutionary constraints are acting on the rust secretome when compared to the rest of the genome. These results are consistent with the view that these genes are more likely to exhibit an effector activity and be involved in coevolutionary arms races with host factors. PMID:20615251

  1. DNA Sequence-Based “Bar Codes” for Tracking the Origins of Expressed Sequence Tags from a Maize cDNA Library Constructed Using Multiple mRNA Sources

    PubMed Central

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A.; Schnable, Patrick S.

    2003-01-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence “bar codes” were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects. PMID:14555776
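
    The abstract above mentions a decoding algorithm with error correction but gives no details of it. One common way to build such a decoder, shown here purely as an assumed stand-in and not as the authors' method, is to assign each observed tag to the known bar code at the smallest Hamming distance and to reject tags that are too distant or ambiguous. The bar code sequences, the pool names in `EXAMPLE_BARCODES` and the one-mismatch tolerance are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def decode_barcode(read_tag, barcodes, max_mismatches=1):
    """Return the pool whose bar code is uniquely closest to read_tag, else None."""
    scored = sorted(
        (hamming(read_tag, code), pool)
        for pool, code in barcodes.items()
        if len(code) == len(read_tag)
    )
    if not scored:
        return None
    best_dist, best_pool = scored[0]
    if best_dist > max_mismatches:
        return None  # too many sequencing errors to trust the assignment
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # two pools are equally close, so the tag is ambiguous
    return best_pool


# Hypothetical bar codes; the sequences used in the study are not given here.
EXAMPLE_BARCODES = {
    "kernel": "ACGTAC",
    "anaerobic_seedling": "TGCATG",
    "root": "GATCCA",
}

if __name__ == "__main__":
    print(decode_barcode("ACGTAA", EXAMPLE_BARCODES))  # one mismatch from "kernel"
```

    Tallying the decoded pools across the ESTs in a contig then gives the per-pool frequencies that the abstract relates to mRNA abundance.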

  2. A comprehensive expressed sequence tag linkage map for tiger salamander and Mexican axolotl: enabling gene mapping and comparative genomics in Ambystoma.

    PubMed

    Smith, J J; Kump, D K; Walker, J A; Parichy, D M; Voss, S R

    2005-11-01

    Expressed sequence tag (EST) markers were developed for Ambystoma tigrinum tigrinum (Eastern tiger salamander) and for A. mexicanum (Mexican axolotl) to generate the first comprehensive linkage map for these model amphibians. We identified 14 large linkage groups (125.5-836.7 cM) that presumably correspond to the 14 haploid chromosomes in the Ambystoma genome. The extent of genome coverage for these linkage groups is apparently high because the total map size (5251 cM) falls within the range of theoretical estimates and is consistent with independent empirical estimates. Unlike most vertebrate species, linkage map size in Ambystoma is not strongly correlated with chromosome arm number. Presumably, the large physical genome size (approximately 30 Gbp) is a major determinant of map size in Ambystoma. To demonstrate the utility of this resource, we mapped the position of two historically significant A. mexicanum mutants, white and melanoid, and also met, a quantitative trait locus (QTL) that contributes to variation in metamorphic timing. This new collection of EST-based PCR markers will better enable the Ambystoma system by facilitating development of new molecular probes, and the linkage map will allow comparative studies of this important vertebrate group.

  3. Analysis of expressed sequence tags (ESTs) and gene expression changes under different growth conditions for the ciliate Anophryoides haemophila, the causative agent of bumper car disease in the American lobster (Homarus americanus).

    PubMed

    Acorn, Adam R; Clark, K Fraser; Jones, Sarah; Després, Béatrice M; Munro, Sarah; Cawthorn, Richard J; Greenwood, Spencer J

    2011-06-01

    The scuticociliate Anophryoides haemophila causes bumper car disease in American lobster (Homarus americanus) in commercial holding facilities in Atlantic Canada. While the parasite has been recognized since the 1970s and much has been learned about its biology, minimal molecular characterization exists. With genome consortiums turning to model organisms like the ciliates Tetrahymena and Paramecium, the amount of relevant sequence data available has made sequence surveys more attractive for gene discovery in related ciliates. We sequenced 9984 expressed sequence tags (ESTs) from a non-normalized A. haemophila cDNA library to characterize gene expression patterns, functional gene distribution and to discover novel genes related to the parasitic life history. The A. haemophila ESTs were grouped into 843 clusters and singletons with 658 EST clusters having identifiable homologs, while 159 ESTs were unique and had no similarity to any sequences in the public databases. Not unexpectedly, about 67% of the A. haemophila ESTs have similarity to annotated and hypothetical genes from the related oligohymenophorean ciliate, Tetrahymena. Numerous cysteine proteases, hypothetical proteins and novel sequences possess putative secretory signal peptides suggesting that they may contribute to the pathogenesis of bumper car disease in lobster. Real time RT-qPCR analysis of cathepsin L and two homologs of cathepsin B did not show any changes in gene expression under varying in vitro growth conditions or during a modified-in vivo infection which may be suggestive of the opportunistic life history strategy of this ciliate.

  4. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-10-19

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes were linked to physiological processes that can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one piper germplasm. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.

  5. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  6. Identification and characterisation of functional expressed sequence tags-derived simple sequence repeat (eSSR) markers for genetic linkage mapping of Schistosoma mansoni juvenile resistance and susceptibility loci in Biomphalaria glabrata

    PubMed Central

    Ittiprasert, Wannaporn; Miller, André; Su, Xin-zhuan; Mu, Jianbing; Bhusudsawang, Ganlayarat; Ukoskit, Kitipat; Knight, Matty

    2013-01-01

    Biomphalaria glabrata susceptibility to Schistosoma mansoni has a strong genetic component, offering the possibility for investigating host–parasite interactions at the molecular level, perhaps leading to novel control approaches. The identification, mapping and molecular characterisation of genes that influence the outcome of parasitic infection in the intermediate snail host is, therefore, seen as fundamental to the control of schistosomiasis. To better understand the evolutionary processes driving disease resistance/susceptibility phenotypes, we previously identified polymorphic random amplification of polymorphic DNA and genomic simple sequence repeats from B. glabrata. In the present study we identified and characterised polymorphic expressed simple sequence repeat markers (Bg-eSSR) from existing B. glabrata expressed sequence tags. Using these markers, and with previously identified genomic simple sequence repeats, genetic linkage mapping for parasite refractory and susceptibility phenotypes, the first known for B. glabrata, was initiated. Data mining of 54,309 expressed sequence tags produced 660 expressed simple sequence repeats, of which dinucleotide motifs (TA)n were the most common (37.88%), followed by trinucleotide (29.55%), mononucleotide (18.64%) and tetranucleotide (10.15%). Penta- and hexanucleotide motifs represented <4% of the Bg-eSSRs identified. While the majority (71%) of Bg-eSSRs were monomorphic between resistant and susceptible snails, several were, however, useful for the construction of a genetic linkage map based on their inheritance in segregating F2 progeny snails derived from crossing juvenile BS-90 and NMRI snails. Polymorphic Bg-eSSRs assorted into six linkage groups at a logarithm of odds score of 3. Interestingly, the heritability of four markers (Prim1_910, Prim1_771, Prim6_1024 and Prim7_823) with juvenile snail resistance was, by t-test, significant (P < 0.05) while an allelic marker, Prim24_524, showed linkage with the

  7. Ginger and turmeric expressed sequence tags identify signature genes for rhizome identity and development and the biosynthesis of curcuminoids, gingerols and terpenoids

    PubMed Central

    2013-01-01

    Background Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. Results In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. Conclusion A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. Specific

  8. QTL analysis of photoperiod sensitivity in common buckwheat by using markers for expressed sequence tags and photoperiod-sensitivity candidate genes

    PubMed Central

    Hara, Takashi; Iwata, Hiroyoshi; Okuno, Kazutoshi; Matsui, Katsuhiro; Ohsawa, Ryo

    2011-01-01

    Photoperiod sensitivity is an important trait related to crop adaptation and ecological breeding in common buckwheat (Fagopyrum esculentum Moench). Although photoperiod sensitivity in this species is thought to be controlled by quantitative trait loci (QTLs), no genes or regions related to photoperiod sensitivity had been identified until now. Here, we identified QTLs controlling photoperiod sensitivity by QTL analysis in a segregating F4 population (n = 100) derived from a cross of two autogamous lines, 02AL113(Kyukei SC2)LH.self and C0408-0 RP. The F4 progenies were genotyped with three markers for photoperiod-sensitivity candidate genes, which were identified based on homology to photoperiod-sensitivity genes in Arabidopsis and 76 expressed sequence tag markers. Among the three photoperiod-sensitivity candidate genes (FeCCA1, FeELF3 and FeCOL3) identified in common buckwheat, FeELF3 was associated with photoperiod sensitivity. Two EST regions, Fest_L0606_4 and Fest_L0337_6, were associated with photoperiod sensitivity and explained 20.0% and 14.2% of the phenotypic variation, respectively. For both EST regions, the allele from 02AL113(Kyukei SC2)LH.self led to early flowering. An epistatic interaction was also confirmed between Fest_L0606_4 and Fest_L0337_6. These results demonstrate that photoperiod sensitivity in common buckwheat is controlled by a pathway consisting of photoperiod-sensitivity candidate genes as well as multiple gene action. PMID:23136477

  9. Expression of the Arabidopsis transposable element Tag1 is targeted to developing gametophytes.

    PubMed Central

    Galli, Mary; Theriault, Angie; Liu, Dong; Crawford, Nigel M

    2003-01-01

    The Arabidopsis transposon Tag1 undergoes late excision during vegetative and germinal development in plants containing 35S-Tag1-GUS constructs. To determine if transcriptional regulation can account for the developmental control of Tag1 excision, the transcriptional activity of Tag1 promoter-GUS fusion constructs of various lengths was examined in transgenic plants. All constructs showed expression in the reproductive organs of developing flowers but no expression in leaves. Expression was restricted to developing gametophytes in both male and female lineages. Quantitative RT-PCR analysis confirmed that Tag1 expression predominates in the reproductive organs of flower buds. These results are consistent with late germinal excision of Tag1, but they cannot explain the vegetative excision activity of Tag1 observed with 35S-Tag1-GUS constructs. To resolve this issue, Tag1 excision was reexamined using elements with no adjacent 35S promoter sequences. Tag1 excision in this context is restricted to germinal events with no detectable vegetative excision. If a 35S enhancer sequence is placed next to Tag1, vegetative excision is restored. These results indicate that the intrinsic activity of Tag1 is restricted to germinal excision due to targeted expression of the Tag1 transposase to developing gametophytes and that this activity is altered by the presence of adjacent enhancers or promoters. PMID:14704189

  10. An Expressed Sequence Tag (EST)-enriched genetic map of turbot (Scophthalmus maximus): a useful framework for comparative genomics across model and farmed teleosts

    PubMed Central

    2012-01-01

    Background The turbot (Scophthalmus maximus) is a relevant species in European aquaculture. The small turbot genome provides a source for genomics strategies to use in order to understand the genetic basis of productive traits, particularly those related to sex, growth and pathogen resistance. Genetic maps represent essential genomic screening tools that make it possible to localize quantitative trait loci (QTL) and to identify candidate genes through comparative mapping. This information is the backbone to develop marker-assisted selection (MAS) programs in aquaculture. Expressed sequence tag (EST) resources have largely increased in turbot, thus supplying numerous type I markers suitable for extending the previous linkage map, which was mostly based on anonymous loci. The aim of this study was to construct a higher-resolution turbot genetic map using EST-linked markers, which will turn out to be useful for comparative mapping studies. Results A consensus gene-enriched genetic map of the turbot was constructed using 463 SNP and microsatellite markers in nine reference families. This map contains 438 markers, 180 EST-linked, clustered in 24 linkage groups. Linkage and comparative genomics evidence suggested additional linkage group fusions toward the consolidation of turbot map according to karyotype information. The linkage map showed a total length of 1402.7 cM with low average intermarker distance (3.7 cM; ~2 Mb). A global 1.6:1 female-to-male recombination frequency (RF) ratio was observed, although largely variable among linkage groups and chromosome regions. Comparative sequence analysis revealed large macrosyntenic patterns against model teleost genomes, significant hits decreasing from stickleback (54%) to zebrafish (20%). Comparative mapping supported particular chromosome rearrangements within Acanthopterygii and helped to assign unallocated markers to specific turbot linkage groups. Conclusions The new gene-enriched high-resolution turbot map represents a

  11. A maize map standard with sequenced core markers, grass genome reference points and 932 expressed sequence tagged sites (ESTs) in a 1736-locus map.

    PubMed Central

    Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H

    1999-01-01

    We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits. PMID:10388831

  12. A maize map standard with sequenced core markers, grass genome reference points and 932 expressed sequence tagged sites (ESTs) in a 1736-locus map.

    PubMed

    Davis, G L; McMullen, M D; Baysdorfer, C; Musket, T; Grant, D; Staebell, M; Xu, G; Polacco, M; Koster, L; Melia-Hancock, S; Houchins, K; Chao, S; Coe, E H

    1999-07-01

    We have constructed a 1736-locus maize genome map containing 1156 loci probed by cDNAs, 545 probed by random genomic clones, 16 by simple sequence repeats (SSRs), 14 by isozymes, and 5 by anonymous clones. Sequence information is available for 56% of the loci with 66% of the sequenced loci assigned functions. A total of 596 new ESTs were mapped from a B73 library of 5-wk-old shoots. The map contains 237 loci probed by barley, oat, wheat, rice, or tripsacum clones, which serve as grass genome reference points in comparisons between maize and other grass maps. Ninety core markers selected for low copy number, high polymorphism, and even spacing along the chromosome delineate the 100 bins on the map. The average bin size is 17 cM. Use of bin assignments enables comparison among different maize mapping populations and experiments including those involving cytogenetic stocks, mutants, or quantitative trait loci. Integration of nonmaize markers in the map extends the resources available for gene discovery beyond the boundaries of maize mapping information into the expanse of map, sequence, and phenotype information from other grass species. This map provides a foundation for numerous basic and applied investigations including studies of gene organization, gene and genome evolution, targeted cloning, and dissection of complex traits.

  13. DNA methylation mapping by tag-modified bisulfite genomic sequencing.

    PubMed

    Han, Weiguo; Cauchi, Stephane; Herman, James G; Spivack, Simon D

    2006-08-01

    A tag-modified bisulfite genomic sequencing (tBGS) method employing direct cycle sequencing of polymerase chain reaction (PCR) products at kilobase scale, without conventional DNA fragment cloning, was developed for simplified evaluation of DNA methylation sites. The method entails subjecting bisulfite-modified genomic DNA to a second-round PCR amplification employing GC-tagged primers. Qualitative results from tBGS closely correlated with those from conventional BGS (R=0.935, p=0.002). In application, the intertissue and interindividual CpG methylation differences in promoter sequence for two genes, CYP1B1 and GSTP1, were then explored across four human tissue types (peripheral blood cells, exfoliated buccal cells, paired nontumor-tumor lung tissues), and two lung cell types in culture (normal NHBE and malignant A549). Predominantly conserved methylation maps for the two gene promoters were apparent across donors and tissues. At any given CpG site, variation in the degree of methylation could be determined by the relative height of C and T peaks in the sequencing trace. Methylation maps for the GSTP1 promoter diverged between NHBE (unmethylated) and A549 (completely methylated) cells in a previously unexplored upstream region, correlating with a 2.7-fold difference in GSTP1 mRNA expression (p<0.01). The tBGS method simplifies detailed methylation scanning of kilobase-scale genomic DNA, facilitating more ambitious genomic methylation mapping studies.
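
    The abstract above states that the degree of methylation at a CpG site can be read from the relative height of the C and T peaks in the sequencing trace. Bisulfite treatment converts unmethylated cytosine to uracil, which is read as T after PCR, while methylated cytosine remains C. A frequently used estimate is therefore C / (C + T). Whether the authors used exactly this ratio is not stated, so the formula and the function name below are assumptions for illustration.

```python
def methylation_fraction(c_peak, t_peak):
    """Estimate the methylated fraction at one CpG site from trace peak heights.

    Assumes the simple ratio C / (C + T); the source does not give a formula.
    """
    total = c_peak + t_peak
    if total <= 0:
        raise ValueError("no C or T signal at this position")
    return c_peak / total


if __name__ == "__main__":
    # Hypothetical peak heights in arbitrary trace units.
    print(methylation_fraction(300, 100))
```

    A fully unmethylated site gives 0.0 and a completely methylated site gives 1.0 under this estimate.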

  14. Discovery and mapping of a new expressed sequence tag-single nucleotide polymorphism and simple sequence repeat panel for large-scale genetic studies and breeding of Theobroma cacao L.

    PubMed

    Allegre, Mathilde; Argout, Xavier; Boccara, Michel; Fouet, Olivier; Roguet, Yolande; Bérard, Aurélie; Thévenin, Jean Marc; Chauveau, Aurélie; Rivallan, Ronan; Clement, Didier; Courtois, Brigitte; Gramacho, Karina; Boland-Augé, Anne; Tahi, Mathias; Umaharan, Pathmanathan; Brunel, Dominique; Lanaud, Claire

    2012-01-01

    Theobroma cacao is an economically important tree of several tropical countries. Its genetic improvement is essential to provide protection against major diseases and improve chocolate quality. We discovered and mapped new expressed sequence tag-single nucleotide polymorphism (EST-SNP) and simple sequence repeat (SSR) markers and constructed a high-density genetic map. By screening 149 650 ESTs, 5246 SNPs were detected in silico, of which 1536 corresponded to genes with a putative function, while 851 had a clear polymorphic pattern across a collection of genetic resources. In addition, 409 new SSR markers were detected on the Criollo genome. Lastly, 681 new EST-SNPs and 163 new SSRs were added to the pre-existing 418 co-dominant markers to construct a large consensus genetic map. This high-density map and the set of new genetic markers identified in this study are a milestone in cocoa genomics and for marker-assisted breeding. The data are available at http://tropgenedb.cirad.fr.

  15. Accumulation, functional annotation, and comparative analysis of expressed sequence tags in eggplant (Solanum melongena L.), the third pole of the genus Solanum species after tomato and potato.

    PubMed

    Fukuoka, Hiroyuki; Yamaguchi, Hirotaka; Nunome, Tsukasa; Negoro, Satomi; Miyatake, Koji; Ohyama, Akio

    2010-01-15

    Eggplant (Solanum melongena L.) is a widely grown vegetable crop that belongs to the genus Solanum, which is comprised of more than 1000 species of wide genetic and phenotypic variation. Unlike tomato and potato, Solanum crops that belong to subgenus Potatoe and have been targets for comprehensive genomic studies, eggplant is endemic to the Old World and belongs to a different subgenus, Leptostemonum, and therefore, would be a unique member for comparative molecular biology in Solanum. In this study, more than 60,000 eggplant cDNA clones from various tissues and treatments were sequenced from both the 5'- and 3'-ends, and a unigene set consisting of 16,245 unique sequences was constructed. Functional annotations based on sequence similarity to known plant reference datasets revealed a distribution of functional categories almost similar to that of tomato, while 1316 unigenes were suggested to be eggplant-specific. Sequence-based comparative analysis using putative orthologous gene groups setup by reciprocal sequence comparison among six solanaceous species suggested that eggplant and its wild ally Solanum torvum were clustered separately from subgenus Potatoe species, and then, all Solanum species were clustered separately from the genus Capsicum. Microsatellite motif distribution was different among species and likely to be coincident with the phylogenetic relationships. Furthermore, the eggplant unigene dataset exhibited its utility in transcriptome analysis by the SAGE strategy where a considerable number of short tag sequences of interest were successfully assigned to unigenes and their functional annotations. The eggplant ESTs and 16k unigene set developed in this study would be a useful resource not only for molecular genetics and breeding in eggplant itself, but for expanding the scope of comparative biology in Solanum species.

  16. Protein identification with N and C-terminal sequence tags in proteome projects.

    PubMed

    Wilkins, M R; Gasteiger, E; Tonella, L; Ou, K; Tyler, M; Sanchez, J C; Gooley, A A; Walsh, B J; Bairoch, A; Appel, R D; Williams, K L; Hochstrasser, D F

    1998-05-08

    Genome sequences are available for increasing numbers of organisms. The proteomes (protein complement expressed by the genome) of many such organisms are being studied with two-dimensional (2D) gel electrophoresis. Here we have investigated the application of short N-terminal and C-terminal sequence tags to the identification of proteins separated on 2D gels. The theoretical N and C termini of 15,519 proteins, representing all SWISS-PROT entries for the organisms Mycoplasma genitalium, Bacillus subtilis, Escherichia coli, Saccharomyces cerevisiae and human, were analysed. Sequence tags were found to be surprisingly specific, with N-terminal tags of four amino acid residues found to be unique for between 43% and 83% of proteins, and C-terminal tags of four amino acid residues unique for between 74% and 97% of proteins, depending on the species studied. Sequence tags of five amino acid residues were found to be even more specific. To utilise this specificity of sequence tags for protein identification, we created a world-wide web-accessible protein identification program, TagIdent (http://www.expasy.ch/www/tools.html), which matches sequence tags of up to six amino acid residues as well as estimated protein pI and mass against proteins in the SWISS-PROT database. We demonstrate the utility of this identification approach with sequence tags generated from 91 different E. coli proteins purified by 2D gel electrophoresis. Fifty-one proteins were unambiguously identified by virtue of their sequence tags and estimated pI and mass, and a further 11 proteins identified when sequence tags were combined with protein amino acid composition data. We conclude that the TagIdent identification approach is best suited to the identification of proteins from prokaryotes whose complete genome sequences are available. The approach is less well suited to proteins from eukaryotes, as many eukaryotic proteins are not amenable to sequencing via Edman degradation, and tag protein
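    The tag-specificity figures in this abstract come from counting how many proteins carry a terminal tag that no other protein in the set shares. A minimal sketch of that count is given below; it is not the TagIdent implementation, and the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def unique_tag_fraction(proteins, tag_len=4, terminus="N"):
    """Fraction of proteins whose terminal tag occurs exactly once.

    proteins: iterable of amino acid sequences (strings).
    terminus: "N" for the first tag_len residues, "C" for the last.
    Sequences shorter than tag_len are skipped.
    """
    if terminus not in ("N", "C"):
        raise ValueError("terminus must be 'N' or 'C'")
    tags = [
        seq[:tag_len] if terminus == "N" else seq[-tag_len:]
        for seq in proteins
        if len(seq) >= tag_len
    ]
    if not tags:
        return 0.0
    counts = Counter(tags)
    return sum(1 for tag in tags if counts[tag] == 1) / len(tags)


# Toy data (made up): two sequences share the N-terminal tag "MKTA".
toy = ["MKTAYIAK", "MKTALLVG", "MSERQWPL", "MAAAKKKL"]
print(unique_tag_fraction(toy, terminus="N"))
print(unique_tag_fraction(toy, terminus="C"))
```

    Increasing tag_len to 5 in such a count would be expected to raise the unique fraction, in line with the abstract's observation that five-residue tags are more specific.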

  17. Analysis of expressed sequence tags (ESTs) from avocado seed (Persea americana var. drymifolia) reveals abundant expression of the gene encoding the antimicrobial peptide snakin.

    PubMed

    Guzmán-Rodríguez, Jaquelina J; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis; Ochoa-Zarzosa, Alejandra; Suárez-Rodríguez, Luis María; Rodríguez-Zapata, Luis C; Salgado-Garciglia, Rafael; Jimenez-Moraila, Beatriz; López-Meza, Joel E; López-Gómez, Rodolfo

    2013-09-01

    Avocado is one of the most important fruits in the world. Avocado "native mexicano" (Persea americana var. drymifolia) seeds are widely used in the propagation of this plant and are the primary source of rootstocks globally for a variety of avocado cultivars, such as the Hass avocado. Here, we report the isolation of 5005 ESTs from the 5' ends of P. americana var. drymifolia seed cDNA clones representing 1584 possible unigenes. These avocado seed ESTs were compared with the avocado flower EST library, and we detected several genes that are expressed either in both tissues or only in the seed. The snakin gene, which encodes an element of the innate immune response in plants, was one of those most frequently found among the seed ESTs, and this suggests that it is abundantly expressed in the avocado seed. We expressed the snakin gene in a heterologous system, namely the bovine endothelial cell line BVE-E6E7. Conditioned media from transfected BVE-E6E7 cells showed antimicrobial activity against strains of Escherichia coli and Staphylococcus aureus. This is the first study of the function of the snakin gene in plant seed tissue, and our observations suggest that this gene might play a protective role in the avocado seed.

  18. Development of peanut expressed sequence tag-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  19. Identification of the immune expressed sequence tags of pearl oyster (Pinctada martensii, Dunker 1850) responding to Vibrio alginolyticus challenge by suppression subtractive hybridization.

    PubMed

    Wang, Yanhong; Fu, Dingkun; Luo, Peng; He, Xiaocui

    2012-09-01

    One hemolymph subtracted cDNA library of pearl oyster (Pinctada martensii, Dunker 1837) was constructed using the suppression subtractive hybridization (SSH) in response to Vibrio alginolyticus. A total of 1089 clones were sequenced. All the consensuses were recognized based on the BLAST searches in NCBI, and revealed that 376 (58%) of them had no significant matches to reported sequences in the database. 267 ESTs were in significant matches after homologous sequence searches. Hypothesized genes inferred from EST sequences were categorized into six groups according to their putative biological functions: replication, transcription and translation; cellular processes; responded to stimuli; metabolism and biosynthesis; signal transduction genes; "other" category. The five genes, pearlin gene promoter PGPPm, serine/threonine kinase STKPm, limbic system-associated membrane protein LSAMPPm, nacrein gene intron 6 NGIPm6 and ferritin-like protein FLPPm, were analyzed using real-time PCR. All these genes were significantly expressed after V. alginolyticus challenge.

  20. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bps were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coded for the TAT-Ferritin fusion protein with two 6× His-tags in N and C-terminal. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library attained to the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  1. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  2. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  3. Construction of an Ostrea edulis database from genomic and expressed sequence tags (ESTs) obtained from Bonamia ostreae infected haemocytes: Development of an immune-enriched oligo-microarray.

    PubMed

    Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino

    2016-12-01

    The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economical and ecological importance of flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in

  4. Arabidopsis genes involved in acyl lipid metabolism. A 2003 census of the candidates, a study of the distribution of expressed sequence tags in organs, and a web-based database.

    PubMed

    Beisson, Frédéric; Koo, Abraham J K; Ruuska, Sari; Schwender, Jörg; Pollard, Mike; Thelen, Jay J; Paddock, Troy; Salas, Joaquín J; Savage, Linda; Milcamps, Anne; Mhaske, Vandana B; Cho, Younghee; Ohlrogge, John B

    2003-06-01

    The genome of Arabidopsis has been searched for sequences of genes involved in acyl lipid metabolism. Over 600 encoded proteins have been identified, cataloged, and classified according to predicted function, subcellular location, and alternative splicing. At least one-third of these proteins were previously annotated as "unknown function" or with functions unrelated to acyl lipid metabolism; therefore, this study has improved the annotation of over 200 genes. In particular, annotation of the lipolytic enzyme group (at least 110 members total) has been improved by the critical examination of the biochemical literature and the sequences of the numerous proteins annotated as "lipases." In addition, expressed sequence tag (EST) data have been surveyed, and more than 3,700 ESTs associated with the genes were cataloged. Statistical analysis of the number of ESTs associated with specific cDNA libraries has allowed calculation of probabilities of differential expression between different organs. More than 130 genes have been identified with a statistical probability > 0.95 of preferential expression in seed, leaf, root, or flower. All the data are available as a Web-based database, the Arabidopsis Lipid Gene database (http://www.plantbiology.msu.edu/lipids/genesurvey/index.htm). The combination of the data of the Lipid Gene Catalog and the EST analysis can be used to gain insights into differential expression of gene family members and sets of pathway-specific genes, which in turn will guide studies to understand specific functions of individual genes.

  5. SAGETTARIUS: a program to reduce the number of tags mapped to multiple transcripts and to plan SAGE sequencing stages.

    PubMed

    Bianchetti, Laurent; Wu, Yan; Guerin, Eric; Plewniak, Frédéric; Poch, Olivier

    2007-01-01

    SAGE (Serial Analysis of Gene Expression) experiments generate short nucleotide sequences called 'tags' which are assumed to map unambiguously to their original transcripts (1 tag to 1 transcript mapping). Nevertheless, many tags are generated that do not map to any transcript or map to multiple transcripts. Current bioinformatics resources, such as SAGEmap and TAGmapper, have focused on reducing the number of unmapped tags. Here, we describe SAGETTARIUS, a new high-throughput program that performs successive precise Nla3 and Sau3A tag to transcript mapping, based on specifically designed Virtual Tag (VT) libraries. First, SAGETTARIUS decreases the number of tags mapped to multiple transcripts. Among the various mapping resources compared, SAGETTARIUS performed the best in this respect by decreasing up to 11% the number of multiply mapped tags. Second, SAGETTARIUS allows the establishment of a guideline for SAGE experiment sequencing efforts through efficient mapping of the CRT (Cytoplasmic Ribosomal protein Transcripts)-specific tags. Using all publicly available human and mouse Nla3 SAGE experiments, we show that sequencing 100,000 tags is sufficient to map almost all CRT-specific tags and that four sequencing stages can be identified when carrying out a human or mouse SAGE project. SAGETTARIUS is web interfaced and freely accessible to academic users.
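    The tag-to-transcript ambiguity discussed in this abstract can be sketched with a toy virtual tag library. The sketch assumes the usual SAGE convention that a tag is the 10 bases immediately downstream of the 3'-most anchoring-enzyme site (CATG for Nla3, GATC for Sau3A). It is not the SAGETTARIUS implementation; all function names and the example transcripts are hypothetical.

```python
from collections import defaultdict

# Recognition sites of the two anchoring enzymes named in the abstract.
ANCHOR_SITES = {"Nla3": "CATG", "Sau3A": "GATC"}


def virtual_tag(transcript, site, tag_len=10):
    """Return the tag downstream of the 3'-most site, or None.

    None is returned when the site is absent or fewer than tag_len
    bases follow it (a simplification made for this sketch).
    """
    pos = transcript.rfind(site)
    if pos == -1:
        return None
    start = pos + len(site)
    tag = transcript[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def build_tag_library(transcripts, enzyme="Nla3", tag_len=10):
    """Map each virtual tag to the set of transcript names producing it."""
    site = ANCHOR_SITES[enzyme]
    library = defaultdict(set)
    for name, seq in transcripts.items():
        tag = virtual_tag(seq.upper(), site, tag_len)
        if tag is not None:
            library[tag].add(name)
    return dict(library)


def ambiguous_tags(library):
    """Tags that map to more than one transcript."""
    return {tag: names for tag, names in library.items() if len(names) > 1}


# Made-up transcripts: t1 and t2 yield the same Nla3 tag.
toy = {
    "t1": "AAACATGACGTACGTAAGG",
    "t2": "TTTTCATGACGTACGTAACC",
    "t3": "GGCATGTTTTTTTTTTAA",
}
print(ambiguous_tags(build_tag_library(toy, "Nla3")))
```

    Building a second library with a different anchoring enzyme and checking whether a multiply mapped tag becomes unique there is one way to picture the successive Nla3 and Sau3A mapping that the abstract mentions.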

  6. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    SciTech Connect

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background-Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results-A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions-This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
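    The high-quality SNP criterion stated in this abstract (at least four sequences, with the minor allele seen in at least two of them) can be written as a short filter. This is a minimal sketch under the assumption that the bases observed at one contig position are available as a string or list, one base per EST; the function name is hypothetical and the authors' actual pipeline is not described in the abstract.

```python
from collections import Counter


def is_high_quality_snp(column, min_depth=4, min_minor=2):
    """Apply the depth and minor-allele thresholds to one position.

    column: bases observed at the position, one per aligned EST.
    Depth at the position is used here as a stand-in for the number
    of sequences in the contig (an assumption of this sketch).
    """
    if len(column) < min_depth:
        return False
    ranked = Counter(column).most_common()
    if len(ranked) < 2:
        return False  # only one allele observed, so not a SNP
    minor_count = ranked[1][1]  # count of the second most common allele
    return minor_count >= min_minor


print(is_high_quality_snp("AAGG"))  # depth 4, minor allele seen twice
print(is_high_quality_snp("AAAG"))  # minor allele seen only once
```

    Requiring the minor allele in two or more sequences is a common way to reduce the chance that a single sequencing error is called as a polymorphism.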

  7. Immunological responses of turbot (Psetta maxima) to nodavirus infection or polyriboinosinic polyribocytidylic acid (pIC) stimulation, using expressed sequence tags (ESTs) analysis and cDNA microarrays.

    PubMed

    Park, Kyoung C; Osborne, Jane A; Montes, Ariana; Dios, Sonia; Nerland, Audun H; Novoa, Beatriz; Figueras, Antonio; Brown, Laura L; Johnson, Stewart C

    2009-01-01

    To investigate the immunological responses of turbot to nodavirus infection or pIC stimulation, we constructed cDNA libraries from liver, kidney and gill tissues of nodavirus-infected fish and examined the differential gene expression within turbot kidney in response to nodavirus infection or pIC stimulation using a turbot cDNA microarray. Turbot were experimentally infected with nodavirus and samples of each tissue were collected at selected time points post-infection. Using equal amount of total RNA at each sampling time, we made three tissue-specific cDNA libraries. After sequencing 3230 clones we obtained 3173 (98.2%) high quality sequences from our liver, kidney and gill libraries. Of these 2568 (80.9%) were identified as known genes and 605 (19.1%) as unknown genes. A total of 768 unique genes were identified. The two largest groups resulting from the classification of ESTs according to function were the cell/organism defense genes (71 uni-genes) and apoptosis-related process (23 uni-genes). Using these clones, a 1920 element cDNA microarray was constructed and used to investigate the differential gene expression within turbot in response to experimental nodavirus infection or pIC stimulation. Kidney tissue was collected at selected times post-infection (HPI) or stimulation (HPS), and total RNA was isolated for microarray analysis. Of the 1920 genes studied on the microarray, we identified a total of 121 differentially expressed genes in the kidney: 94 genes from nodavirus-infected animals and 79 genes from those stimulated with pIC. Within the nodavirus-infected fish we observed the highest number of differentially expressed genes at 24 HPI. Our results indicate that certain genes in turbot have important roles in immune responses to nodavirus infection and dsRNA stimulation.

  8. Comparative analyses of genotype dependent expressed sequence tags and stress-responsive transcriptome of chickpea wilt illustrate predicted and unexpected genes and novel regulators of plant immunity

    PubMed Central

    Ashraf, Nasheeman; Ghai, Deepali; Barman, Pranjan; Basu, Swaraj; Gangisetty, Nagaraju; Mandal, Mihir K; Chakraborty, Niranjan; Datta, Asis; Chakraborty, Subhra

    2009-01-01

    Background The ultimate phenome of any organism is modulated by regulated transcription of many genes. Characterization of genetic makeup is thus crucial for understanding the molecular basis of phenotypic diversity, evolution and response to intra- and extra-cellular stimuli. Chickpea is the world's third most important food legume grown in over 40 countries representing all the continents. Despite its importance in plant evolution, role in human nutrition and stress adaptation, very little ESTs and differential transcriptome data is available, let alone genotype-specific gene signatures. Present study focuses on Fusarium wilt responsive gene expression in chickpea. Results We report 6272 gene sequences of immune-response pathway that would provide genotype-dependent spatial information on the presence and relative abundance of each gene. The sequence assembly led to the identification of a CaUnigene set of 2013 transcripts comprising of 973 contigs and 1040 singletons, two-third of which represent new chickpea genes hitherto undiscovered. We identified 209 gene families and 262 genotype-specific SNPs. Further, several novel transcription regulators were identified indicating their possible role in immune response. The transcriptomic analysis revealed 649 non-canonical genes besides many unexpected candidates with known biochemical functions, which have never been associated with pathostress-responsive transcriptome. Conclusion Our study establishes a comprehensive catalogue of the immune-responsive root transcriptome with insight into their identity and function. The development, detailed analysis of CaEST datasets and global gene expression by microarray provide new insight into the commonality and diversity of organ-specific immune-responsive transcript signatures and their regulated expression shaping the species specificity at genotype level. This is the first report on differential transcriptome of an unsequenced genome during vascular wilt. PMID:19732460

  9. SAGETTARIUS: a program to reduce the number of tags mapped to multiple transcripts and to plan SAGE sequencing stages

    PubMed Central

    Bianchetti, Laurent; Wu, Yan; Guerin, Eric; Plewniak, Frédéric; Poch, Olivier

    2007-01-01

    SAGE (Serial Analysis of Gene Expression) experiments generate short nucleotide sequences called ‘tags’ which are assumed to map unambiguously to their original transcripts (1 tag to 1 transcript mapping). Nevertheless, many tags are generated that do not map to any transcript or map to multiple transcripts. Current bioinformatics resources, such as SAGEmap and TAGmapper, have focused on reducing the number of unmapped tags. Here, we describe SAGETTARIUS, a new high-throughput program that performs successive precise Nla3 and Sau3A tag to transcript mapping, based on specifically designed Virtual Tag (VT) libraries. First, SAGETTARIUS decreases the number of tags mapped to multiple transcripts. Among the various mapping resources compared, SAGETTARIUS performed the best in this respect by decreasing up to 11% the number of multiply mapped tags. Second, SAGETTARIUS allows the establishment of a guideline for SAGE experiment sequencing efforts through efficient mapping of the CRT (Cytoplasmic Ribosomal protein Transcripts)-specific tags. Using all publicly available human and mouse Nla3 SAGE experiments, we show that sequencing 100 000 tags is sufficient to map almost all CRT-specific tags and that four sequencing stages can be identified when carrying out a human or mouse SAGE project. SAGETTARIUS is web interfaced and freely accessible to academic users. PMID:17884916

  10. Expressed sequence tags from the laboratory-grown miniature tomato (Lycopersicon esculentum) cultivar Micro-Tom and mining for single nucleotide polymorphisms and insertions/deletions in tomato cultivars.

    PubMed

    Yamamoto, Naoki; Tsugane, Taneaki; Watanabe, Manabu; Yano, Kentaro; Maeda, Fumi; Kuwata, Chikara; Torki, Moez; Ban, Yusuke; Nishimura, Shigeo; Shibata, Daisuke

    2005-08-15

    Laboratory-grown miniature tomato (Lycopersicon esculentum) cultivar Micro-Tom has attracted attention as a host for functional genomics research. In this study, we generated 35,824 expressed sequence tags (ESTs) from leaves and fruits of Micro-Tom. The ESTs comprised 10,287 unigenes (5007 contigs and 5280 singletons), including 1858 novel tomato unigenes. Of the 18 unigenes that shared strong homology with tobacco chloroplast genome sequences, one unigene was likely derived from polyadenylated transcripts of the atpH gene. Interestingly, ESTs for vacuolar invertase, pectate lyase and alcohol acyl transferase were underrepresented in the Micro-Tom data set. From all of the ESTs, we mined 2039 candidate single nucleotide polymorphisms (SNPs) and 121 candidate insertions and deletions (indels) based on homology with four tomato inbred lines, E6203, R11-13, Rio Grande PtoR and R11-12, and a wild relative, L. pennellii TA56, for which sequence data was publicly available with more than 5000 entries. Direct genome sequencing of several SNP or indel sites in Micro-Tom and L. esculentum E6203 suggested that more than 69% of the candidate sites were truly polymorphic, making them useful for the preparation of DNA markers.

  11. The ABRF Edman Sequencing Research Group 2008 Study: Investigation into Homopolymeric Amino Acid N-Terminal Sequence Tags and Their Effects on Automated Edman Degradation

    PubMed Central

    Thoma, R. S.; Smith, J. S.; Sandoval, W.; Leone, J. W.; Hunziker, P.; Hampton, B.; Linse, K. D.; Denslow, N. D.

    2009-01-01

    The Edman Sequencing Research Group (ESRG) of the Association of Biomolecular Resource Facilities (ABRF) designs and executes interlaboratory studies investigating the use of automated Edman degradation for protein and peptide analysis. In 2008, the ESRG enlisted the help of core sequencing facilities to investigate the effects of a repeating amino acid tag at the N-terminus of a protein. Commonly, to facilitate protein purification, an affinity tag containing a polyhistidine sequence is conjugated to the N-terminus of the protein. After expression, polyhistidine-tagged protein is readily purified via chelation with an immobilized metal affinity resin. The addition of the polyhistidine tag presents unique challenges for the determination of protein identity using Edman degradation chemistry. Participating laboratories were asked to sequence one protein engineered in three configurations: with an N-terminal polyhistidine tag; with an N-terminal polyalanine tag; or with no tag. Study participants were asked to return a data file containing the uncorrected amino acid picomole yields for the first 17 cycles. Initial and repetitive yield (R.Y.) information and the amount of lag were evaluated. Information about instrumentation and sample treatment was also collected as part of the study. For this study, the majority of participating laboratories successfully called the amino acid sequence for 17 cycles for all three test proteins. In general, laboratories found it more difficult to call the sequence containing the polyhistidine tag. Lag was observed earlier and more consistently with the polyhistidine-tagged protein than the polyalanine-tagged protein. Histidine yields were significantly less than the alanine yields in the tag portion of each analysis. The polyhistidine and polyalanine protein-R.Y. calculations were found to be equivalent. These calculations showed that the nontagged portion from each protein was equivalent. The terminal histidines from the tagged portion of the protein

  12. Histidine tag fusion increases expression levels of active recombinant amelogenin in Escherichia coli.

    PubMed

    Svensson, Johan; Andersson, Christer; Reseland, Janne E; Lyngstadaas, Petter; Bülow, Leif

    2006-07-01

    Amelogenin is a dental enamel matrix protein involved in formation of dental enamel. In this study, we have expressed two different recombinant murine amelogenins in Escherichia coli: the untagged rM179, and the histidine tagged rp(H)M180, identical to rM179 except that it carries the additional N-terminal sequence MRGSHHHHHHGS. The effects of the histidine tag on expression levels, and on growth properties of the amelogenin expressing cells were studied. Purification of a crude protein extract containing rp(H)M180 was also carried out using IMAC and reverse-phase HPLC. The results of this study showed clearly that both growth properties and amelogenin expression levels were improved for E. coli cells expressing the histidine tagged amelogenin rp(H)M180, compared to cells expressing the untagged amelogenin rM179. The positive effect of the histidine tag on amelogenin expression is proposed to be due to the hydrophilic nature of the histidine tag, generating a more hydrophilic amelogenin, which is more compatible with the host cell. Human osteoblasts treated with the purified rp(H)M180 showed increased levels of secreted osteocalcin, compared to untreated cells. This response was similar to cells treated with enamel matrix derivate, mainly composed by amelogenin, suggesting that the recombinant protein is biologically active. Thus, the histidine tag favors expression and purification of biologically active recombinant amelogenin.

  13. HIV-1 quasispecies delineation by tag linkage deep sequencing.

    PubMed

    Wu, Nicholas C; De La Cruz, Justin; Al-Mawsawi, Laith Q; Olson, C Anders; Qi, Hangfei; Luan, Harding H; Nguyen, Nguyen; Du, Yushen; Le, Shuai; Wu, Ting-Ting; Li, Xinmin; Lewis, Martha J; Yang, Otto O; Sun, Ren

    2014-01-01

    Trade-offs between throughput, read length, and error rates in high-throughput sequencing limit certain applications such as monitoring viral quasispecies. Here, we describe a molecular-based tag linkage method that allows assemblage of short sequence reads into long DNA fragments. It enables haplotype phasing with high accuracy and sensitivity to interrogate individual viral sequences in a quasispecies. This approach is demonstrated to deduce ∼ 2000 unique 1.3 kb viral sequences from HIV-1 quasispecies in vivo and after passaging ex vivo with a detection limit of ∼ 0.005% to ∼ 0.001%. Reproducibility of the method is validated quantitatively and qualitatively by a technical replicate. This approach can improve monitoring of the genetic architecture and evolution dynamics in any quasispecies population.

  14. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of the primary and amplified libraries were 1.28×10^6 pfu/mL and 1.59×10^10 pfu/mL, respectively. The proportion of recombinants from the unamplified library was 91.3% and the average length of exogenous inserts was 1.06 kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258 bp were then analyzed. Furthermore, 204 unigenes were successfully annotated and involved in 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) were the dominant terms. 198 unigenes were assigned to 156 KEGG pathways, and the pathways with the most representation were metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens; the "general function prediction only" cluster (44, 15.8%) represented the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%), and replication, recombination and repair (24, 8.6%), and only 7.2% of ESTs were classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, coding for the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64±0.18 mg/mL. This library will provide a useful platform for functional genome and transcriptome research of P. tigris and other felid animals in the future.

  15. Identifying nonspecific SAGE tags by context of gene expression.

    PubMed

    Ge, Xijin; Wang, San Ming

    2008-01-01

    Many serial analysis of gene expression (SAGE) tags can be matched to multiple genes, leading to difficulty in SAGE data interpretation and analysis. As only a subset of genes in the human genome are transcribed in a certain type of tissue/cell, we used microarray expression data from different tissue types to define contexts of gene expression and to annotate SAGE tags collected from the same or similar tissue sources. To predict the original transcript contributing a nonspecific SAGE tag collected from a particular tissue, we ranked the corresponding genes by their expression levels determined by microarray. We developed a tissue-specific SAGE tag annotation database based on microarray data collected from 73 normal human tissues and 18 cancer tissues and cell lines. The database can be queried online at: http://www.basic.northwestern.edu/SAGE/. The accuracy of this database was confirmed by experimental data.
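
    The ranking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag name, gene names, and expression values are hypothetical placeholders, the function name `rank_candidate_genes` is not from the paper, and genes with no microarray measurement are simply treated as having zero expression.

```python
from typing import Dict, List, Tuple


def rank_candidate_genes(
    tag_to_genes: Dict[str, List[str]],
    expression: Dict[str, float],
    tag: str,
) -> List[Tuple[str, float]]:
    """Rank the genes matching one SAGE tag by their expression in a tissue.

    tag_to_genes: every gene a tag could have come from (hypothetical data).
    expression:   microarray expression level per gene for ONE tissue context.
    Genes missing from `expression` are assumed to be unexpressed (0.0).
    """
    candidates = tag_to_genes.get(tag, [])
    scored = [(gene, expression.get(gene, 0.0)) for gene in candidates]
    # Highest expression first: the top gene is the most plausible source.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical example: one nonspecific tag that matches three genes.
tag_to_genes = {"TAG_A": ["GENE1", "GENE2", "GENE3"]}
liver_expression = {"GENE1": 12.0, "GENE2": 340.5}

for gene, level in rank_candidate_genes(tag_to_genes, liver_expression, "TAG_A"):
    print(gene, level)
```

    Keeping one expression dictionary per tissue mirrors the idea of a tissue-specific context: the same tag may be assigned to a different gene when the dictionary for another tissue is supplied.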

  16. Mapping DNA-protein interactions in large genomes by sequence tag analysis of genomic enrichment.

    PubMed

    Kim, Jonghwan; Bhinge, Akshay A; Morgan, Xochitl C; Iyer, Vishwanath R

    2005-01-01

    Identifying the chromosomal targets of transcription factors is important for reconstructing the transcriptional regulatory networks underlying global gene expression programs. We have developed an unbiased genomic method called sequence tag analysis of genomic enrichment (STAGE) to identify the direct binding targets of transcription factors in vivo. STAGE is based on high-throughput sequencing of concatemerized tags derived from target DNA enriched by chromatin immunoprecipitation. We first used STAGE in yeast to confirm that RNA polymerase III genes are the most prominent targets of the TATA-box binding protein. We optimized the STAGE protocol and developed analysis methods to allow the identification of transcription factor targets in human cells. We used STAGE to identify several previously unknown binding targets of human transcription factor E2F4 that we independently validated by promoter-specific PCR and microarray hybridization. STAGE provides a means of identifying the chromosomal targets of DNA-associated proteins in any sequenced genome.

  17. Leaf-, panel- and latex-expressed sequenced tags from the rubber tree (Hevea brasiliensis) under cold-stressed and suboptimal growing conditions: the development of gene-targeted functional markers for stress response.

    PubMed

    Silva, Carla C; Mantello, Camila C; Campos, Tatiana; Souza, Livia M; Gonçalves, Paulo S; Souza, Anete P

    2014-01-01

    Hevea brasiliensis is a native species of the Amazon Basin of South America and the primary source of natural rubber worldwide. Due to the occurrence of South American Leaf Blight disease in this area, rubber plantations have been extended to suboptimal regions. Rubber tree breeding is time-consuming and expensive, but molecular markers can serve as a tool for early evaluation, thus reducing time and costs. In this work, we constructed six different cDNA libraries with the aim of developing gene-targeted molecular markers for the rubber tree. A total of 8,263 reads were assembled, generating 5,025 unigenes that were analyzed; 912 expressed sequence tags (ESTs) represented new transcripts, and two sequences were highly up-regulated by cold stress. These unigenes were scanned for microsatellite (SSR) regions and single nucleotide polymorphisms (SNPs). In total, 169 novel EST-SSR markers were developed; 138 loci were polymorphic in the rubber tree, and 98 % presented transferability to six other Hevea species. Locus duplication was observed in H. brasiliensis and other species. Additionally, 43 SNP markers in 13 sequences that showed similarity to proteins involved in stress response, latex biosynthesis and developmental processes were characterized. cDNA libraries are a rich source of SSR and SNP markers and enable the identification of new transcripts. The new markers developed here will be a valuable resource for linkage mapping, QTL identification and other studies in the rubber tree and can also be used to evaluate the genetic variability of other Hevea species, which are valuable assets in rubber tree breeding.

  18. Expressed sequence tags from larval gut of the European corn borer (Ostrinia nubilalis): exploring candidate genes potentially involved in Bacillus thuringiensis toxicity and resistance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Knowledge of the genes that are expressed in the insect gut is crucial for understanding basic physiology of food digestion, their interactions with Bacillus thuringiensis (Bt) toxin and for discovering new targets for novel toxins for use in pest management. This study analyzed the ES...

  19. Identification of anther-specific gene expression from T-DNA tagging rice.

    PubMed

    Muthukalianan, Gothandam K; Lee, Sanghyun; Yum, Hyunsik; Ku, Sujin; Kwun, Minjung; Kang, Hong Gyu; An, Gynheung; Chung, Yong-Yoon

    2003-02-28

    We have screened a total of 5,500 T-DNA tagging rice lines in which the beta-glucuronidase (GUS) gene sequence was randomly inserted as a transgene into the plant genome. Histochemical GUS assays were carried out to select the T-DNA tagging lines that show GUS expression in the anther. Of the tagging lines screened, three lines (about 0.05%) were found to express GUS specifically in the anther. Microscopic observation of the anther-expressed lines showed specific expression patterns of GUS in the anther, with either gametophytic or sporophytic specificity. Southern blot analysis revealed that the integration copy number of the transgene was 2.3 on average. The detailed expression patterns were analyzed and discussed.

  20. Bioinformatic analyses of the publicly accessible crustacean expressed sequence tags (ESTs) reveal numerous novel neuropeptide-encoding precursor proteins, including ones from members of several little studied taxa.

    PubMed

    Christie, Andrew E; Durkin, Christopher S; Hartline, Niko; Ohno, Paul; Lenz, Petra H

    2010-05-15

    ESTs have been generated for many crustacean species, providing an invaluable resource for peptide discovery in members of this arthropod subphylum. Here, these data were mined for novel peptide-encoding transcripts, with the mature peptides encoded by them predicted using a combination of online peptide prediction programs and homology to known arthropod sequences. In total, 70 mature full-length/partial peptides representing members of 16 families/subfamilies were predicted, the vast majority being novel; the species from which the peptides were identified included members of the Branchiopoda (Daphnia carinata and Triops cancriformis), Maxillopoda (Caligus clemensi, Caligus rogercresseyi, Lepeophtheirus salmonis and Lernaeocera branchialis) and Malacostraca (Euphausia superba, Marsupenaeus japonicus, Penaeus monodon, Homarus americanus, Petrolisthes cinctipes, Callinectes sapidus and Portunus trituberculatus). Of particular note were the identifications of an intermediate between the insect adipokinetic hormones and crustacean red pigment concentrating hormone and a modified crustacean cardioactive peptide from the daphnid D. carinata; Arg(7)-corazonin was also deduced from this species, the first identification of a corazonin from a non-decapod crustacean. Our data also include the first reports of members of the calcitonin-like diuretic hormone, FMRFamide-related peptide (neuropeptide F subfamily) and orcokinin families from members of the Copepoda. Moreover, the prediction of a bursicon alpha from the euphausid E. superba represents the first peptide identified from any member of the basal eucaridean order Euphausiacea. In addition, large collections of insect eclosion hormone- and neuroparsin-like peptides were identified from a variety of species, greatly expanding the number of known members of these families in crustaceans.

  1. Elucidation of the metabolic fate of glucose in the filamentous fungus Trichoderma reesei using expressed sequence tag (EST) analysis and cDNA microarrays.

    PubMed

    Chambergo, Felipe S; Bonaccorsi, Eric D; Ferreira, Ari J S; Ramos, Augusto S P; Ferreira Júnior, José Ribamar; Abrahão-Neto, José; Farah, João P Simon; El-Dorry, Hamza

    2002-04-19

    Despite the intense interest in the metabolic regulation and evolution of the ATP-producing pathways, the long standing question of why most multicellular microorganisms metabolize glucose by respiration rather than fermentation remains unanswered. One such microorganism is the cellulolytic fungus Trichoderma reesei (Hypocrea jecorina). Using EST analysis and cDNA microarrays, we find that in T. reesei expression of the genes encoding the enzymes of the tricarboxylic acid cycle and the proteins of the electron transport chain is programmed in a way that favors the oxidation of pyruvate via the tricarboxylic acid cycle rather than its reduction to ethanol by fermentation. Moreover, the results indicate that acetaldehyde may be channeled into acetate rather than ethanol, thus preventing the regeneration of NAD(+), a pivotal product required for anaerobic metabolism. The studies also point out that the regulatory machinery controlled by glucose was most probably the target of evolutionary pressure that directed the flow of metabolites into respiratory metabolism rather than fermentation. This finding has significant implications for the development of metabolically engineered cellulolytic microorganisms for fuel production from cellulose biomass.

  2. CREST--classification resources for environmental sequence tags.

    PubMed

    Lanzén, Anders; Jørgensen, Steffen L; Huson, Daniel H; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
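
    The lowest common ancestor idea mentioned above can be illustrated with a minimal Python sketch. It is not CREST's implementation: the 2% bit-score window (`score_fraction`), the function names, and the example lineages are assumptions made for illustration, and the rank-specific similarity criteria described in the abstract are not modelled.

```python
from typing import List, Sequence, Tuple


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of several root-to-leaf lineages."""
    if not lineages:
        return []
    shared = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


def classify(hits: List[Tuple[float, List[str]]],
             score_fraction: float = 0.98) -> List[str]:
    """Assign a read to the LCA of its best alignment hits.

    hits: (bit_score, lineage) pairs from an alignment search.
    score_fraction: hypothetical cutoff; hits scoring below this fraction
    of the best score are ignored.
    """
    if not hits:
        return []
    best = max(score for score, _ in hits)
    kept = [lineage for score, lineage in hits if score >= best * score_fraction]
    return lowest_common_ancestor(kept)


# Hypothetical hits for one environmental sequence.
hits = [
    (500.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria"]),
    (495.0, ["Bacteria", "Proteobacteria", "Alphaproteobacteria"]),
    (300.0, ["Archaea", "Euryarchaeota"]),
]
print(classify(hits))
```

    Because the two near-best hits disagree at class level, the read is placed at the deepest rank they share rather than being forced into either class.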

  3. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors.

    PubMed

    Owen, Jeremy G; Charlop-Powers, Zachary; Smith, Alexandra G; Ternei, Melinda A; Calle, Paula Y; Reddy, Boojala Vijay B; Montiel, Daniel; Brady, Sean F

    2015-04-07

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A-E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome.

  4. Vectors for the expression of tagged proteins in Drosophila.

    PubMed

    Parker, L; Gross, S; Alphey, L

    2001-12-01

    Regulated expression systems have been extremely useful in developmental studies, allowing the expression of specific proteins in defined spatial and temporal patterns. If these proteins are fused to an appropriate molecular tag, then they can be purified or visualized without the need to raise specific antibodies. If the tag is inherently fluorescent, then the proteins can even be visualized directly, in living tissue. We have constructed a series of P element-based transformation vectors for the most widely used expression system in Drosophila, GAL4/UAS. These vectors provide a series of useful tags for antibody detection, protein purification, and/or direct visualization, together with a convenient multiple cloning site into which the cDNA of interest can be inserted.

  5. De novo sequencing of highly modified therapeutic oligonucleotides by hydrophobic tag sequencing coupled with LC-MS.

    PubMed

    Goto, R; Miyakawa, S; Inomata, E; Takami, T; Yamaura, J; Nakamura, Y

    2017-02-01

    Correct sequences are prerequisite for quality control of therapeutic oligonucleotides. However, there is no definitive method available for determining sequences of highly modified therapeutic RNAs, and thereby, most of the oligonucleotides have been used clinically without direct sequence determination. In this study, we developed a novel sequencing method called 'hydrophobic tag sequencing'. Highly modified oligonucleotides are sequenced by partially digesting oligonucleotides conjugated with a 5'-hydrophobic tag, followed by liquid chromatography-mass spectrometry analysis. 5'-Hydrophobic tag-printed fragments (5'-tag degradates) can be separated in order of their molecular masses from tag-free oligonucleotides by reversed-phase liquid chromatography. As models for the sequencing, the anti-VEGF aptamer (Macugen) and the highly modified 38-mer RNA sequences were analyzed under blind conditions. Most nucleotides were identified from the molecular weight of hydrophobic 5'-tag degradates calculated from monoisotopic mass in simple full mass data. When monoisotopic mass could not be assigned, the nucleotide was estimated using the molecular weight of the most abundant mass. The sequences of Macugen and 38-mer RNA perfectly matched the theoretical sequences. The hydrophobic tag sequencing worked well to obtain simple full mass data, resulting in accurate and clear sequencing. The present study provides for the first time a de novo sequencing technology for highly modified RNAs and contributes to quality control of therapeutic oligonucleotides. Copyright © 2016 John Wiley & Sons, Ltd.
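
    Reading a sequence from a ladder of 5'-tagged fragment masses can be sketched as follows. This is an illustrative simplification, not the authors' procedure: the residue table contains only the four unmodified RNA nucleotides (monoisotopic mass of each nucleoside monophosphate minus water), whereas the study concerns highly modified nucleotides whose masses would have to be added; the 0.02 Da tolerance, the tag mass of 1000.0, and the function name are all assumptions.

```python
# Monoisotopic residue masses (nucleoside monophosphate minus H2O), in Da.
# Unmodified RNA only; modified residues would need their own entries.
RESIDUE_MASS = {"A": 329.0525, "C": 305.0413, "G": 345.0474, "U": 306.0253}


def read_ladder(fragment_masses, tolerance=0.02):
    """Infer a 5'->3' sequence from masses of 5'-tagged fragments.

    Each fragment is assumed to be exactly one nucleotide longer than the
    next lighter one, so consecutive mass differences identify residues.
    Differences that match no residue within `tolerance` are reported as '?'.
    """
    ordered = sorted(fragment_masses)
    sequence = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        delta = heavier - lighter
        best = min(RESIDUE_MASS, key=lambda nt: abs(RESIDUE_MASS[nt] - delta))
        if abs(RESIDUE_MASS[best] - delta) <= tolerance:
            sequence.append(best)
        else:
            sequence.append("?")
    return "".join(sequence)


# Hypothetical ladder: a tag of mass 1000.0 extended by G, A, U, C in turn.
ladder = [1000.0]
for nt in "GAUC":
    ladder.append(ladder[-1] + RESIDUE_MASS[nt])
print(read_ladder(ladder))
```

    C and U differ by less than 1 Da, which is why the sketch relies on a tight tolerance; this is consistent with the abstract's emphasis on monoisotopic rather than average masses.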

  6. Plant Gene and Alternatively Spliced Variant Annotator. A plant genome annotation pipeline for rice gene and alternatively spliced variant identification with cross-species expressed sequence tag conservation from seven plant species.

    PubMed

    Chen, Feng-Chi; Wang, Sheng-Shun; Chaw, Shu-Miaw; Huang, Yao-Ting; Chuang, Trees-Juen

    2007-03-01

    The completion of the rice (Oryza sativa) genome draft has brought unprecedented opportunities for genomic studies of the world's most important food crop. Previous rice gene annotations have relied mainly on ab initio methods, which usually yield a high rate of false-positive predictions and give only limited information regarding alternative splicing in rice genes. Comparative approaches based on expressed sequence tags (ESTs) can compensate for the drawbacks of ab initio methods because they can simultaneously identify experimental data-supported genes and alternatively spliced transcripts. Furthermore, cross-species EST information can be used to not only offset the insufficiency of same-species ESTs but also derive evolutionary implications. In this study, we used ESTs from seven plant species, rice, wheat (Triticum aestivum), maize (Zea mays), barley (Hordeum vulgare), sorghum (Sorghum bicolor), soybean (Glycine max), and Arabidopsis (Arabidopsis thaliana), to annotate the rice genome. We developed a plant genome annotation pipeline, Plant Gene and Alternatively Spliced Variant Annotator (PGAA). Using this approach, we identified 852 genes (931 isoforms) not annotated in other widely used databases (i.e. the Institute for Genomic Research, National Center for Biotechnology Information, and Rice Annotation Project) and found 87% of them supported by both rice and nonrice EST evidence. PGAA also identified more than 44,000 alternatively spliced events, of which approximately 20% are not observed in the other three annotations. These novel annotations represent rich opportunities for rice genome research, because the functions of most of our annotated genes are currently unknown. Also, in the PGAA annotation, the isoforms with non-rice-EST-supported exons are significantly enriched in transporter activity but significantly underrepresented in transcription regulator activity. We have also identified potential lineage-specific and conserved isoforms, which are

  7. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are assembled in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank.
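
    Grouping ESTs into clusters from pairwise sequence matches can be sketched with a union-find structure. This is a generic illustration rather than the GRIST rules themselves: the identity and alignment-length thresholds, the tuple layout of `matches`, and the EST identifiers are hypothetical, standing in for whatever match parameters are taken from BLAST output.

```python
def cluster_ests(est_ids, matches, min_identity=95.0, min_length=100):
    """Group ESTs that are linked by sufficiently strong pairwise matches.

    est_ids: iterable of EST identifiers.
    matches: (est_a, est_b, percent_identity, alignment_length) tuples,
             e.g. parsed from tabular BLAST output (assumed layout).
    Thresholds are illustrative assumptions, not GRIST's actual rules.
    """
    parent = {est: est for est in est_ids}

    def find(x):
        # Follow parent links to the cluster representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, length in matches:
        if identity >= min_identity and length >= min_length:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for est in est_ids:
        clusters.setdefault(find(est), []).append(est)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical input: e1-e2-e3 chain together; e4 matches e3 only weakly.
ests = ["e1", "e2", "e3", "e4"]
matches = [
    ("e1", "e2", 98.0, 250),
    ("e2", "e3", 96.5, 180),
    ("e3", "e4", 80.0, 300),
]
print(cluster_ests(ests, matches))
```

    Single-linkage grouping like this places e1 and e3 in the same cluster even though they were never compared directly, which matches the idea of collecting related transcripts from the same gene.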

  8. Phosphorylation of serine residues in histidine-tag sequences attached to recombinant protein kinases: a cause of heterogeneity in mass and complications in function.

    PubMed

    Du, Ping; Loulakis, Pat; Luo, Chun; Mistry, Anil; Simons, Samuel P; LeMotte, Peter K; Rajamohan, Francis; Rafidi, Kristina; Coleman, Kevin G; Geoghegan, Kieran F; Xie, Zhi

    2005-12-01

    High-level recombinant expression of protein kinases in eukaryotic cells or Escherichia coli commonly gives products that are phosphorylated by autocatalysis or by the action of endogenous kinases. Here, we report that phosphorylation occurred on serine residues adjacent to hexahistidine affinity tags (His-tags) derived from several commercial expression vectors and fused to overexpressed kinases. The result was observed with a variety of recombinant kinases expressed in either insect cells or E. coli. Multiple phosphorylations of His-tagged full-length Aurora A, a protein serine/threonine kinase, were detected by mass spectrometry when it was expressed in insect cells in the presence of okadaic acid, a protein phosphatase inhibitor. Peptide mapping by liquid chromatography-mass spectrometry detected phosphorylations on all three serine residues in an N-terminal tag, alpha-N-acetyl-MHHHHHHSSGLPRGS. The same sequence was also phosphorylated, but only at a low level, when a His-tagged protein tyrosine kinase, Pyk2 was expressed in insect cells and activated in vitro. When catalytic domains of Aurora A and several other protein serine/threonine kinases were expressed in E. coli, serines in the affinity tag sequence GSSHHHHHHSSGLVPRGS were also variably phosphorylated. His-Aurora A with hyperphosphorylation of the serine residues in the tag aggregated and resisted thrombin-catalyzed removal of the tag. Treatment with alkaline phosphatase partly restored sensitivity to thrombin. The same His-tag sequence was also detected bearing alpha-N-d-gluconoylation in addition to multiple phosphorylations. The results show that histidine-tag sequences can receive complicated posttranslational modification, and that the hyperphosphorylation and resulting heterogeneity of the recombinant fusion proteins can interfere with downstream applications.

  9. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
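
    An in-silico version of the tag-release step can be sketched as below. Several details are assumptions made only for illustration: the fragmenting-enzyme site is taken to be CATG, each 21-bp tag is taken to start at the first base of that site and run downstream, only the given strand is scanned, and the example sequence is synthetic.

```python
from collections import Counter


def extract_tags(dna, site="CATG", tag_length=21):
    """Collect fixed-length tags anchored at every occurrence of `site`.

    site: recognition sequence of the fragmenting enzyme (assumed here).
    tag_length: 21 bp, as stated in the record above.
    Windows that run off the end of the sequence are skipped.
    """
    tags = []
    start = dna.find(site)
    while start != -1:
        tag = dna[start:start + tag_length]
        if len(tag) == tag_length:
            tags.append(tag)
        start = dna.find(site, start + 1)
    return tags


def tag_profile(dna, site="CATG", tag_length=21):
    """Count how often each tag occurs, giving a simple abundance profile."""
    return Counter(extract_tags(dna, site, tag_length))


# Synthetic sequence with two complete tags and one truncated at the end.
dna = "AA" + "CATG" + "A" * 17 + "TT" + "CATG" + "C" * 17 + "GG" + "CATG" + "AC"
for tag, count in tag_profile(dna).items():
    print(tag, count)
```

    In a real profile the counts would come from sequenced concatemers rather than from scanning a reference, but the same tag-to-abundance table is the end product.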

  10. Phylogeny of Saccharina and Laminaria (Laminariaceae, Laminariales, Phaeophyta) in sequence-tagged-site markers

    NASA Astrophysics Data System (ADS)

    Qu, Jieqiong; Zhang, Jing; Wang, Xumin; Chi, Shan; Liu, Cui; Liu, Tao

    2014-01-01

    Laminaria and Saccharina have recently been recognized as two independent clades from the former genus Laminaria. Traditional morphological taxonomy is being challenged by molecular evidence from both nucleus and plastid. Intensive work is in great demand from the perspective of genome colinearity. In this study, 118 sequence-tagged site (STS) markers were screened for phylogenetic analyses, 29 based on genome sequences, while 89 were based on expressed sequence tag (EST) sequences. EST-based STS marker development (29.37%) had an efficiency twice as high as genome-sequence-based development (9.48%) as a result of high conservation of gene transcripts among the relative species. S. ochotensis, S. religiosa, S. japonica, and L. hyperborea showed great homogeneity in all 118 STS markers. Our result supports the view that the diversification between the genera Saccharina and Laminaria was a more recent event and that Saccharina and Laminaria shared high phylogenetic affinity. However, when it came to the single nucleotide polymorphism (SNP) level among the 41 SNPs, L. hyperborea owned 29 unique SNPs against 12 within the left three Saccharina species and 12 of the 13 indels were supposedly unique for L. hyperborea, indicated by its high variability. Originating from homologous ancestors, species between the recently diverged genera Laminaria and Saccharina may have taken in enough mutations at the SNP level only, in spite of different evolutionary strategies for better adaptation to the environment. Our study lays a solid foundation from a new perspective, although more accurate phylogenetic analysis is still needed to clarify the evolutionary traces between the genera Saccharina and Laminaria.

  11. C-Terminally fused affinity Strep-tag II is removed by proteolysis from recombinant human erythropoietin expressed in transgenic tobacco plants

    PubMed Central

    Kittur, Farooqahmed S.; Lalgondar, Mallikarjun; Hung, Chiu-Yueh; Sane, David C.

    2014-01-01

    Asialo-erythropoietin (asialo-EPO), a desialylated form of EPO, is a potent tissue-protective agent. Recently, we and others have exploited a low cost plant-based expression system to produce recombinant human asialo-EPO (asialo-rhuEPOP). To facilitate purification from plant extracts, Strep-tag II was engineered at the C-terminus of EPO. Although asialo-rhuEPOP was efficiently expressed in transgenic tobacco plants, affinity purification based on Strep-tag II did not result in the recovery of the protein. In this study, we investigated the stability of Strep-tag II tagged asialo-rhuEPOP expressed in tobacco plants to understand whether this fused tag is cleaved or inaccessible. Sequencing RT-PCR products confirmed that fused DNA sequences encoding Strep-tag II were properly transcribed, and three-dimensional protein structure model revealed that the tag must be fully accessible. However, Western blot analysis of leaf extracts and purified asialo-rhuEPOP revealed that the Strep-tag II was absent on the protein. Additionally, no peptide fragment containing Strep-tag II was identified in the LC-MS/MS analysis of purified protein further supporting that the affinity tag was absent on asialo-rhuEPOP. However, Strep-tag II was detected on asialo-rhuEPOP that was retained in the endoplasmic reticulum, suggesting that the Strep-tag II is removed during protein secretion or extraction. These findings together with recent reports that C-terminally fused Strep-tag II or IgG Fc domain are also removed from EPO in tobacco plants, suggest that its C-terminus may be highly susceptible to proteolysis in tobacco plants. Therefore, direct fusion of purification tags at the C-terminus of EPO should be avoided while expressing it in tobacco plants. PMID:25504272

  12. Mining and characterization of sequence tagged microsatellites from the brown planthopper Nilaparvata lugens.

    PubMed

    Sun, Jing-Tao; Zhang, Yan-Kai; Ge, Cheng; Hong, Xiao-Yue

    2011-01-01

    The brown planthopper, Nilaparvata lugens (Stål) (Hemiptera: Delphacidae), is an important pest of rice. To better understand the migration pattern and population structure of the Chinese populations of N. lugens, we developed and characterized 12 polymorphic microsatellites from the expressed sequence tags database of N. lugens. The occurrence of these simple sequence repeats was assessed in three populations collected from three provinces of China. The number of alleles per locus ranged from 3 to 13 with an average of 6.5 alleles per locus. The mean observed heterozygosity of the three populations ranged from 0.051 to 0.772 and the expected heterozygosity ranged from 0.074 to 0.766. The sequences of the 12 markers were highly variable. The polymorphism information content of the 12 markers was high and ranged from 0.074 to 0.807 (mean = 0.503). Sequencing of microsatellite alleles revealed that the fragment length differences were mainly due to the variation of the repeat motif. Significant genetic differentiation was detected among the three N. lugens populations as the Fst ranged from 0.034 to 0.273. Principal coordinates analysis also revealed significant genetic differentiation between populations of different years. We conclude that these microsatellite markers will be powerful tools to study the migration routine of the N. lugens.
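
    The per-locus statistics reported above (observed heterozygosity, expected heterozygosity, and polymorphism information content) can be computed from diploid genotypes as in the following sketch. The formulas are the standard textbook ones, He = 1 - Σp_i² and PIC = 1 - Σp_i² - Σ_{i<j} 2p_i²p_j²; whether the study used these exact estimators (for example, without a small-sample correction) is an assumption, and the genotype data are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies from a list of diploid genotypes (allele pairs)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs.values())


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    ps = list(freqs.values())
    homozygosity = sum(p * p for p in ps)
    cross = sum(2 * p * p * q * q for p, q in combinations(ps, 2))
    return 1.0 - homozygosity - cross


# Invented genotypes at one microsatellite locus (allele sizes in bp).
genotypes = [(100, 100), (100, 104), (104, 104), (100, 104)]
freqs = allele_frequencies(genotypes)
print(observed_heterozygosity(genotypes))
print(expected_heterozygosity(freqs))
print(polymorphism_information_content(freqs))
```

    PIC is never larger than the expected heterozygosity for the same allele frequencies, because it subtracts an additional term for matings that are uninformative.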

  13. Unique archaeal assemblages in the Arctic Ocean unveiled by massively parallel tag sequencing.

    PubMed

    Galand, Pierre E; Casamayor, Emilio O; Kirchman, David L; Potvin, Marianne; Lovejoy, Connie

    2009-07-01

    The Arctic Ocean plays a critical role in controlling nutrient budgets between the Pacific and Atlantic Ocean. Archaea are key players in the nitrogen cycle and in cycling nutrients, but their community composition has been little studied in the Arctic Ocean. Here, we characterize archaeal assemblages from surface and deep Arctic water masses using massively parallel tag sequencing of the V6 region of the 16S rRNA gene. This approach gave a very high coverage of the natural communities, allowing a precise description of archaeal assemblages. This first taxonomic description of archaeal communities by tag sequencing reported so far shows that it is possible to assign an identity below phylum level to most (95%) of the archaeal V6 tags, and shows that tag sequencing is a powerful tool for resolving the diversity and distribution of specific microbes in the environment. Marine group I Crenarchaeota was overall the most abundant group in the Arctic Ocean and comprised between 27% and 63% of all tags. Group III Euryarchaeota were more abundant in deep-water masses and represented the largest archaeal group in the deep Atlantic layer of the central Arctic Ocean. Coastal surface waters, in turn, harbored more group II Euryarchaeota. Moreover, group II sequences that dominated surface waters were different from the group II sequences detected in deep waters, suggesting functional differences in closely related groups. Our results unveiled for the first time an archaeal community dominated by group III Euryarchaeota and show biogeographical traits for marine Arctic Archaea.

  14. Digital Gene Expression Tag Profiling Analysis of the Gene Expression Patterns Regulating the Early Stage of Mouse Spermatogenesis

    PubMed Central

    Meng, Lijun; Liu, Meiling; Zhao, Lina; Hu, Fen; Ding, Cunbao; Wang, Yang; He, Baoling; Pan, Yuxin; Fang, Wei; Chen, Jing; Hu, Songnian; Jia, Mengchun

    2013-01-01

    Detailed characterization of the gene expression patterns in spermatogonia and primary spermatocytes is critical to understand the processes which occur prior to meiosis during normal spermatogenesis. The genome-wide expression profiles of mouse type B spermatogonia and primary spermatocytes were investigated using the Solexa/Illumina digital gene expression (DGE) system, a tag based high-throughput transcriptome sequencing method, and the developmental processes which occur during early spermatogenesis were systematically analyzed. Gene expression patterns vary significantly between mouse type B spermatogonia and primary spermatocytes. The functional analysis revealed that genes related to junction assembly, regulation of the actin cytoskeleton and pluripotency were most significantly differently expressed. Pathway analysis indicated that the Wnt non-canonical signaling pathway played a central role and interacted with the actin filament organization pathway during the development of spermatogonia. This study provides a foundation for further analysis of the gene expression patterns and signaling pathways which regulate the molecular mechanisms of early spermatogenesis. PMID:23554914

  15. Next generation barcode tagged sequencing for monitoring microbial community dynamics.

    PubMed

    Breakwell, Katy; Tetu, Sasha G; Elbourne, Liam D H

    2014-01-01

    Microbial identification using 16S rDNA variable regions has become increasingly popular over the past decade. The application of next-generation amplicon sequencing to these regions allows microbial communities to be sequenced in far greater depth than previous techniques, as well as allowing for the identification of unculturable or rare organisms within a sample. Multiplexing can be used to sequence multiple samples in tandem through the use of sample-specific identification sequences which are attached to each amplicon, making this a cost-effective method for large-scale microbial identification experiments.
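
    The multiplexing idea described here, sample-specific identification sequences attached to each amplicon, can be illustrated with a small demultiplexing routine. This is a minimal sketch under simplifying assumptions that are not taken from the abstract: barcodes sit at the very start of each read, all have the same length, and must match exactly. The barcode sequences, sample names, and reads are hypothetical; production pipelines usually also tolerate sequencing errors in the barcode.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Group reads by the sample barcode found at the start of each read.

    The barcode is stripped from the read; reads whose barcode is not in
    the lookup table are collected under the key "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical barcodes and reads (illustrative values only).
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTGGGTTT", "TGCACCCAAA", "ACGTAAACCC", "NNNNGGGCCC"]
assigned = demultiplex(reads, barcodes, barcode_length=4)
for sample, inserts in sorted(assigned.items()):
    print(sample, inserts)
```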

  16. Detrimental effect of the 6 His C-terminal tag on YedY enzymatic activity and influence of the TAT signal sequence on YedY synthesis

    PubMed Central

    2013-01-01

    Background YedY, a molybdoenzyme belonging to the sulfite oxidase family, is found in most Gram-negative bacteria. It contains a twin-arginine signal sequence that is cleaved after its translocation into the periplasm. Despite a weak reductase activity with substrates such as dimethyl sulfoxide or trimethylamine N-oxide, its natural substrate and its role in the cell remain unknown. Although sequence conservation of the YedY family displays a strictly conserved hydrophobic C-terminal residue, all known studies on Escherichia coli YedY have been performed with an enzyme containing a 6 histidine-tag at the C-terminus which could hamper enzyme activity. Results In this study, we demonstrate that the tag fused to the C-terminus of Rhodobacter sphaeroides YedY is detrimental to the enzyme’s reductase activity and results in an eight-fold decrease in catalytic efficiency. Nonetheless this C-terminal tag does not influence the properties of the molybdenum active site, as assayed by EPR spectroscopy. When a cleavable His-tag was fused to the N-terminus of the mature enzyme in the absence of the signal sequence, YedY was expressed and folded with its cofactor. However, when the signal sequence was added upstream of the N-ter tag, the amount of enzyme produced was approximately ten-fold higher. Conclusion Our study thus underscores the risk of using a C-terminus tagged enzyme while studying YedY, and presents an alternative strategy to express signal sequence-containing enzymes with an N-terminal tag. It brings new insights into molybdoenzyme maturation in R. sphaeroides showing that for some enzymes, maturation can occur in the absence of the signal sequence but that its presence is required for high expression of active enzyme. PMID:24180491

  17. Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities

    PubMed Central

    Stoeck, Thorsten; Behnke, Anke; Christen, Richard; Amaral-Zettler, Linda; Rodriguez-Mora, Maria J; Chistoserdov, Andrei; Orsi, William; Edgcomb, Virginia P

    2009-01-01

    Background Recent advances in sequencing strategies make possible unprecedented depth and scale of sampling for molecular detection of microbial diversity. Two major paradigm-shifting discoveries include the detection of bacterial diversity that is one to two orders of magnitude greater than previous estimates, and the discovery of an exciting 'rare biosphere' of molecular signatures ('species') of poorly understood ecological significance. We applied a high-throughput parallel tag sequencing (454 sequencing) protocol adopted for eukaryotes to investigate protistan community complexity in two contrasting anoxic marine ecosystems (Framvaren Fjord, Norway; Cariaco deep-sea basin, Venezuela). Both sampling sites have previously been scrutinized for protistan diversity by traditional clone library construction and Sanger sequencing. By comparing these clone library data with 454 amplicon library data, we assess the efficiency of high-throughput tag sequencing strategies. We here present a novel, highly conservative bioinformatic analysis pipeline for the processing of large tag sequence data sets. Results The analyses of ca. 250,000 sequence reads revealed that the number of detected Operational Taxonomic Units (OTUs) far exceeded previous richness estimates from the same sites based on clone libraries and Sanger sequencing. More than 90% of this diversity was represented by OTUs with less than 10 sequence tags. We detected a substantial number of taxonomic groups like Apusozoa, Chrysomerophytes, Centroheliozoa, Eustigmatophytes, hyphochytriomycetes, Ichthyosporea, Oikomonads, Phaeothamniophytes, and rhodophytes which remained undetected by previous clone library-based diversity surveys of the sampling sites. 
    The most important innovations in our newly developed bioinformatics pipeline employ (i) BLASTN with query parameters adjusted for highly variable domains and a complete database of public ribosomal RNA (rRNA) gene sequences for taxonomic assignments of tags; (ii

  18. An epitope tagged mammalian/prokaryotic expression vector with positive selection of cloned inserts.

    PubMed

    Schneider, S; Georgiev, O; Buchert, M; Adams, M T; Moelling, K; Hovens, C M

    1997-09-15

    A dual eukaryotic/prokaryotic expression vector has been developed which combines the features of positive selection for cloned inserts along with the production of an epitope-tagged cDNA insert by transient transfection in mammalian cells as well as high level induced expression in E. coli cells harbouring T7 RNA polymerase. This vector, pZilch, has two MCSs flanking a mutant E. coli phenylalanyl-tRNA synthetase gene, pheS, which when expressed in combination with the phenylalanine analog p-Cl-Phe, results in termination of host cell protein synthesis. Cloning of inserts using unique sites in the flanking MCS regions results in loss of the pZilch pheS allele and hence permits growth of colonies harbouring recombinants on p-Cl-Phe plates. Additional features of the vector include an optimal Kozak consensus sequence for high level eukaryotic cell expression and an efficient prokaryotic translation initiation site in frame and downstream from the eukaryotic initiation site. Recombinant proteins can be produced with an N-terminal FLAG epitope which can be removed via a specific protease cleavage site. Flanking T7 and SP6 RNA polymerase promoter sites permit in vitro transcription and translation of cloned inserts. A derivative of the vector has also been constructed enabling nuclear accumulation of the tagged proteins via an SV40 nuclear localisation signal upstream of the 5' MCS.

  19. Diploid Musa acuminata genetic diversity assayed with sequence-tagged microsatellite sites.

    PubMed

    Grapin, A; Noyer, J L; Carreel, F; Dambier, D; Baurens, F C; Lanaud, C; Lagoda, P J

    1998-06-01

    The sequence-tagged microsatellite site (STMS) discrimination potential was explored using nine microsatellite primer pairs. STMS polymorphism was assayed by nonradioactive urea-polyacrylamide gel electrophoresis. Genetic relationships were examined among 59 genotypes of wild or cultivated accessions of diploid Musa acuminata. The organization of the subspecies was confirmed and some clone relationships were clarified.

  20. Expression and purification of recombinant proteins in Escherichia coli tagged with the metal-binding protein CusF.

    PubMed

    Cantu-Bustos, J Enrique; Vargas-Cortez, Teresa; Morones-Ramirez, Jose Ruben; Balderas-Renteria, Isaias; Galbraith, David W; McEvoy, Megan M; Zarate, Xristo

    2016-05-01

    Production of recombinant proteins in Escherichia coli has been improved considerably through the use of fusion proteins, because they increase protein solubility and facilitate purification via affinity chromatography. In this article, we propose the use of CusF as a new fusion partner for expression and purification of recombinant proteins in E. coli. Using a cell-free protein expression system, based on the E. coli S30 extract, Green Fluorescent Protein (GFP) was expressed with a series of different N-terminal tags, immobilized on self-assembled protein microarrays, and its fluorescence quantified. GFP tagged with CusF showed the highest fluorescence intensity, and this was greater than the intensities from corresponding GFP constructs that contained MBP or GST tags. Analysis of protein production in vivo showed that CusF produces large amounts of soluble protein with low levels of inclusion bodies. Furthermore, fusion proteins can be exported to the cellular periplasm, if CusF contains the signal sequence. Taking advantage of its ability to bind copper ions, recombinant proteins can be purified with readily available IMAC resins charged with this metal ion, producing pure proteins after purification and tag removal. We therefore recommend the use of CusF as a viable alternative to MBP or GST as a fusion protein/affinity tag for the production of soluble recombinant proteins in E. coli.

  1. Nonradioactive sequence-tagged microsatellite site analyses: a method transferable to the tropics.

    PubMed

    Lagoda, P J; Dambier, D; Grapin, A; Baurens, F C; Lanaud, C; Noyer, J L

    1998-02-01

    Utilization of existing isozyme analysis facilities to detect sequence-tagged microsatellite site (STMS) polymorphism or any simple sequence repeat (SSR) variation is described. Different parameters concerning the difficulties in transferring molecular techniques to less sophisticated laboratory infrastructures (i.e. tropical outstations) are discussed (e.g. reproducibility, efficacy, precision). Nonradioactive STMS analysis is bound to foster collaborative research between "biodiversity" and "biotechnology" centers.

  2. Intraclade Heterogeneity in Nitrogen Utilization by Marine Prokaryotes Revealed Using Stable Isotope Probing Coupled with Tag Sequencing (Tag-SIP).

    PubMed

    Morando, Michael; Capone, Douglas G

    2016-01-01

    Nitrogen can greatly influence the structure and productivity of microbial communities through its relative availability and form. However, the roles of specific organisms in the uptake of different nitrogen species remain poorly characterized. Most studies seeking to identify agents of assimilation have been correlative, indirectly linking activity measurements (e.g., nitrate uptake) with the presence or absence of biological markers, particularly functional genes and their transcripts. Evidence is accumulating of previously underappreciated functional diversity in major microbial subpopulations, which may confer physiological advantages under certain environmental conditions leading to ecotype divergence. This microdiversity further complicates our view of genetic variation in environmental samples requiring the development of more targeted approaches. Here, next-generation tag sequencing was successfully coupled with stable isotope probing (Tag-SIP) to assess the ability of individual phylotypes to assimilate a specific N source. Our results provide the first direct evidence of nitrate utilization by organisms thought to lack the genes required for this process including the heterotrophic clades SAR11 and the Archaeal Marine Group II. Alternatively, this may suggest the existence of tightly coupled metabolisms with primary assimilators, e.g., symbiosis, or the rapid and efficient scavenging of recently released products by highly active individuals. These results may be connected with global dominance often seen with these clades, likely conferring an advantage over other clades unable to access these resources. We also provide new direct evidence of in situ nitrate utilization by the cyanobacterium Prochlorococcus in support of recent findings. Furthermore, these results revealed widespread functional heterogeneity, i.e., different levels of nitrogen assimilation within clades, likely reflecting niche partitioning by ecotypes.

  3. Intraclade Heterogeneity in Nitrogen Utilization by Marine Prokaryotes Revealed Using Stable Isotope Probing Coupled with Tag Sequencing (Tag-SIP)

    PubMed Central

    Morando, Michael; Capone, Douglas G.

    2016-01-01

    Nitrogen can greatly influence the structure and productivity of microbial communities through its relative availability and form. However, the roles of specific organisms in the uptake of different nitrogen species remain poorly characterized. Most studies seeking to identify agents of assimilation have been correlative, indirectly linking activity measurements (e.g., nitrate uptake) with the presence or absence of biological markers, particularly functional genes and their transcripts. Evidence is accumulating of previously underappreciated functional diversity in major microbial subpopulations, which may confer physiological advantages under certain environmental conditions leading to ecotype divergence. This microdiversity further complicates our view of genetic variation in environmental samples requiring the development of more targeted approaches. Here, next-generation tag sequencing was successfully coupled with stable isotope probing (Tag-SIP) to assess the ability of individual phylotypes to assimilate a specific N source. Our results provide the first direct evidence of nitrate utilization by organisms thought to lack the genes required for this process including the heterotrophic clades SAR11 and the Archaeal Marine Group II. Alternatively, this may suggest the existence of tightly coupled metabolisms with primary assimilators, e.g., symbiosis, or the rapid and efficient scavenging of recently released products by highly active individuals. These results may be connected with global dominance often seen with these clades, likely conferring an advantage over other clades unable to access these resources. We also provide new direct evidence of in situ nitrate utilization by the cyanobacterium Prochlorococcus in support of recent findings. Furthermore, these results revealed widespread functional heterogeneity, i.e., different levels of nitrogen assimilation within clades, likely reflecting niche partitioning by ecotypes. PMID:27994576

  4. Efficient protein production method for NMR using soluble protein tags with cold shock expression vector.

    PubMed

    Hayashi, Kokoro; Kojima, Chojiro

    2010-11-01

    The E. coli protein expression system is one of the most useful methods employed for NMR sample preparation. However, the production of some recombinant proteins in E. coli is often hampered by difficulties such as low expression level and low solubility. To address these problems, a modified cold-shock expression system containing a glutathione S-transferase (GST) tag, the pCold-GST system, was investigated. The pCold-GST system successfully expressed 9 out of 10 proteins that otherwise could not be expressed using a conventional E. coli expression system. Here, we applied the pCold-GST system to 84 proteins and 78 proteins were successfully expressed in the soluble fraction. Three other cold-shock expression systems containing a maltose binding protein tag (pCold-MBP), protein G B1 domain tag (pCold-GB1) or thioredoxin tag (pCold-Trx) were also developed to improve the yield. Additionally, we show that a C-terminal proline tag, which is invisible in ¹H-¹⁵N HSQC spectra, inhibits protein degradation and increases the final yield of unstable proteins. The purified proteins were amenable to NMR analyses. These data suggest that pCold expression systems combined with soluble protein tags can be utilized to improve the expression and purification of various proteins for NMR analysis.

  5. Velocity measurement of clay intrusion through a sudden contraction step using a tagging pulse sequence.

    PubMed

    Tsushima, Shohji; Hasegawa, Atsushi; Suekane, Tetsuya; Hirai, Shuichiro; Tanaka, Yoshihiro; Nakasuji, Yoshizumi

    2003-07-01

    Magnetic resonance imaging (MRI) with a spatial tagging sequence was used to measure the velocity distribution of clay that was forced past a sudden contraction. A spatial tagging sequence provided magnetic resonance images of clay that allowed measurement of the velocity distribution in the clay, which can provide insights into the deformation process of clay during the intrusion process. The experiments were conducted using a specially designed vessel that could operate at up to 30 MPa. The vessel has a rectangular test section with a sudden contraction step with a contraction ratio of 2:1. The vessel was installed in a commercial magnetic resonance imaging system, and the fluid motion of clay flowing into the narrow contracted channel was then quantitatively investigated to examine the behavior of flowing clay as a non-Newtonian fluid. MRI results are compared with those obtained by computational fluid dynamics (CFD) calculation. Velocity distributions obtained from each tag displacement did not agree well with those predicted by CFD results near the contraction step, where the fluid accelerated rapidly. However, post-processing of the calculation results, in which virtual tag displacement is calculated, gave better agreement with experiment and enabled us to compare MRI results with CFD results.

  6. Direct Quantitative Bisulfite Sequencing Using Tag-modified Primers and Internal Normalization.

    PubMed

    Dietrich, Dimo

    2016-12-01

    For the investigation of DNA methylation patterns, bisulfite conversion of the DNA followed by polymerase chain reaction (PCR) amplification and sequencing of the region of interest is the method of choice when information at single CpG site resolution is desired. In this study, a simple method for direct quantitative bisulfite sequencing based on the Sanger method is shown to be usable for the accurate analysis of single CpG sites. This method is based on the usage of tag-modified primers to obtain an internal normalization signal within the PCR product.

  7. Satellite-tagged transcribing sequences in Bubalus bubalis genome undergo programmed modulation in meiocytes: possible implications for transcriptional inactivation.

    PubMed

    Chattopadhyay, M; Gangadharan, S; Kapur, V; Azfer, M A; Prakash, B; Ali, S

    2001-09-01

    We cloned and sequenced a 1378 bp BamHI satellite DNA fraction from the water buffalo Bubalus bubalis and have studied its expression in different tissues. The GC-rich sequences of the resultant contig pDS5 crosshybridize only with bovid DNA and are not conserved evolutionarily. Typing of buffalo genomic DNA using pDS5 with several restriction enzymes revealed multilocus monomorphic bands. Similar typing of cattle, buffalo, goat, sheep, and gaur genomic DNA revealed variations in copy number and allele length giving rise to species-specific band patterns. Expression study of pDS5 in bubaline samples by RNA slot-blot, Northern blot, and RT-PCR showed various levels of signal in all the somatic tissues and germline cells except heart. A GenBank database search revealed homology of pDS5 sequences in the 5' region from nt 1-1261 with collagen gene. An AluI typing analysis of DNA from bubaline semen samples showed consistent loss of two bands. The presence of corresponding bands in somatic tissues suggests a sequence modulation within the pDS5 array in meiocytes during spermatogenesis, which is restored in the somatic cells after fertilization. Modulation of the satellite-tagged transcribing sequence in the meiocytes may be a mechanism of its inactivation.

  8. Maltose-Binding Protein (MBP), a Secretion-Enhancing Tag for Mammalian Protein Expression Systems.

    PubMed

    Reuten, Raphael; Nikodemus, Denise; Oliveira, Maria B; Patel, Trushar R; Brachvogel, Bent; Breloy, Isabelle; Stetefeld, Jörg; Koch, Manuel

    2016-01-01

    Recombinant proteins are commonly expressed in eukaryotic expression systems to ensure the formation of disulfide bridges and proper glycosylation. Although many proteins can be expressed easily, some proteins, sub-domains, and mutant protein versions can cause problems. Here, we investigated expression levels of recombinant extracellular, intracellular as well as transmembrane proteins tethered to different polypeptides in mammalian cell lines. Strikingly, fusion of proteins to the prokaryotic maltose-binding protein (MBP) generally enhanced protein production. MBP fusion proteins consistently exhibited the most robust increase in protein production in comparison to commonly used tags, e.g., the Fc, Glutathione S-transferase (GST), SlyD, and serum albumin (ser alb) tag. Moreover, proteins tethered to MBP revealed reduced numbers of dying cells upon transient transfection. In contrast to the Fc tag, MBP is a stable monomer and does not promote protein aggregation. Therefore, the MBP tag does not induce artificial dimerization of tethered proteins and provides a beneficial fusion tag for binding as well as cell adhesion studies. Using MBP we were able to secrete a disease-causing laminin β2 mutant protein (congenital nephrotic syndrome), which is normally retained in the endoplasmic reticulum. In summary, this study establishes MBP as a versatile expression tag for protein production in eukaryotic expression systems.

  9. Open-reading-frame sequence tags (OSTs) support the existence of at least 17,300 genes in C. elegans.

    PubMed

    Reboul, J; Vaglio, P; Tzellas, N; Thierry-Mieg, N; Moore, T; Jackson, C; Shin-i, T; Kohara, Y; Thierry-Mieg, D; Thierry-Mieg, J; Lee, H; Hitti, J; Doucette-Stamm, L; Hartley, J L; Temple, G F; Brasch, M A; Vandenhaute, J; Lamesch, P E; Hill, D E; Vidal, M

    2001-03-01

    The genome sequences of Caenorhabditis elegans, Drosophila melanogaster and Arabidopsis thaliana have been predicted to contain 19,000, 13,600 and 25,500 genes, respectively. Before this information can be fully used for evolutionary and functional studies, several issues need to be addressed. First, the gene number estimates obtained in silico and not yet supported by any experimental data need to be verified. For example, it seems biologically paradoxical that C. elegans would have 50% more genes than Drosophila. Second, intron/exon predictions need to be tested experimentally. Third, complete sets of open reading frames (ORFs), or "ORFeomes," need to be cloned into various expression vectors. To address these issues simultaneously, we have designed and applied to C. elegans the following strategy. Predicted ORFs are amplified by PCR from a highly representative cDNA library using ORF-specific primers, cloned by Gateway recombination cloning and then sequenced to generate ORF sequence tags (OSTs) as a way to verify identity and splicing. In a sample (n=1,222) of the nearly 10,000 genes predicted ab initio (that is, for which no expressed sequence tag (EST) is available so far), at least 70% were verified by OSTs. We also observed that 27% of these experimentally confirmed genes have a structure different from that predicted by GeneFinder. We now have experimental evidence that supports the existence of at least 17,300 genes in C. elegans. Hence we suggest that gene counts based primarily on ESTs may underestimate the number of genes in human and in other organisms.

  10. High Level Expression and Purification of Recombinant Proteins from Escherichia coli with AK-TAG

    PubMed Central

    Luo, Dan; Wen, Caixia; Zhao, Rongchuan; Liu, Xinyu; Liu, Xinxin; Cui, Jingjing; Liang, Joshua G.; Liang, Peng

    2016-01-01

    Adenylate kinase (AK) from Escherichia coli was used as both a solubility and an affinity tag for recombinant protein production. When fused to the N-terminus of a target protein, an AK fusion protein could be expressed in soluble form and purified to near homogeneity in a single step from Blue-Sepharose via affinity elution with a micromolar concentration of P1,P5-di(adenosine-5') pentaphosphate (Ap5A), a transition-state substrate analog of AK. Unlike with other affinity tags, the level of recombinant protein expression in soluble form and the yield of recovery during each purification step could be readily assessed by AK enzyme activity in near real time. Coupled to a His-Tag installed at the N-terminus and a thrombin cleavage site at the C-terminus of AK, the streamlined method, which we dub AK-TAG, could also allow convenient expression and retrieval of a cleaved recombinant protein in high yield and purity via dual affinity purification steps. Thus AK-TAG is a new addition to the arsenal of existing affinity tags for recombinant protein expression and purification, and is particularly useful where soluble expression and a high degree of purification are at stake. PMID:27214237

  11. Evidence from sequence-tagged-site markers of a recent progenitor-derivative species pair in conifers

    PubMed Central

    Perron, Martin; Perry, Daniel J.; Andalo, Christophe; Bousquet, Jean

    2000-01-01

    Black spruce (Picea mariana [B.S.P.] Mill.) and red spruce (Picea rubens Sarg.) are two conifer species known to hybridize naturally in northeastern North America. We hypothesized that there is a progenitor-derivative relationship between these two taxa and conducted a genetic investigation by using sequence-tagged-site markers of expressed genes. Based on the 26 sequence-tagged-site loci assayed in this study, the unbiased genetic identity between the two taxa was quite high with a value of 0.920. The mean number of polymorphic loci, the mean number of alleles per polymorphic locus, and the average observed heterozygosity were lower in red spruce (P = 35%, AP = 2.1, Ho = 0.069) than in black spruce (P = 54%, AP = 2.9, Ho = 0.103). No unique alleles were found in red spruce, and the observed patterns of allele distribution indicated that the genetic diversity of red spruce was essentially a subset of that found in black spruce. When considered in combination with ecological evidence and simulation results, these observations clearly support the existence of a progenitor-derivative relationship and suggest that the reduced level of genetic diversity in red spruce may result from allopatric speciation through glaciation-induced isolation of a preexisting black spruce population during the Pleistocene era. Our observations signal a need for a thorough reexamination of several conifer species complexes in which natural hybridization is known to occur. PMID:11016967
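
    The "unbiased genetic identity" reported here is a ratio of average homozygosity terms across loci. The following is a minimal sketch, assuming the estimator is Nei's (1978) unbiased identity, which the abstract does not name explicitly. The function name, the input layout (one list of allele frequencies per locus, alleles in the same order for both taxa), and the frequencies shown are hypothetical and are not data from the study.

```python
from math import sqrt


def nei_unbiased_identity(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased genetic identity between two populations.

    freqs_x, freqs_y: one list of allele frequencies per locus, with alleles
    in the same order in both populations.
    n_x, n_y: number of diploid individuals sampled in each population.
    """
    n_loci = len(freqs_x)
    # Sample-size corrected within-population homozygosity, averaged over loci.
    j_x = sum((2 * n_x * sum(p * p for p in locus) - 1) / (2 * n_x - 1)
              for locus in freqs_x) / n_loci
    j_y = sum((2 * n_y * sum(q * q for q in locus) - 1) / (2 * n_y - 1)
              for locus in freqs_y) / n_loci
    # Between-population term needs no sample-size correction.
    j_xy = sum(sum(p * q for p, q in zip(lx, ly))
               for lx, ly in zip(freqs_x, freqs_y)) / n_loci
    return j_xy / sqrt(j_x * j_y)


# Hypothetical two-locus example (illustrative values only).
pop_x = [[0.7, 0.3], [1.0, 0.0]]
pop_y = [[0.6, 0.4], [0.9, 0.1]]
identity = nei_unbiased_identity(pop_x, pop_y, n_x=30, n_y=30)
print(f"I = {identity:.3f}")
```

    Because of the sample-size correction, this estimate can slightly exceed 1 when two samples are nearly identical.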

  12. A physical map of the X chromosome of Drosophila melanogaster: Cosmid contigs and sequence tagged sites

    SciTech Connect

    Madueno, E.; Modolell, J.; Papagiannakis, G.

    1995-04-01

    A physical map of the euchromatic X chromosome of Drosophila melanogaster has been constructed by assembling contiguous arrays of cosmids that were selected by screening a library with DNA isolated from microamplified chromosomal divisions. This map, consisting of 893 cosmids, covers approximately 64% of the euchromatic part of the chromosome. In addition, 568 sequence tagged sites (STS), in aggregate representing 120 kb of sequenced DNA, were derived from selected cosmids. Most of these STSs, spaced at an average distance of approximately 35 kb along the euchromatic region of the chromosome, represent DNA tags that can be used as entry points to the fruitfly genome. Furthermore, 42 genes have been placed on the physical map, either through the hybridization of specific probes to the cosmids or through the fact that they were represented among the STSs. These provide a link between the physical and the genetic maps of D. melanogaster. Nine novel genes have been tentatively identified in Drosophila on the basis of matches between STS sequences and sequences from other species. 32 refs., 3 figs., 4 tabs.

  13. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Hixson, Kim K.; Purvine, Samuel O.; Anderson, Gordon A.; Smith, Richard D.

    2008-10-15

    De novo sequencing holds promise for discovering protein post-translational modifications; however, such approaches are still in their infancy and are not widely applied in proteomics practice because of their limited reliability. In this work, we describe a de novo sequencing approach for discovery of protein modifications through identification of the UStags (Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry for peptides and polypeptides in a yeast lysate, and the de novo sequences obtained were filtered to define a more limited set of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags’ prefix and suffix sequences and the UStags themselves) were used to infer the possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances of yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. Random matching of the de novo sequences to the predicted sequences was examined using two random (false) databases, and a false discovery rate of ~3% was estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity are described. The de novo-UStag approach complements the UStag method previously reported by enabling discovery of new protein modifications.
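
    The false discovery rate estimate from random (false) databases mentioned in this abstract can be illustrated as follows. This is a minimal sketch of a common decoy-style estimate, in which the average number of matches against the random databases is divided by the number of matches against the real database. The abstract does not give the authors' exact estimator, so this formula is an assumption, and the counts below are hypothetical.

```python
def estimate_fdr(target_matches, decoy_match_counts):
    """Estimate FDR as (mean matches to random databases) / (matches to real database)."""
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if not decoy_match_counts:
        raise ValueError("at least one random database is required")
    mean_decoy = sum(decoy_match_counts) / len(decoy_match_counts)
    return mean_decoy / target_matches


# Hypothetical counts: 1000 matches to the real database,
# 28 and 32 matches to two random databases (illustrative values only).
fdr = estimate_fdr(1000, [28, 32])
print(f"Estimated FDR = {fdr:.1%}")
```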

  14. Scratching the surface of the rare biosphere with ribosomal sequence tag primers.

    PubMed

    Neufeld, Josh D; Li, Jason; Mohn, William W

    2008-06-01

    Increasingly large datasets of 16S rRNA gene sequences reveal new information about the extent of microbial diversity and the surprising extent of the rare biosphere. Currently, many of the largest datasets are represented by short and variable ribosomal sequence tags (RSTs) that are limited in their ability to accurately assign sequences to broad-scale phylogenetic trees. In this study, we selected 30 rare RSTs from existing sequence datasets and designed primers to amplify c. 1400 bases of the 16S rRNA gene to determine whether these sequences were represented by existing databases or if they might reveal new lineages within the Bacteria. Approximately one-third of the RST primers successfully amplified longer portions of these low-abundance 16S rRNA genes in a specific manner. Subsequent phylogenetic analysis demonstrated that most of these sequences were (1) distantly related to existing cultivated microorganisms and (2) closely related to uncultivated clone sequences that were recently deposited in GenBank. The presence of so many recently collected 16S rRNA gene reference sequences in existing databases suggests that progress is being made quickly towards a microbial census, one which has begun scratching the surface of the 'rare biosphere'.

  15. Modified PCR methods for 3' end amplification from serial analysis of gene expression (SAGE) tags.

    PubMed

    Xu, Wang-Jie; Wang, Zhao-Xia; Qiao, Zhong-Dong

    2009-05-01

    Serial analysis of gene expression (SAGE) is a powerful technique to study gene expression at the genome level. However, the shortness of SAGE tags hinders further study of SAGE library data, limiting extensive application of the SAGE method in gene expression studies. This problem can be solved by extending the SAGE tags to 3' cDNAs. Therefore, several PCR-based methods have been developed to generate a longer 3' cDNA fragment corresponding to a SAGE tag. The list of modified methods is extensive, and includes rapid RT-PCR analysis of unknown SAGE tags (RAST-PCR), generation of longer cDNA fragments from SAGE tags for gene identification (GLGI), a high-throughput GLGI procedure, reverse SAGE (rSAGE), two-step analysis of unknown SAGE tags (TSAT-PCR), etc. These procedures are constantly being updated because they have characteristics and advantages that can be shared. Development of these methods has promoted the widespread use of the SAGE technique, and has accelerated the speed of studies of large-scale gene expression.

  16. Cloning, Expression, and Purification of Histidine-Tagged Escherichia coli Dihydrodipicolinate Reductase.

    PubMed

    Trigoso, Yvonne D; Evans, Russell C; Karsten, William E; Chooback, Lilian

    2016-01-01

    The enzyme dihydrodipicolinate reductase (DHDPR) is a component of the lysine biosynthetic pathway in bacteria and higher plants. DHDPR catalyzes the NAD(P)H-dependent reduction of 2,3-dihydrodipicolinate to the cyclic imine L-2,3,4,5-tetrahydropicolinic acid. The dapB gene that encodes dihydrodipicolinate reductase has previously been cloned, but the expression of the enzyme is low and the purification is time consuming. Therefore the E. coli dapB gene was cloned into the pET16b vector to improve the protein expression and simplify the purification. The dapB gene sequence was utilized to design forward and reverse oligonucleotide primers that were used to amplify the gene by PCR from Escherichia coli genomic DNA. The primers were designed with NdeI or BamHI restriction sites on the 5' and 3' termini, respectively. The PCR product was sequenced to confirm the identity of dapB. The gene was cloned into the expression vector pET16b through NdeI and BamHI restriction endonuclease sites. The resulting plasmid containing dapB was transformed into the bacterial strain BL21 (DE3). The transformed cells were utilized to grow and express the histidine-tagged reductase and the protein was purified using Ni-NTA affinity chromatography. SDS/PAGE gel analysis has shown that the protein was 95% pure and has an approximate subunit molecular weight of 28 kDa. The protein purification is completed in one day and 3 liters of culture produced approximately 40-50 mg of protein, an improvement on the previous protein expression and multistep purification.

  17. Construction of a dual-tag system for gene expression, protein affinity purification and fusion protein processing.

    PubMed

    Motejadded, Hassan; Altenbuchner, Josef

    2009-04-01

    An E. coli vector system was constructed which allows the expression of fusion genes via an L-rhamnose-inducible promoter. The corresponding fusion proteins consist of the maltose-binding protein and a His-tag sequence for affinity purification, the Saccharomyces cerevisiae Smt3 protein for protein processing by proteolytic cleavage and the protein of interest. The Smt3 gene was codon-optimized for expression in E. coli. In a second rhamnose-inducible vector, the S. cerevisiae Ulp1 protease gene for processing Smt3 fusion proteins was fused in the same way to maltose-binding protein and His-tag sequence but without the Smt3 gene. The enhanced green fluorescent protein (eGFP) was used as reporter and protein of interest. Both fusion proteins (MalE-6xHis-Smt3-eGFP and MalE-6xHis-Ulp1) were efficiently produced in E. coli and separately purified by amylose resin. After proteolytic cleavage the products were applied to a Ni-NTA column to remove protease and tags. Pure eGFP protein was obtained in the flow-through of the column in a yield of around 35% of the crude cell extract.

  18. Differential Gene Expression in the Siphonophore Nanomia bijuga (Cnidaria) Assessed with Multiple Next-Generation Sequencing Workflows

    PubMed Central

    Siebert, Stefan; Robinson, Mark D.; Tintori, Sophia C.; Goetz, Freya; Helm, Rebecca R.; Smith, Stephen A.; Shaner, Nathan; Haddock, Steven H. D.; Dunn, Casey W.

    2011-01-01

    We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through

  19. Differential gene expression in the siphonophore Nanomia bijuga (Cnidaria) assessed with multiple next-generation sequencing workflows.

    PubMed

    Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W

    2011-01-01

    We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through

  20. Chromatin modification contributes to the expression divergence of three TaGS2 homoeologs in hexaploid wheat

    PubMed Central

    Zhang, Wei; Fan, Xiaoli; Gao, Yingjie; Liu, Lei; Sun, Lijing; Su, Qiannan; Han, Jie; Zhang, Na; Cui, Fa; Ji, Jun; Tong, Yiping; Li, Junming

    2017-01-01

    Plastidic glutamine synthetase (GS2) is responsible for ammonium assimilation. The reason that TaGS2 homoeologs in hexaploid wheat experience different selection pressures in the breeding process remains unclear. TaGS2 were minimally expressed in roots but predominantly expressed in leaves, and TaGS2-B had higher expression than TaGS2-A and TaGS2-D. ChIP assays revealed that the activation of TaGS2-B expression in leaves was correlated with increased H3K4 trimethylation. The transcriptional silencing of TaGS2 in roots was correlated with greater cytosine methylation and less H3K4 trimethylation. Micrococcal nuclease and DNase I accessibility experiments indicated that the promoter region was more resistant to digestion in roots than leaves, which indicated that the closed nucleosome conformation of the promoter region was important to the transcription initiation for the spatial-temporal expression of TaGS2. In contrast, the transcribed regions possess different nuclease accessibilities of three TaGS2 homoeologs in the same tissue, suggesting that nucleosome conformation of the transcribed region was part of the fine adjustment of TaGS2 homoeologs. This study provides evidence that histone modification, DNA methylation and nuclease accessibility coordinated the control of the transcription of TaGS2 homoeologs. Our results provided important evidence that TaGS2-B experienced the strongest selection pressures during the breeding process. PMID:28300215

  1. Chromatin modification contributes to the expression divergence of three TaGS2 homoeologs in hexaploid wheat.

    PubMed

    Zhang, Wei; Fan, Xiaoli; Gao, Yingjie; Liu, Lei; Sun, Lijing; Su, Qiannan; Han, Jie; Zhang, Na; Cui, Fa; Ji, Jun; Tong, Yiping; Li, Junming

    2017-03-16

    Plastidic glutamine synthetase (GS2) is responsible for ammonium assimilation. The reason that TaGS2 homoeologs in hexaploid wheat experience different selection pressures in the breeding process remains unclear. TaGS2 were minimally expressed in roots but predominantly expressed in leaves, and TaGS2-B had higher expression than TaGS2-A and TaGS2-D. ChIP assays revealed that the activation of TaGS2-B expression in leaves was correlated with increased H3K4 trimethylation. The transcriptional silencing of TaGS2 in roots was correlated with greater cytosine methylation and less H3K4 trimethylation. Micrococcal nuclease and DNase I accessibility experiments indicated that the promoter region was more resistant to digestion in roots than leaves, which indicated that the closed nucleosome conformation of the promoter region was important to the transcription initiation for the spatial-temporal expression of TaGS2. In contrast, the transcribed regions possess different nuclease accessibilities of three TaGS2 homoeologs in the same tissue, suggesting that nucleosome conformation of the transcribed region was part of the fine adjustment of TaGS2 homoeologs. This study provides evidence that histone modification, DNA methylation and nuclease accessibility coordinated the control of the transcription of TaGS2 homoeologs. Our results provided important evidence that TaGS2-B experienced the strongest selection pressures during the breeding process.

  2. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    PubMed Central

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, where the technique allowed for the first time for visualizing transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  3. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGES

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; ...

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  4. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  5. Development of microsatellite markers in the tetraploid fern Ceratopteris thalictroides (Parkeriaceae) using RAD tag sequencing.

    PubMed

    Yang, X Y; Long, Z C; Gichira, A W; Guo, Y H; Wang, Q F; Chen, J M

    2016-02-19

    To understand the genetic variability of the tetraploid fern Ceratopteris thalictroides (Parkeriaceae), we described 30 polymorphic microsatellite markers obtained using the restriction site-associated DNA (RAD) tag sequencing technique. A total of 26 individuals were genotyped for each marker. The number of alleles per locus ranged from 4 to 10, and the expected heterozygosity and the Shannon-Wiener index ranged from 0.264 to 0.852 and 0.676 to 2.032, respectively. Because these 30 microsatellite markers exhibit high degrees of genetic variation, they will be useful tools for studying the adaptive genetic variation and sustainable conservation of C. thalictroides.
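
    The two diversity statistics quoted in this abstract can be computed from the allele frequencies at a locus. The sketch below is illustrative only: it assumes the standard definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and the Shannon-Wiener index with natural logarithms), and the function names and example frequencies are hypothetical rather than taken from the paper. Estimating allele frequencies in a tetraploid is a separate problem that is not addressed here.

```python
import math


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2) over allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_wiener(freqs):
    """Shannon-Wiener index: -sum(p_i * ln p_i), skipping absent alleles."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical locus with four equally frequent alleles.
example_freqs = [0.25, 0.25, 0.25, 0.25]
print(round(expected_heterozygosity(example_freqs), 3))  # 0.75
print(round(shannon_wiener(example_freqs), 3))           # 1.386
```

    For a locus with k equally frequent alleles the index equals ln k, so the reported upper value of 2.032 lies below ln 10 (about 2.303), as expected for at most 10 alleles per locus if natural logarithms were used.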

  6. An efficient tag derived from the common epitope of tospoviral NSs proteins for monitoring recombinant proteins expressed in both bacterial and plant systems.

    PubMed

    Cheng, Hao-Wen; Chen, Kuan-Chun; Raja, Joseph A J; Li, Jian-Xian; Yeh, Shyi-Dong

    2013-04-15

    NSscon (23 aa), a common epitope in the gene silencing suppressor NSs proteins of the members of the Watermelon silver mottle virus (WSMoV) serogroup, was previously identified. In this investigation, we expressed different green fluorescent protein (GFP)-fused deletions of NSscon in bacteria and reacted with NSscon monoclonal antibody (MAb). Our results indicated that the core 9 amino acids, "(109)KFTMHNQIF(117)", denoted as "nss", retain the reactivity of NSscon. In bacterial pET system, four different recombinant proteins labeled with nss, either at N- or C-extremes, were readily detectable without position effects, with sensitivity superior to that for the polyhistidine-tag. When the nss-tagged Zucchini yellow mosaic virus (ZYMV) helper component-protease (HC-Pro) and WSMoV nucleocapsid protein were transiently expressed by agroinfiltration in tobacco, they were readily detectable and the tag's possible efficacy for gene silencing suppression was not noticed. Co-immunoprecipitation of nss-tagged and non-tagged proteins expressed from bacteria confirmed the interaction of potyviral HC-Pro and coat protein. Thus, we conclude that this novel nss sequence is highly valuable for tagging recombinant proteins in both bacterial and plant expression systems.

  7. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  8. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods.

  9. Transcriptome analysis of the medulla tissue from cattle in response to bovine spongiform encephalopathy using digital gene expression tag profiling.

    PubMed

    Basu, Urmila; Almeida, Luciane; Olson, N Eric; Meng, Yan; Williams, John L; Moore, Stephen S; Guan, Le Luo

    2011-01-01

    Bovine spongiform encephalopathy (BSE) is a transmissible, fatal neurodegenerative disorder of cattle produced by prions. The use of excessive parallel sequencing for comparison of gene expression in bovine control and infected tissues may help to elucidate the molecular mechanisms associated with this disease. In this study, tag profiling Solexa sequencing was used for transcriptome analysis of bovine brain tissues. Replicate libraries were prepared from mRNA isolated from control and infected (challenged with 100 g of BSE-infected brain) medulla tissues 45 mo after infection. For each library, 5-6 million sequence reads were generated and approximately 67-70% of the reads were mapped against the Bovine Genome database to approximately 13,700-14,120 transcripts (each having at least one read). About 42-47% of the total reads mapped uniquely. Using the GeneSifter software package, 190 differentially expressed (DE) genes were identified (>2.0-fold change, p < .01): 73 upregulated and 117 downregulated. Seventy-nine DE genes had functions described in the Gene Ontology (GO) database and 16 DE genes were involved in 38 different pathways described in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Digital analysis expression by tag profiling may be a powerful approach to comprehensive transcriptome analysis to identify changes associated with disease progression, leading to a better understanding of the underlying mechanism of pathogenesis of BSE.
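
    The selection criteria stated in this abstract (>2.0-fold change, p < .01) can be written as a simple filter. This is a minimal sketch under stated assumptions: the abstract does not describe how GeneSifter computes fold change or p-values, so the function name, the infected/control ratio convention, and the symmetric treatment of down-regulation are all hypothetical.

```python
def is_differentially_expressed(ratio, p_value, min_fold=2.0, max_p=0.01):
    """Return True if a gene passes a fold-change and p-value threshold.

    ratio is assumed to be infected/control expression (must be > 0).
    Down-regulation is handled symmetrically: a ratio of 0.4 counts as a
    2.5-fold change.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    fold = ratio if ratio >= 1.0 else 1.0 / ratio
    return fold > min_fold and p_value < max_p


# Hypothetical values, not data from the study.
print(is_differentially_expressed(2.5, 0.001))  # True  (up-regulated)
print(is_differentially_expressed(0.4, 0.005))  # True  (down-regulated)
print(is_differentially_expressed(1.5, 0.001))  # False (fold change too small)
print(is_differentially_expressed(3.0, 0.05))   # False (p-value too large)

# The reported counts are internally consistent: 73 up + 117 down = 190.
print(73 + 117)  # 190
```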

  10. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    SciTech Connect

    Abraham, Paul E; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert (Bob) L

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphisms in Populus and potentially other species as well.

  11. Genomics of hybrid poplar (Populus trichocarpa × deltoides) interacting with forest tent caterpillars (Malacosoma disstria): normalized and full-length cDNA libraries, expressed sequence tags, and a cDNA microarray for the study of insect-induced defences in poplar.

    PubMed

    Ralph, Steven; Oddy, Claire; Cooper, Dawn; Yueh, Hesther; Jancsik, Sharon; Kolosova, Natalia; Philippe, Ryan N; Aeschliman, Dana; White, Rick; Huber, Dezene; Ritland, Carol E; Benoit, François; Rigby, Tracey; Nantel, André; Butterfield, Yaron S N; Kirkpatrick, Robert; Chun, Elizabeth; Liu, Jerry; Palmquist, Diana; Wynhoven, Brian; Stott, Jeffrey; Yang, George; Barber, Sarah; Holt, Robert A; Siddiqui, Asim; Jones, Steven J M; Marra, Marco A; Ellis, Brian E; Douglas, Carl J; Ritland, Kermit; Bohlmann, Jörg

    2006-04-01

    As part of a genomics strategy to characterize inducible defences against insect herbivory in poplar, we developed a comprehensive suite of functional genomics resources including cDNA libraries, expressed sequence tags (ESTs) and a cDNA microarray platform. These resources are designed to complement the existing poplar genome sequence and poplar (Populus spp.) ESTs by focusing on herbivore- and elicitor-treated tissues and incorporating normalization methods to capture rare transcripts. From a set of 15 standard, normalized or full-length cDNA libraries, we generated 139,007 3'- or 5'-end sequenced ESTs, representing more than one-third of the c. 385,000 publicly available Populus ESTs. Clustering and assembly of 107,519 3'-end ESTs resulted in 14,451 contigs and 20,560 singletons, altogether representing 35,011 putative unique transcripts, or potentially more than three-quarters of the predicted c. 45,000 genes in the poplar genome. Using this EST resource, we developed a cDNA microarray containing 15,496 unique genes, which was utilized to monitor gene expression in poplar leaves in response to herbivory by forest tent caterpillars (Malacosoma disstria). After 24 h of feeding, 1191 genes were classified as up-regulated, compared to only 537 down-regulated. Functional classification of this induced gene set revealed genes with roles in plant defence (e.g. endochitinases, Kunitz protease inhibitors), octadecanoid and ethylene signalling (e.g. lipoxygenase, allene oxide synthase, 1-aminocyclopropane-1-carboxylate oxidase), transport (e.g. ABC proteins, calreticulin), secondary metabolism [e.g. polyphenol oxidase, isoflavone reductase, (-)-germacrene D synthase] and transcriptional regulation [e.g. leucine-rich repeat transmembrane kinase, several transcription factor classes (zinc finger C3H type, AP2/EREBP, WRKY, bHLH)]. This study provides the first genome-scale approach to characterize insect-induced defences in a woody perennial providing a solid platform for

  12. Expression of a Translationally Fused TAP-Tagged Plasma Membrane Proton Pump in Arabidopsis thaliana

    PubMed Central

    2015-01-01

    The Arabidopsis thaliana plasma membrane proton ATPase genes, AHA1 and AHA2, are the two most highly expressed isoforms of an 11 gene family and are collectively essential for embryo development. We report the translational fusion of a tandem affinity-purification tag to the 5′ end of the AHA1 open reading frame in a genomic clone. Stable expression of TAP-tagged AHA1 in Arabidopsis rescues the embryonic lethal phenotype of endogenous double aha1/aha2 knockdowns. Western blots of SDS-PAGE and Blue Native gels show enrichment of AHA1 in plasma membrane fractions and indicate a hexameric quaternary structure. TAP-tagged AHA1 rescue lines exhibited reduced vertical root growth. Analysis of the plasma membrane and soluble proteomes identified several plasma membrane-localized proteins with altered abundance in TAP-tagged AHA1 rescue lines compared to wild type. Using affinity-purification mass spectrometry, we uniquely identified two additional AHA isoforms, AHA9 and AHA11, which copurified with TAP-tagged AHA1. In conclusion, we have generated transgenic Arabidopsis lines in which a TAP-tagged AHA1 transgene has complemented all essential endogenous AHA1 and AHA2 functions and have shown that these plants can be used to purify AHA1 protein and to identify in planta interacting proteins by mass spectrometry. PMID:24397334

  13. Optimized E. coli expression strain LOBSTR eliminates common contaminants from His-tag purification.

    PubMed

    Andersen, Kasper R; Leksa, Nina C; Schwartz, Thomas U

    2013-11-01

    His-tag affinity purification is one of the most commonly used methods to purify recombinant proteins expressed in E. coli. One drawback of using the His-tag is the co-purification of contaminating histidine-rich E. coli proteins. We engineered a new E. coli expression strain, LOBSTR (low background strain), which eliminates the most abundant contaminants. LOBSTR is derived from the E. coli BL21(DE3) strain and carries genomically modified copies of arnA and slyD, whose protein products exhibit reduced affinities to Ni and Co resins, resulting in a much higher purity of the target protein. The use of LOBSTR enables the pursuit of challenging low-expressing protein targets by reducing background contamination with no additional purification steps, materials, or costs, and thus pushes the limits of standard His-tag purifications.

  14. Accurate and unambiguous tag-to-gene mapping in serial analysis of gene expression

    PubMed Central

    Malig, Rodrigo; Varela, Cristian; Agosin, Eduardo; Melo, Francisco

    2006-01-01

    Background In this study, we present a robust and reliable computational method for tag-to-gene assignment in serial analysis of gene expression (SAGE). The method relies on current genome information and annotation, incorporation of several new features, and key improvements over alternative methods, all of which are important to determine gene expression levels more accurately. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. Results We applied this method to the Saccharomyces cerevisiae genome, producing the most thorough and accurate annotation of potential virtual SAGE tags that is available today for this organism. The usefulness of this method is exemplified by the significant reduction of ambiguous cases in existing experimental SAGE data. In addition, we report new insights from the analysis of existing SAGE data. First, we found that experimental SAGE tags mapping onto introns, intron-exon boundaries, and non-coding RNA elements are observed in all available SAGE data. Second, a significant fraction of experimental SAGE tags was found to map onto genomic regions currently annotated as intergenic. Third, a significant number of existing experimental SAGE tags for yeast has been derived from truncated cDNAs, which are synthesized through oligo-d(T) priming to internal poly-(A) regions during reverse transcription. Conclusion We conclude that an accurate and unambiguous tag mapping process is essential to increase the quality and the amount of information that can be extracted from SAGE experiments. This is supported by the results obtained here and also by the large impact that the erroneous interpretation of these data could have on downstream applications. PMID:17083742
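
    The idea of a "virtual" SAGE tag and of an ambiguous tag-to-gene match can be illustrated with a minimal sketch. It is not the authors' method: it assumes the classic SAGE setup, in which NlaIII (recognition site CATG) is the anchoring enzyme and a 10-nt tag is read immediately downstream of the 3'-most site, and the function names and toy transcripts are hypothetical.

```python
from collections import defaultdict

ANCHOR = "CATG"  # NlaIII recognition site (assumed anchoring enzyme)
TAG_LEN = 10     # assumed tag length (classic short SAGE)


def virtual_tag(transcript):
    """Return the tag that follows the 3'-most anchor site, or None."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None  # no anchor site: the transcript yields no tag
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    # Discard tags truncated by the end of the transcript.
    return tag if len(tag) == TAG_LEN else None


def tag_to_genes(transcripts):
    """Map each virtual tag to the set of genes that can produce it."""
    index = defaultdict(set)
    for gene, seq in transcripts.items():
        tag = virtual_tag(seq.upper())
        if tag is not None:
            index[tag].add(gene)
    return dict(index)


# Hypothetical toy transcripts, written 5' to 3'.
toy = {
    "geneA": "AAACATGTTTTTCATGACGTACGTACGGG",
    "geneB": "CCCCATGACGTACGTACTTT",
    "geneC": "GGGGGGGG",
}
index = tag_to_genes(toy)
# Tags shared by more than one gene are the ambiguous cases.
ambiguous = {tag: genes for tag, genes in index.items() if len(genes) > 1}
```

    In this toy set, geneA and geneB share one tag and geneC produces none. A confidence ranking of the kind the abstract describes would need additional evidence that this sketch does not model.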

  15. Species-diagnostic single-nucleotide polymorphism and sequence-tagged site markers for the parasitic wasp genus Nasonia (Hymenoptera: Pteromalidae).

    PubMed

    Niehuis, O; Judson, A K; Werren, J H; Hunter, W B; Dang, P M; Dowd, S E; Grillenberger, B; Beukeboom, L W; Gadau, J

    2007-08-01

    Wasps of the genus Nasonia are important biological control agents of house flies and related filth flies, which are major vectors of human pathogens. Species of Nasonia (Hymenoptera: Pteromalidae) are not easily differentiated from one another by morphological characters, and molecular markers for their reliable identification have been missing so far. Here, we report eight single-nucleotide polymorphism and three sequence-tagged site markers derived from expressed sequence tag libraries for the two closely related and regionally sympatric species N. giraulti and N. vitripennis. We studied variation of these markers in natural populations of the two species, and we mapped them in the Nasonia genome. The markers are species-diagnostic and evenly spread over all five chromosomes. They are ideal for rapid species identification and hybrid recognition, and they can be used to map economically relevant quantitative trait loci in the Nasonia genome.

  16. Recombinant enterokinase light chain with affinity tag: expression from Saccharomyces cerevisiae and its utilities in fusion protein technology.

    PubMed

    Choi, S I; Song, H W; Moon, J W; Seong, B L

    2001-12-20

    Enterokinase and recombinant enterokinase light chain (rEK(L)) have been used widely to cleave fusion proteins with the target sequence of (Asp)(4)-Lys. In this work, we show that their utility as a site-specific cleavage agent is compromised by sporadic cleavage at other sites, albeit at low levels. Further degradation of the fusion protein in cleavage reaction is due to an intrinsic broad specificity of the enzyme rather than to the presence of contaminating proteases. To offer facilitated purification from fermentation broth and efficient removal of rEK(L) after cleavage reaction, thus minimizing unwanted cleavage of target protein, histidine affinity tag was introduced into rEK(L). Utilizing the secretion enhancer peptide derived from the human interleukin 1 beta, the recombinant EK(L) was expressed in Saccharomyces cerevisiae and efficiently secreted into culture medium. The C-terminal His-tagged EK(L) was purified in a single-step procedure on nickel affinity chromatography. It retained full enzymatic activity similar to that of EK(L), whereas the N-terminal His-tagged EK(L) was neither efficiently purified nor had any enzymatic activity. After cleavage reaction of fusion protein, the C-terminal His-tagged EK(L) was efficiently removed from the reaction mixture by a single passage through nickel-NTA spin column. The simple affinity tag renders rEK(L) extremely useful for purification, post-cleavage removal, recovery, and recycling and will broaden the utility and the versatility of the enterokinase for the production of recombinant proteins.

  17. Uncovering the Salt Response of Soybean by Unraveling Its Wild and Cultivated Functional Genomes Using Tag Sequencing

    PubMed Central

    Ali, Zulfiqar; Zhang, Da Yong; Xu, Zhao Long; Xu, Ling; Yi, Jin Xin; He, Xiao Lan; Huang, Yi Hong; Liu, Xiao Qing; Khan, Asif Ali; Trethowan, Richard M.; Ma, Hong Xiang

    2012-01-01

    Soil salinity has very adverse effects on growth and yield of crop plants. Several salt tolerant wild accessions and cultivars are reported in soybean. Functional genomes of salt tolerant Glycine soja and a salt sensitive genotype of Glycine max were investigated to understand the mechanism of salt tolerance in soybean. For this purpose, four libraries were constructed for Tag sequencing on Illumina platform. We identified around 490 salt responsive genes which included a number of transcription factors, signaling proteins, translation factors and structural genes like transporters, multidrug resistance proteins, antiporters, chaperones, aquaporins etc. The gene expression levels and ratio of up/down-regulated genes were greater in tolerant plants. Translation related genes remained stable or showed slightly higher expression in tolerant plants under salinity stress. Further analyses of sequenced data and the annotations for gene ontology and pathways indicated that soybean adapts to salt stress through ABA biosynthesis and regulation of translation and signal transduction of structural genes. Manipulation of these pathways may mitigate the effect of salt stress thus enhancing salt tolerance. PMID:23209559

  18. Application of HaloTag Technology to Expression and Purification of Cannabinoid Receptor CB2

    PubMed Central

    Locatelli-Hoops, Silvia; Sheen, Fangmin C.; Zoubak, Lioudmila; Gawrisch, Klaus; Yeliseev, Alexei A.

    2013-01-01

    Expression of milligram quantities of functional, stable G protein-coupled receptors (GPCR) for high-resolution structural studies remains a challenging task. The goal of this work was to evaluate the usefulness of the HaloTag system (Promega) for expression and purification of the human cannabinoid receptor CB2, an important target for development of drugs for treatment of immune disorders, inflammation, and pain. Here we investigated expression in Escherichia coli cells of the integral membrane receptor CB2 as a fusion with the 34 kDa HaloTag at N- or C-terminal location, either in the presence or in the absence of the N-terminal maltose-binding protein (MBP). The CB2 was flanked at both ends by the tobacco etch virus (TEV) protease cleavage sites to allow for subsequent removal of expression partners. Expression by induction with either IPTG (in E. coli BL21(DE3) cell cultures) or by auto-induction (in E. coli KRX cells) were compared. While the N-terminal location of the HaloTag resulted in high levels of expression of the fusion CB2, the recombinant receptor was not functional. However, when the HaloTag was placed in the C-terminal location, a fully active receptor was produced irrespective of induction method or bacterial strain used. For purification, the fusion protein was captured onto HaloLink resin in the presence of detergents. Treatment with specific TEV protease released the CB2 upon washing. To our knowledge, this study represents the first example of expression, surface immobilization and purification of a functional GPCR using HaloTag technology. PMID:23470778

  19. Chromatographic purification of an insoluble histidine tag recombinant Ykt6p SNARE from Arabidopsis thaliana over-expressed in E. coli.

    PubMed

    Vincent, Patrick; Dieryck, Wilfrid; Maneta-Peyret, Lilly; Moreau, Patrick; Cassagne, Claude; Santarelli, Xavier

    2004-08-25

    To study the endoplasmic reticulum (ER)-Golgi apparatus (GA) protein and/or lipid vesicular transport pathway in plant cells, an expressed sequence tag (EST) coding for a homologue of the yeast soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) Ykt6p was cloned from Arabidopsis thaliana by reverse transcription polymerase chain reaction (RT-PCR). The corresponding protein was over-expressed as a recombinant histidine-tag (his-tag) protein in E. coli. Starting from one liter of culture, an ultrasonic homogenization was performed for cell disruption and after centrifugation the Arabidopsis Ykt6p SNARE present in inclusion bodies in the pellet was solubilized. After centrifugation, the clarified feedstock obtained was injected onto an immobilized metal affinity chromatography (IMAC) column in the presence of 6 M guanidine and on-column refolding was performed. Folded and subsequently purified (94% purity) recombinant protein was obtained with 82% recovery.

  20. Serial number tagging reveals a prominent sequence preference of retrotransposon integration.

    PubMed

    Chatterjee, Atreyi Ghatak; Esnault, Caroline; Guo, Yabin; Hung, Stevephen; McQueen, Philip G; Levin, Henry L

    2014-07-01

    Transposable elements (TE) have both negative and positive impact on the biology of their host. As a result, a balance is struck between the host and the TE that relies on directing integration to specific genome territories. The extraordinary capacity of DNA sequencing can create ultra dense maps of integration that are being used to study the mechanisms that position integration. Unfortunately, the great increase in the numbers of insertion sites detected comes with the cost of not knowing which positions are rare targets and which sustain high numbers of insertions. To address this problem we developed the serial number system, a TE tagging method that measures the frequency of integration at single nucleotide positions. We sequenced 1 million insertions of retrotransposon Tf1 in the genome of Schizosaccharomyces pombe and obtained the first profile of integration with frequencies for each individual position. Integration levels at individual nucleotides varied over two orders of magnitude and revealed that sequence recognition plays a key role in positioning integration. The serial number system is a general method that can be applied to determine precise integration maps for retroviruses and gene therapy vectors.
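
    The counting step behind such a tagging scheme can be sketched as follows. This is a minimal illustration, not the published pipeline: it assumes each sequencing read has already been reduced to a (position, serial number) pair, and it treats repeated copies of the same pair as amplification duplicates of a single integration event. The 8-nt serial numbers and coordinates are hypothetical.

```python
from collections import defaultdict


def independent_insertions(reads):
    """Count distinct serial numbers seen at each insertion position.

    reads: iterable of (position, serial) pairs. Reads that repeat the
    same pair are assumed to be copies of one integration event.
    """
    serials = defaultdict(set)
    for position, serial in reads:
        serials[position].add(serial)
    return {position: len(seen) for position, seen in serials.items()}


# Hypothetical reads: (chromosome, coordinate, strand) plus serial number.
reads = [
    (("chr1", 1050, "+"), "ACGTTGCA"),
    (("chr1", 1050, "+"), "ACGTTGCA"),  # duplicate of the read above
    (("chr1", 1050, "+"), "TTGACCAG"),  # second independent event
    (("chr2", 77, "-"), "GGATCCAA"),
]
counts = independent_insertions(reads)
```

    Without the serial numbers, the three reads at the first position could not be separated into two independent events and one duplicate.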

  1. Large-scale clustering of CAGE tag expression data

    PubMed Central

    Shimokawa, Kazuro; Okamura-Oho, Yuko; Kurita, Takio; Frith, Martin C; Kawai, Jun; Carninci, Piero; Hayashizaki, Yoshihide

    2007-01-01

    Background Recent analyses have suggested that many genes possess multiple transcription start sites (TSSs) that are differentially utilized in different tissues and cell lines. We have identified a huge number of TSSs mapped onto the mouse genome using the cap analysis of gene expression (CAGE) method. The standard hierarchical clustering algorithm, which gives us easily understandable graphical tree images, has difficulties in processing such huge amounts of TSS data and a better method to calculate and display the results is needed. Results We use a combination of hierarchical and non-hierarchical clustering to cluster expression profiles of TSSs based on a large amount of CAGE data to profit from the best of both methods. We processed the genome-wide expression data, including 159,075 TSSs derived from 127 RNA samples of various organs of mouse, and succeeded in categorizing them into 70–100 clusters. The clusters exhibited intriguing biological features: a cluster supergroup with a ubiquitous expression profile, tissue-specific patterns, a distinct distribution of non-coding RNA and functional TSS groups. Conclusion Our approach succeeded in greatly reducing the calculation cost, and is an appropriate solution for analyzing large-scale TSS usage data. PMID:17517134

  2. WEBSAGE: a web tool for visual analysis of differentially expressed human SAGE tags.

    PubMed

    Pylouster, Jean; Sénamaud-Beaufort, Catherine; Saison-Behmoaras, Tula Ester

    2005-07-01

    The serial analysis of gene expression (SAGE) is a powerful method to compare gene expression of mRNA populations. To provide quantitative expression levels on a genome-wide scale, the Cancer Genome Anatomy Project (CGAP) uses SAGE. Over 7 million SAGE tags, from 171 human cell types have been assembled. The growing number of laboratories involved in SAGE research necessitates the use of software that provides statistical analysis of raw data, allowing the rapid visualization and interpretation of results. We have created the first simple tool that performs statistical analysis on SAGE data, identifies the tags differentially expressed and shows the results in a scatter plot. It is freely available and accessible at http://bioserv.rpbs.jussieu.fr/websage/index.php.

  3. Expression and purification of recombinant proteins in Escherichia coli tagged with a small metal-binding protein from Nitrosomonas europaea.

    PubMed

    Vargas-Cortez, Teresa; Morones-Ramirez, Jose Ruben; Balderas-Renteria, Isaias; Zarate, Xristo

    2016-02-01

    Escherichia coli is still the preferred organism for large-scale production of recombinant proteins. The use of fusion proteins has helped considerably in enhancing the solubility of heterologous proteins and their purification with affinity chromatography. Here, the use of a small metal-binding protein (SmbP) from Nitrosomonas europaea is described as a new fusion protein for protein expression and purification in E. coli. Fluorescent proteins tagged at the N-terminal with SmbP showed high levels of solubility, compared with those of maltose-binding protein and glutathione S-transferase, and low formation of inclusion bodies. Using commercially available IMAC resins charged with Ni(II), highly pure recombinant proteins were obtained after just one chromatography step. Proteins may be purified from the periplasm of E. coli if SmbP contains the signal sequence at the N-terminal. After removal of the SmbP tag from the protein of interest, high-yields are obtained since SmbP is a protein of just 9.9 kDa. The results here obtained suggest that SmbP is a good alternative as a fusion protein/affinity tag for the production of soluble recombinant proteins in E. coli.

  4. Human expressed tagged sites on the X chromosome: A mapping resource for heritable sex-linked chorioretinal disorders

    SciTech Connect

    MacDonald, I.M.; Nesslinger, N.; Wong, P.

    1994-09-01

    We have isolated a bank of human X-specific genomic clones which harbor chorioretinal expressed sequences using library to library cross-screening. The steps included (1) the creation of a λgt-10 library of human chorioretinal cDNA, (2) the creation of a human X-specific EMBL-3 genomic library from a somatic cell hybrid (82082a) containing the X chromosome as its only human component and lacking the hamster X, and (3) a PCR-based cross-screen to identify 78 clones expressed in choroid and retina. The characterization of one human X-specific EMBL-3 clone (XEH.8; DXS542) has provided a clear illustration of the feasibility of this approach. FISH mapping confirms the regional localization of XEH.8 to Xp11. Localization of additional clones, XEH.1, XEH.34, XEH.41, and XEH.52 will be presented along with partial sequencing and characterization. Our approach has focused on the search for expressed sequences which can serve as expressed tagged sites (ESTs) in mapping or candidate genes for heritable eye disorders.

  5. Gene Catchr—Gene Cloning And Tagging for Caenorhabditis elegans using yeast Homologous Recombination: a novel approach for the analysis of gene expression

    PubMed Central

    Sassi, Holly E.; Renihan, Stephanie; Spence, Andrew M.; Cooperstock, Ramona L.

    2005-01-01

    Expression patterns of gene products provide important insights into gene function. Reporter constructs are frequently used to analyze gene expression in Caenorhabditis elegans, but the sequence context of a given gene is inevitably altered in such constructs. As a result, these transgenes may lack regulatory elements required for proper gene expression. We developed Gene Catchr, a novel method of generating reporter constructs that exploits yeast homologous recombination (YHR) to subclone and tag worm genes while preserving their local sequence context. YHR facilitates the cloning of large genomic regions, allowing the isolation of regulatory sequences in promoters, introns, untranslated regions and flanking DNA. The endogenous regulatory context of a given gene is thus preserved, producing expression patterns that are as accurate as possible. Gene Catchr is flexible: any tag can be inserted at any position without introducing extra sequence. Each step is simple and can be adapted to process multiple genes in parallel. We show that expression patterns derived from Gene Catchr transgenes are consistent with previous reports and also describe novel expression data. Mutant rescue assays demonstrate that Gene Catchr-generated transgenes are functional. Our results validate the use of Gene Catchr as a valuable tool to study spatiotemporal gene expression. PMID:16254074

  6. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags

    PubMed Central

    Etter, Paul D.; Stiffler, Nicholas; Johnson, Eric A.; Cresko, William A.

    2010-01-01

    Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP–based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural

  7. The Use of Affinity Tags to Overcome Obstacles in Recombinant Protein Expression and Purification.

    PubMed

    Amarasinghe, Chinthaka; Jin, Jian-Ping

    2015-01-01

    Research and industrial demands for recombinant proteins continue to increase over time for their broad applications in structural and functional studies and as therapeutic agents. These applications often require large quantities of recombinant protein at desirable purity, which highlights the importance of developing and improving production approaches that provide high level expression and readily achievable purity of recombinant protein. E. coli is the most widely used host for the expression of a diverse range of proteins at low cost. However, there are common pitfalls that can severely limit the expression of exogenous proteins, such as stability, low solubility and toxicity to the host cell. To overcome these obstacles, one strategy that has been found to be promising is the use of affinity tags or carrier peptides to aid in the folding of the target protein, increase solubility, lower toxicity and increase the level of expression. In the meantime, the tags and fusion proteins can be designed to facilitate affinity purification. Since the fusion protein may not exhibit the native conformation of the target protein, various strategies have been developed to remove the tag during or after purification to avoid potential complications in structural and functional studies and to obtain native biological activities. Despite extensive research and rapid development along these lines, there are unsolved problems and imperfect applications. This focused review compares and contrasts various strategies that employ affinity tags to improve bacterial expression and to facilitate purification of recombinant proteins. The pros and cons of the approaches are discussed for more effective applications and new directions of future improvement.

  8. p53 elevation in human cells halt SV40 infection by inhibiting T-ag expression

    PubMed Central

    Drayman, Nir; Ben-nun-Shaul, Orly; Butin-Israeli, Veronika; Srivastava, Rohit; Rubinstein, Ariel M.; Mock, Caroline S.; Elyada, Ela; Ben-Neriah, Yinon; Lahav, Galit; Oppenheim, Ariella

    2016-01-01

    SV40 large T-antigen (T-ag) has been known for decades to inactivate the tumor suppressor p53 by sequestration and additional mechanisms. Our present study revealed that the struggle between p53 and T-ag begins very early in the infection cycle. We found that p53 is activated early after SV40 infection and defends the host against the infection. Using live cell imaging and single cell analyses we found that p53 dynamics are variable among individual cells, with only a subset of cells activating p53 immediately after SV40 infection. This cell-to-cell variability had clear consequences on the outcome of the infection. None of the cells with elevated p53 at the beginning of the infection proceeded to express T-ag, suggesting a p53-dependent decision between abortive and productive infection. In addition, we show that artificial elevation of p53 levels prior to the infection reduces infection efficiency, supporting a role for p53 in defending against SV40. We further found that the p53-mediated host defense mechanism against SV40 is not facilitated by apoptosis nor via interferon-stimulated genes. Instead p53 binds to the viral DNA at the T-ag promoter region, prevents its transcriptional activation by Sp1, and halts the progress of the infection. These findings shed new light on the long studied struggle between SV40 T-ag and p53, as developed during virus-host coevolution. Our studies indicate that the fate of SV40 infection is determined as soon as the viral DNA enters the nucleus, before the onset of viral gene expression. PMID:27462916

  9. Transposon Tc1-derived, sequence-tagged sites in Caenorhabditis elegans as markers for gene mapping

    PubMed Central

    Korswagen, Hendrik C.; Durbin, Richard M.; Smits, Miriam T.; Plasterk, Ronald H. A.

    1996-01-01

    We present an approach to map large numbers of Tc1 transposon insertions in the genome of Caenorhabditis elegans. Strains have been described that contain up to 500 polymorphic Tc1 insertions. From these we have cloned and shotgun sequenced over 2000 Tc1 flanks, resulting in an estimated set of 400 or more distinct Tc1 insertion alleles. Alignment of these sequences revealed a weak Tc1 insertion site consensus sequence that was symmetric around the invariant TA target site and reads CAYATATRTG. The Tc1 flanking sequences were compared with 40 Mbp of a C. elegans genome sequence. We found 151 insertions within the sequenced area, a density of ≈1 Tc1 insertion in every 265 kb. As the rest of the C. elegans genome sequence is obtained, remaining Tc1 alleles will fall into place. These mapped Tc1 insertions can serve two functions: (i) insertions in or near genes can be used to isolate deletion derivatives that have that gene mutated; and (ii) they represent a dense collection of polymorphic sequence-tagged sites. We demonstrate a strategy to use these Tc1 sequence-tagged sites in fine-mapping mutations. PMID:8962114
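
    Two figures in this abstract can be checked directly: the symmetry of the consensus CAYATATRTG and the quoted insertion density. The sketch below is only a worked check; it uses the standard IUPAC codes R (A or G) and Y (C or T), and the helper names are hypothetical.

```python
import re

# IUPAC nucleotide codes used by the consensus, as regex fragments.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}
# Complements, including the ambiguity codes (R pairs with Y).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R"}

CONSENSUS = "CAYATATRTG"  # weak Tc1 insertion consensus from the abstract


def reverse_complement(seq):
    """Reverse-complement a sequence written with the codes above."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def consensus_regex(consensus):
    """Compile an IUPAC consensus into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))


# "Symmetric around the TA target site": the consensus equals its own
# reverse complement.
is_symmetric = reverse_complement(CONSENSUS) == CONSENSUS

pattern = consensus_regex(CONSENSUS)

# 151 insertions within 40 Mbp of sequence, expressed in kb per insertion.
kb_per_insertion = 40_000_000 / 151 / 1000
```

    The consensus is its own reverse complement, and the density works out to about 265 kb per insertion, matching the abstract.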

  10. Preparation of next-generation sequencing libraries using Nextera™ technology: simultaneous DNA fragmentation and adaptor tagging by in vitro transposition.

    PubMed

    Caruccio, Nicholas

    2011-01-01

    DNA library preparation is a common entry point and bottleneck for next-generation sequencing. Current methods generally consist of distinct steps that often involve significant sample loss and hands-on time: DNA fragmentation, end-polishing, and adaptor-ligation. In vitro transposition with Nextera™ Transposomes simultaneously fragments and covalently tags the target DNA, thereby combining these three distinct steps into a single reaction. Platform-specific sequencing adaptors can be added, and the sample can be enriched and bar-coded using limited-cycle PCR to prepare di-tagged DNA fragment libraries. Nextera technology offers a streamlined, efficient, and high-throughput method for generating bar-coded libraries compatible with multiple next-generation sequencing platforms.

  11. Fungal His-Tagged Nitrilase from Gibberella intermedia: Gene Cloning, Heterologous Expression and Biochemical Properties

    PubMed Central

    Gong, Jin-Song; Li, Heng; Zhu, Xiao-Yan; Lu, Zhen-Ming; Wu, Yan; Shi, Jing-Song; Xu, Zheng-Hong

    2012-01-01

    Background Nitrilase is an important member of the nitrilase superfamily. It has attracted substantial interest from academia and industry for its function of converting nitriles directly into the corresponding carboxylic acids in recent years. Thus nitrilase has played a crucial role in production of commercial carboxylic acids in chemical industry and detoxification of nitrile-contaminated wastes. However, conventional studies mainly focused on the bacterial nitrilase and the potential of fungal nitrilase has been far from being fully explored. Research on fungal nitrilase gene expression will advance our understanding of the biological function of fungal nitrilase in nitrile hydrolysis. Methodology/Principal Findings A fungal nitrilase gene from Gibberella intermedia was cloned through reverse transcription-PCR. The open reading frame consisted of 963 bp and potentially encoded a protein of 320 amino acid residues with a theoretical molecular mass of 35.94 kDa. Furthermore, the catalytic triad (Glu-45, Lys-127, and Cys-162) was proposed and confirmed by site-directed mutagenesis. The encoding gene was expressed in Escherichia coli Rosetta-gami (DE3) and the recombinant protein with His6-tag was purified to electrophoretic homogeneity. The purified enzyme exhibited optimal activity at 45°C and pH 7.8. This nitrilase was specific towards aliphatic and aromatic nitriles. The kinetic parameters Vmax and Km for 3-cyanopyridine were determined to be 0.81 µmol/min·mg and 12.11 mM through Hanes-Woolf plot, respectively. 3-Cyanopyridine (100 mM) could be thoroughly hydrolyzed into nicotinic acid within 10 min using the recombinant strain with the release of about 3% nicotinamide and no substrate was detected. Conclusions/Significance In the present study, a fungal nitrilase was cloned from the cDNA sequence of G. intermedia and successfully expressed in E. coli Rosetta-gami (DE3). The recombinant strain displayed good 3-cyanopyridine degradation efficiency and wide
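
    For readers unfamiliar with the Hanes-Woolf plot mentioned above, it linearizes the Michaelis-Menten equation as [S]/v = [S]/Vmax + Km/Vmax, so a straight-line fit of [S]/v against [S] has slope 1/Vmax and intercept Km/Vmax. The sketch below shows the fit on synthetic, noise-free rates generated from the reported parameters. The substrate concentrations are hypothetical and the data are not the authors' measurements.

```python
def michaelis_menten(s, vmax, km):
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rate):
    """Estimate (Vmax, Km) by least squares on the Hanes-Woolf transform.

    Fits [S]/v = slope * [S] + intercept, where slope = 1/Vmax and
    intercept = Km/Vmax.
    """
    x = list(substrate)
    y = [s / v for s, v in zip(substrate, rate)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Hypothetical substrate concentrations (mM); the rates are synthetic,
# computed from the Vmax and Km values reported in the abstract.
concentrations = [2.0, 5.0, 10.0, 20.0, 50.0, 100.0]
rates = [michaelis_menten(s, 0.81, 12.11) for s in concentrations]

vmax_fit, km_fit = hanes_woolf_fit(concentrations, rates)
```

    Because the synthetic rates carry no noise, the fit recovers Vmax and Km up to floating-point error; real measurements would scatter around the fitted line.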

  12. Broad host range vectors for expression of proteins with (Twin-) Strep-tag, His-tag and engineered, export optimized yellow fluorescent protein

    PubMed Central

    2013-01-01

    Background In current protein research, a limitation still is the production of active recombinant proteins or native protein associations to assess their function. Especially the localization and analysis of protein-complexes or the identification of modifications and small molecule interaction partners by co-purification experiments requires a controllable expression of affinity- and/or fluorescence tagged variants of a protein of interest in its native cellular background. Advantages of periplasmic and/or homologous expressions can frequently not be realized due to a lack of suitable tools. Instead, experiments are often limited to the heterologous production in one of the few well established expression strains. Results Here, we introduce a series of new RK2 based broad host range expression plasmids for inducible production of affinity- and fluorescence tagged proteins in the cytoplasm and periplasm of a wide range of Gram negative hosts which are designed to match the recently suggested modular Standard European Vector Architecture and database. The vectors are equipped with a yellow fluorescent protein variant which is engineered to fold and brightly fluoresce in the bacterial periplasm following Sec-mediated export, as shown from fractionation and imaging studies. Expression of Strep-tag®II and Twin-Strep-tag® fusion proteins in Pseudomonas putida KT2440 is demonstrated for various ORFs. Conclusion The broad host range constructs we have produced enable good and controlled expression of affinity tagged protein variants for single-step purification and qualify for complex co-purification experiments. Periplasmic export variants enable production of affinity tagged proteins and generation of fusion proteins with a novel engineered Aequorea-based yellow fluorescent reporter protein variant with activity in the periplasm of the tested Gram-negative model bacteria Pseudomonas putida KT2440 and Escherichia coli K12 for production, localization or co

  13. Two versatile eukaryotic vectors permitting epitope tagging, radiolabelling and nuclear localisation of expressed proteins.

    PubMed

    Georgiev, O; Bourquin, J P; Gstaiger, M; Knoepfel, L; Schaffner, W; Hovens, C

    1996-02-12

    Two versatile eukaryotic expression vectors have been developed which permit the production of an epitope-tagged cDNA insert by transient transfection in mammalian cells or by in vitro transcription-translation. The first vector, pCATCH, can be used to clone cDNA inserts in three different frames via eight unique restriction sites in a multiple cloning site (MCS) located downstream from both the FLAG epitope and the specific heart muscle kinase phosphorylation site, conferring the possibility of in vitro radiolabelling. A specific protease cleavage site enables the removal of the FLAG epitope, simplifying affinity purification of recombinant CATCH proteins. pCATCH possesses stop codons in all three reading frames at the 3' terminal end of the MCS. A derivative of this vector, pCATCH-NLS, was constructed by incorporating an SV40 nuclear localisation signal upstream from the MCS, for directed localisation of the tagged proteins.

  14. DeNovoID: a web-based tool for identifying peptides from sequence and mass tags deduced from de novo peptide sequencing by mass spectroscopy.

    PubMed

    Halligan, Brian D; Ruotti, Victor; Twigger, Simon N; Greene, Andrew S

    2005-07-01

    One of the core activities of high-throughput proteomics is the identification of peptides from mass spectra. Some peptides can be identified using spectral matching programs like Sequest or Mascot, but many spectra do not produce high quality database matches. De novo peptide sequencing is an approach to determine partial peptide sequences for some of the unidentified spectra. A drawback of de novo peptide sequencing is that it produces a series of ordered and disordered sequence tags and mass tags rather than a complete, non-degenerate peptide amino acid sequence. This incomplete data is difficult to use in conventional search programs such as BLAST or FASTA. DeNovoID is a program that has been specifically designed to use degenerate amino acid sequence and mass data derived from MS experiments to search a peptide database. Since the algorithm employed depends on the amino acid composition of the peptide and not its sequence, DeNovoID does not have to consider all possible sequences, but rather a smaller number of compositions consistent with a spectrum. DeNovoID also uses a geometric indexing scheme that reduces the number of calculations required to determine the best peptide match in the database. DeNovoID is available at http://proteomics.mcw.edu/denovoid.
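
    The composition-based lookup described above can be illustrated with a minimal sketch. This is not the DeNovoID algorithm itself (its geometric indexing scheme and mass-tag handling are not reproduced here); it only shows why searching by amino acid composition avoids enumerating every possible ordering of the residues. The function names and the toy peptide database are hypothetical.

```python
from collections import Counter

def composition_key(peptide):
    """Order-independent key: sorted (residue, count) pairs."""
    return tuple(sorted(Counter(peptide).items()))

def build_index(peptides):
    """Map each composition key to the database peptides that share it."""
    index = {}
    for pep in peptides:
        index.setdefault(composition_key(pep), []).append(pep)
    return index

def lookup(index, residues):
    """Return database peptides whose composition equals the given residues."""
    return index.get(composition_key(residues), [])

# Toy database (hypothetical sequences, for illustration only).
database = ["PEPTIDE", "TIDEPEP", "SAMPLER", "PEPTIDK"]
index = build_index(database)

# Residues supplied in arbitrary order, as from a disordered sequence tag.
matches = lookup(index, "EEPPTID")
print(matches)
```

    A single dictionary lookup replaces a search over all permutations of the seven residues; peptides that differ by even one residue ("PEPTIDK") fall under a different key.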

  15. Metagenomic sequencing of expressed prostate secretions.

    PubMed

    Smelov, Vitaly; Arroyo Mühr, L Sara; Bzhalava, Davit; Brown, Lyndon J; Komyakov, Boris; Dillner, Joakim

    2014-12-01

    To investigate which microorganisms may be present in expressed prostate secretions (EPS), metagenomic sequencing (MGS) was applied to prostate secretion samples from five men with prostatitis and five matched control men as well as to combined expressed prostate secretion and urine from six patients with prostate cancer and six matched control men. The prostate secretion samples contained a variety of bacterial sequences, mostly belonging to the Proteobacteria phylum. The combined prostate secretion and urine samples were dominated by abundant presence of the JC polyomavirus, representing >20% of all detected metagenomic sequence reads. There were also other viruses detected, for example, human papillomavirus type 81. All combined prostate secretion and urine samples were also positive for Proteobacteria. In summary, MGS of expressed prostate secretion is informative for detecting a variety of bacteria and viruses, suggesting that larger-scale use of MGS of prostate secretions may be useful in medical and epidemiological studies of prostate infections.

  16. Heterologous expression and N-terminal His-tagging processes affect the catalytic properties of staphylococcal lipases: a monolayer study.

    PubMed

    Horchani, Habib; Sabrina, Lignon; Régine, Lebrun; Sayari, Adel; Gargouri, Youssef; Verger, Robert

    2010-10-15

    The interfacial and kinetic properties of wild type, untagged recombinant and tagged recombinant forms of three staphylococcal lipases (SSL, SXL and SAL3) were compared using the monomolecular film technique. A kinetic study on the dependence of the stereoselectivity of these nine lipase forms on the surface pressure was performed using the three dicaprin isomers spread in the form of monomolecular films at the air-water interface. New parameters, termed Recombinant expression Effects on Catalysis (REC), N-Tag Effects on Catalysis (TEC), and N-Tag and Recombinant expression Effects on Catalysis (TREC), were introduced. The findings obtained showed that with all the lipases tested, the recombinant expression process and the N-terminal His-tag slightly affect the sn-1 preference for dicaprin enantiomers as well as the penetration capacity into monomolecular films of phosphatidylcholine but significantly decrease the catalytic rate of hydrolysis of three dicaprin isomers. This rate reduction is more pronounced at high surface pressures, i.e. at low interfacial energies. In conclusion, the effects of the heterologous expression process on the catalytic properties of the staphylococcal lipases are three times more deleterious than the presence of an N-terminal tag extension. In the case of the situation most commonly encountered in the literature, i.e. the heterologous expression of a tagged lipase, the rate of catalysis can be decreased by these processes by 42-83% on average in comparison with the values measured with the corresponding wild type form.

  17. Detection of genes expressed in Bordetella bronchiseptica colonizing rat trachea by in vivo expressed-tag immunoprecipitation method.

    PubMed

    Abe, Hiroyuki; Kamitani, Shigeki; Fukui-Miyazaki, Aya; Shinzawa, Naoaki; Nakamura, Keiji; Horiguchi, Yasuhiko

    2015-05-01

    Analyses of bacterial genes expressed in response to the host environment provide clues to understanding the host-pathogen interactions that lead to the establishment of infection. In this study, a novel method named In Vivo Expressed-Tag ImmunoPrecipitation (IVET-IP) was developed for detecting genes expressed in bacteria that are recovered in small numbers from host tissues. IVET-IP was designed to overcome some drawbacks of previous similar methods. We applied IVET-IP to Bordetella bronchiseptica colonizing rat trachea and identified 173 genes that were expressed in the bacteria over the entire course of an infection. These gene products included two transcriptional factors that are involved in the expression of filamentous hemagglutinin, adenylate cyclase toxin, and major virulence factors for the bordetellae. We consider that this method might provide novel insight into the course of Bordetella infection.

  18. Metagenomic 16S rDNA Illumina tags are a powerful alternative to amplicon sequencing to explore diversity and structure of microbial communities.

    PubMed

    Logares, Ramiro; Sunagawa, Shinichi; Salazar, Guillem; Cornejo-Castillo, Francisco M; Ferrera, Isabel; Sarmento, Hugo; Hingamp, Pascal; Ogata, Hiroyuki; de Vargas, Colomban; Lima-Mendez, Gipsi; Raes, Jeroen; Poulain, Julie; Jaillon, Olivier; Wincker, Patrick; Kandels-Lewis, Stefanie; Karsenti, Eric; Bork, Peer; Acinas, Silvia G

    2014-09-01

    Sequencing of 16S rDNA polymerase chain reaction (PCR) amplicons is the most common approach for investigating environmental prokaryotic diversity, despite the known biases introduced during PCR. Here we show that 16S rDNA fragments derived from Illumina-sequenced environmental metagenomes (mi tags) are a powerful alternative to 16S rDNA amplicons for investigating the taxonomic diversity and structure of prokaryotic communities. As part of the Tara Oceans global expedition, marine plankton was sampled in three locations, resulting in 29 subsamples for which metagenomes were produced by shotgun Illumina sequencing (ca. 700 Gb). For comparative analyses, a subset of samples was also selected for Roche-454 sequencing using both shotgun (m454 tags; 13 metagenomes, ca. 2.4 Gb) and 16S rDNA amplicon (454 tags; ca. 0.075 Gb) approaches. Our results indicate that by overcoming PCR biases related to amplification and primer mismatch, mi tags may provide more realistic estimates of community richness and evenness than amplicon 454 tags. In addition, mi tags can capture expected beta diversity patterns. Using mi tags is now economically feasible given the dramatic reduction in high-throughput sequencing costs, having the advantage of retrieving simultaneously both taxonomic (Bacteria, Archaea and Eukarya) and functional information from the same microbial community.
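
    For readers unfamiliar with the terms, community richness and evenness can be computed from per-taxon tag counts as sketched below. The abstract does not state which indices the authors used; the Shannon index and Pielou's evenness here are assumptions, chosen because they are common choices, and the count vectors are invented.

```python
import math

def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)

def shannon(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S); defined as 0.0 when fewer than two taxa."""
    s = richness(counts)
    if s <= 1:
        return 0.0
    return shannon(counts) / math.log(s)

# Invented tag counts for two communities with the same richness.
even_community = [10, 10, 10, 10]
skewed_community = [97, 1, 1, 1]

print(richness(even_community), pielou_evenness(even_community))
print(richness(skewed_community), pielou_evenness(skewed_community))
```

    Both toy communities have the same richness, but the skewed one has much lower evenness; a method that over-amplifies some taxa would distort exactly this kind of estimate.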

  19. Tryptophan tags and de novo designed complementary affinity ligands for the expression and purification of recombinant proteins.

    PubMed

    Pina, Ana Sofia; Carvalho, Sara; Dias, Ana Margarida G C; Guilherme, Márcia; Pereira, Alice S; Caraça, Luciana T; Coroadinha, Ana Sofia; Lowe, Christopher R; Roque, A Cecília A

    2016-11-11

    A common strategy for the production and purification of recombinant proteins is to fuse a tag to the protein terminal residues and employ a "tag-specific" ligand for fusion protein capture and purification. In this work, we explored the effect of two tryptophan-based tags, NWNWNW and WFWFWF, on the expression and purification of green fluorescent protein (GFP) used as a model fusion protein. The titers obtained with the expression of these fusion proteins in soluble form were 0.11 mg/ml and 0.48 mg/ml for WFWFWF and NWNWNW, respectively. A combinatorial library comprising 64 ligands based on the Ugi reaction was prepared and screened for binding GFP-tagged and non-tagged proteins. Complementary ligands A2C2 and A3C1 were selected for the effective capture of NWNWNW and WFWFWF tagged proteins, respectively, in soluble forms. These affinity pairs displayed 10^6 M^-1 affinity constants and Qmax values of 19.11 ± 2.60 µg/g and 79.39 µg/g for the systems WFWFWF and NWNWNW, respectively. GFP fused to the WFWFWF affinity tag was also produced as inclusion bodies, and an on-column refolding strategy was explored using the ligand A4C8, selected from the combinatorial library of ligands but in the presence of denaturant agents.

  20. SAGExplore: a web server for unambiguous tag mapping in serial analysis of gene expression oriented to gene discovery and annotation.

    PubMed

    Norambuena, Tomás; Malig, Rodrigo; Melo, Francisco

    2007-07-01

    We describe a web server for the accurate mapping of experimental tags in serial analysis of gene expression (SAGE). The core of the server relies on a database of genomic virtual tags built by a recently described method that attempts to reduce the amount of ambiguous assignments for those tags that are not unique in the genome. The method provides a complete annotation of potential virtual SAGE tags within a genome, along with an estimation of their confidence for experimental observation that ranks tags that present multiple matches in the genome. The output of the server consists of a table in HTML format that contains links to a graphic representation of the results and to some external servers and databases, facilitating the tasks of analysis of gene expression and gene discovery. Also, a table in tab delimited text format is produced, allowing the user to export the results into custom databases and software for further analysis. The current server version provides the most accurate and complete SAGE tag mapping source that is available for the yeast organism. In the near future, this server will also allow the accurate mapping of experimental SAGE-tags from other model organisms such as human, mouse, frog and fly. The server is freely available on the web at: http://dna.bio.puc.cl/SAGExplore.html.
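
    The notion of genomic virtual tags, and of tags that are not unique in the genome, can be sketched as follows. This is not the server's method: the anchoring site (CATG, the NlaIII site conventionally used in SAGE), the 10-base tag length, the forward-strand-only scan and the toy sequence are all assumptions made for illustration, and the confidence ranking described in the abstract is not modelled.

```python
from collections import Counter

def virtual_tags(sequence, anchor="CATG", tag_len=10):
    """Return (position, tag) for every anchor site followed by a full-length tag."""
    tags = []
    start = sequence.find(anchor)
    while start != -1:
        tag_start = start + len(anchor)
        tag = sequence[tag_start:tag_start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the sequence end
            tags.append((start, tag))
        start = sequence.find(anchor, start + 1)
    return tags

def ambiguous_tags(tags):
    """Tags that occur at more than one position, i.e. are not unique."""
    counts = Counter(tag for _, tag in tags)
    return {tag for tag, n in counts.items() if n > 1}

# Toy sequence with three anchor sites; two of them share the same tag.
genome = ("TT" + "CATG" + "A" * 10 + "GG" + "CATG" + "C" * 10
          + "TT" + "CATG" + "A" * 10 + "TT")
tags = virtual_tags(genome)
print(tags)
print(ambiguous_tags(tags))
```

    An experimental tag matching the poly-A tag above could come from either of two positions, which is the ambiguity a ranked virtual-tag database is meant to reduce.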

  1. Toxin Fused with SUMO Tag: A New Expression Vector Strategy to Obtain Recombinant Venom Toxins with Easy Tag Removal inside the Bacteria

    PubMed Central

    Shimokawa-Falcão, Lhiri H. A. L.; Caporrino, Maria C.; Barbaro, Katia C.; Della-Casa, Maisa S.; Magalhães, Geraldo S.

    2017-01-01

    Many animal toxins may target the same molecules that need to be controlled in certain pathologies; therefore, some toxins have led to the formulation of drugs that are presently used, and many other drugs are still under development. Nevertheless, collecting sufficient toxins from the original source might be a limiting factor in studying their biological activities. Thus, molecular biology techniques have been applied in order to obtain large amounts of recombinant toxins in Escherichia coli. However, most animal toxins are difficult to express in this system, which results in insoluble, misfolded, or unstable proteins. To solve these issues, toxins have been fused with tags that may improve protein expression, solubility, and stability. Among these tags, the SUMO (small ubiquitin-related modifier) has been shown to be very efficient and can be removed by the Ulp1 protease. However, removing SUMO is a labor- and time-consuming process. To enhance this system, here we show the construction of a bicistronic vector that allows the expression of any protein fused to both the SUMO and Ulp1 protease. In this way, after expression, Ulp1 is able to cleave SUMO and leave the protein of interest free and ready for purification. This strategy was validated through the expression of a new phospholipase D from the spider Loxosceles gaucho and a disintegrin from the Bothrops insularis snake. Both recombinant toxins showed good yield and preserved biological activities, indicating that the bicistronic vector may be a viable method to produce proteins that are difficult to express. PMID:28264436

  2. Down Syndrome: A search for expressed sequences

    SciTech Connect

    Pritchard, M.; Fuentes, J.J.; Bosch, A.

    1994-09-01

    Down Syndrome (DS) is a major cause of congenital heart disease and mental retardation. The most common anomaly is an extra copy of human chromosome 21 (HC21); however, chromosomal studies in rare patients with partial trisomy 21 have defined a minimal region for DS, including human chromosome 21 bands q22.2-q22.3. The study of genes in this chromosomal region will allow the elucidation of the biochemical and molecular bases for several of the distinct phenotypic traits of the syndrome. This information is the key to the design of therapeutic, pharmacological and genetic tools to counter the effects of three copies of chromosome 21 in the cells of DS patients. Towards this goal, we aim to build a transcriptional map of this region and then characterize any genes isolated. We are using two methods to isolate expressed sequences: (1) Alu-splice consensus PCR and (2) cDNA hybridization selection. We use as starting material YACs (CEPH/Genethon) from the specified region and cosmid minilibraries constructed from these YACs. Products are subcloned, sequenced and analyzed in the sequence databases. Several homologies with reported expressed sequences have been found and will be discussed. The HC21 origin of these putative expressed sequences is determined and they are then used to initially screen a human fetal brain full-length cDNA library. We have isolated several cDNAs and these are now being analyzed.

  3. Annotating nonspecific SAGE tags with microarray data.

    PubMed

    Ge, Xijin; Jung, Yong-Chul; Wu, Qingfa; Kibbe, Warren A; Wang, San Ming

    2006-01-01

    SAGE (serial analysis of gene expression) detects transcripts by extracting short tags from the transcripts. Because of the limited length, many SAGE tags are shared by transcripts from different genes. Relying on sequence information in the general gene expression database has limited power to solve this problem due to the highly heterogeneous nature of the deposited sequences. Considering that the complexity of gene expression at a single tissue level should be much simpler than that in the general expression database, we reasoned that by restricting gene expression to tissue level, the accuracy of gene annotation for the nonspecific SAGE tags should be significantly improved. To test the idea, we developed a tissue-specific SAGE annotation database based on microarray data. This database contains microarray expression information represented as UniGene clusters for 73 normal human tissues and 18 cancer tissues and cell lines. The nonspecific SAGE tag is first matched to the database by the same tissue type used by both SAGE and microarray analysis; then the multiple UniGene clusters assigned to the nonspecific SAGE tag are searched in the database under the matched tissue type. The UniGene cluster presented solely or at higher expression levels in the database is annotated to represent the specific gene for the nonspecific SAGE tags. The accuracy of gene annotation by this database was largely confirmed by experimental data. Our study shows that microarray data provide a useful source for annotating the nonspecific SAGE tags.
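
    The annotation rule described above (restrict to the matched tissue, then keep the cluster that is the only one expressed or the most highly expressed) can be sketched as below. The data layout, the function name and the cluster identifiers are hypothetical; the actual database schema and any expression thresholds are not given in the abstract.

```python
def annotate_tag(candidates, tissue, expression):
    """Pick one cluster for a nonspecific tag using tissue-matched expression.

    candidates: cluster IDs sharing the tag.
    expression: {tissue: {cluster_id: expression_level}} (hypothetical layout).
    Returns the candidate with the highest level in that tissue, or None if
    no candidate is expressed there.
    """
    levels = expression.get(tissue, {})
    expressed = {c: levels[c] for c in candidates if levels.get(c, 0) > 0}
    if not expressed:
        return None
    return max(expressed, key=expressed.get)

# Invented expression values for two tissues and three placeholder clusters.
expression = {
    "liver": {"cluster_A": 850.0, "cluster_B": 12.0},
    "brain": {"cluster_C": 40.0},
}
tag_candidates = ["cluster_A", "cluster_B", "cluster_C"]

liver_call = annotate_tag(tag_candidates, "liver", expression)
brain_call = annotate_tag(tag_candidates, "brain", expression)
print(liver_call, brain_call)
```

    The same tag resolves to different clusters in different tissues, which is the point of restricting the search to the tissue in which the SAGE library was made.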

  4. N-terminal SKIK peptide tag markedly improves expression of difficult-to-express proteins in Escherichia coli and Saccharomyces cerevisiae.

    PubMed

    Ojima-Kato, Teruyo; Nagai, Satomi; Nakano, Hideo

    2017-01-02

    Despite advances in microbial protein expression systems, low production of proteins remains a great concern for some genes. Here we report that the insertion of a short peptide tag, consisting of Ser-Lys-Ile-Lys (SKIK), adjacent to the start codon of genes encoding difficult-to-express proteins can increase protein expression in Escherichia coli and Saccharomyces cerevisiae. Protein expression levels of a mouse monoclonal antibody (mAb), rabbit mAbs obtained from clonal B cells, and an artificially designed peptide were significantly increased simply by the addition of the SKIK tag in E. coli systems. In particular, a ∼30-fold increase in protein production was observed for the mouse mAb, and the artificially designed peptide band became detectable in sodium dodecyl sulfate-polyacrylamide gel electrophoresis after Coomassie brilliant blue staining or western blotting on adding the SKIK tag. The tag also increased the expression of tagged proteins in S. cerevisiae and an E. coli cell-free protein synthesis system. Although the mechanism of high protein expression on addition of the tag is unclear, our findings offer great benefits to biotechnology research and industry.

  5. Massive parallel insertion site sequencing of an arrayed Sinorhizobium meliloti signature-tagged mini-Tn5 transposon mutant library.

    PubMed

    Serrania, Javier; Johner, Tobias; Rupp, Oliver; Goesmann, Alexander; Becker, Anke

    2017-02-21

    Transposon mutagenesis in conjunction with identification of genomic transposon insertion sites is a powerful tool for gene function studies. We have implemented a protocol for parallel determination of transposon insertion sites by Illumina sequencing involving a hierarchical barcoding method that allowed for tracking back insertion sites to individual clones of an arrayed signature-tagged transposon mutant library. This protocol was applied to further characterize a signature-tagged mini-Tn5 mutant library comprising about 12,000 mutants of the symbiotic nitrogen-fixing alphaproteobacterium Sinorhizobium meliloti (Pobigaylo et al., 2006; Appl. Environ. Microbiol. 72, 4329-4337). Previously, insertion sites have been determined for 5000 mutants of this library. Combining an adapter-free, inverse PCR method for sequencing library preparation with next generation sequencing, we identified 4473 novel insertion sites, increasing the total number of transposon mutants with known insertion site to 9562. The number of protein-coding genes that were hit at least once by a transposon increased by 1231 to a total number of 3673 disrupted genes, which represents 59% of the predicted protein-coding genes in S. meliloti.

  6. Construction of a yeast artificial chromosome contig spanning the pseudoautosomal region and isolation of 25 new sequence-tagged sites

    SciTech Connect

    Slim, R. (Laboratoire de Cytogenetique et Genetique Oncologiques, Villejuif); Le Paslier, D.; Ougen, P.; Billault, A.; Cohen, D.; Compain, S.; Levilliers, J.; Mintz, L.; Weissenbach, J.; Petit, C.

    1993-06-01

    Thirty-one yeast artificial chromosomes (YACs) from the human pseudoautosomal region were identified by a combination of sequence-tagged site (STS) screenings and colony hybridizations, using a subtelomeric interspersed repetitive element mapping predominantly to the pseudoautosomal region. Twenty-five new pseudoautosomal STSs were generated, of which 4 detected restriction fragment length polymorphisms. A total of 33 STSs were used to assemble the 31 YACs into a single contiguous set of overlapping DNA fragments spanning at least 2.3 megabases of the pseudoautosomal region. In addition, four pseudoautosomal genes including hydroxyindole O-methyltransferase have been positioned on this set of fragments. 48 refs., 1 fig., 3 tabs.

  7. Construction of a plasmid coding for green fluorescent protein tagged cathepsin L and data on expression in colorectal carcinoma cells.

    PubMed

    Tamhane, Tripti; Wolters, Brit K; Illukkumbura, Rukshala; Maelandsmo, Gunhild M; Haugen, Mads H; Brix, Klaudia

    2015-12-01

    The endo-lysosomal cysteine cathepsin L has recently been shown to have moonlighting activities in that its unexpected nuclear localization in colorectal carcinoma cells is involved in cell cycle progression (Tamhane et al., 2015) [1]. Here, we show data on the construction and sequence of a plasmid coding for human cathepsin L tagged with an enhanced green fluorescent protein (phCL-EGFP) in which the fluorescent protein is covalently attached to the C-terminus of the protease. The plasmid was used for transfection of HCT116 colorectal carcinoma cells, while data from non-transfected and pEGFP-N1-transfected cells is also shown. Immunoblotting data of lysates from non-transfected controls and HCT116 cells transfected with pEGFP-N1 and phCL-EGFP, showed stable expression of cathepsin L-enhanced green fluorescent protein chimeras, while endogenous cathepsin L protein amounts exceed those of hCL-EGFP chimeras. An effect of phCL-EGFP expression on proliferation and metabolic states of HCT116 cells at 24 h post-transfection was observed.

  8. Construction of a plasmid coding for green fluorescent protein tagged cathepsin L and data on expression in colorectal carcinoma cells

    PubMed Central

    Tamhane, Tripti; Wolters, Brit K.; Illukkumbura, Rukshala; Maelandsmo, Gunhild M.; Haugen, Mads H.; Brix, Klaudia

    2015-01-01

    The endo-lysosomal cysteine cathepsin L has recently been shown to have moonlighting activities in that its unexpected nuclear localization in colorectal carcinoma cells is involved in cell cycle progression (Tamhane et al., 2015) [1]. Here, we show data on the construction and sequence of a plasmid coding for human cathepsin L tagged with an enhanced green fluorescent protein (phCL-EGFP) in which the fluorescent protein is covalently attached to the C-terminus of the protease. The plasmid was used for transfection of HCT116 colorectal carcinoma cells, while data from non-transfected and pEGFP-N1-transfected cells is also shown. Immunoblotting data of lysates from non-transfected controls and HCT116 cells transfected with pEGFP-N1 and phCL-EGFP, showed stable expression of cathepsin L-enhanced green fluorescent protein chimeras, while endogenous cathepsin L protein amounts exceed those of hCL-EGFP chimeras. An effect of phCL-EGFP expression on proliferation and metabolic states of HCT116 cells at 24 h post-transfection was observed. PMID:26594658

  9. P-selectin-mediated LOX expression promotes insulinoma growth in Rip1-Tag2 mice by increasing tissue stiffness

    PubMed Central

    Qi, Cuiling; Li, Jialin; Guo, Simei; Li, Mengshi; Li, Yuanyuan; Li, Jiangchao; Zhang, Qianqian; Zheng, Lingyun; He, Xiaodong; Zheng, Xiaoming; He, Yanli; Wang, Lijing; Wei, Bo

    2016-01-01

    P-selectin, a cell adhesion molecule, is an important member of the selectin family. Recent studies have shown that P-selectin deletion inhibits tumor growth in Rip1-Tag2 mice by suppressing platelet accumulation in tumor tissues. This study aimed to evaluate whether and how P-selectin affects tumor stiffness in Rip1-Tag2 mice. To explore the role of P-selectin in tissue stiffness, we demonstrated that tumor progression in Rip1-Tag2 mice was correlated with tissue stiffness using immunofluorescence and histological staining. Furthermore, we showed that P-selectin deficiency significantly decreased tissue stiffness by inhibiting lysyl oxidase (LOX) expression. Our experiments involving Rip1-Tag2 mice treated with the LOX inhibitor BAPN showed that BAPN significantly abolished collagen deposition to decrease tumor stiffness and thus inhibit tumor growth. These results indicate that P-selectin deletion significantly decreases tumor stiffness in Rip1-Tag2 mice by inhibiting LOX expression. Further study demonstrated that P-selectin-mediated platelet accumulation increases tissue stiffness mainly by increasing LOX expression and thus promotes tumor growth. Therefore, P-selectin may be an effective therapeutic target for treating human insulinomas. PMID:27877081

  10. 1D4: a versatile epitope tag for the purification and characterization of expressed membrane and soluble proteins.

    PubMed

    Molday, Laurie L; Molday, Robert S

    2014-01-01

    Incorporation of short epitope tags into proteins for recognition by commercially available monoclonal or polyclonal antibodies has greatly facilitated the detection, characterization, localization, and purification of heterologously expressed proteins for structure-function studies. A number of tags have been developed, but many epitope-antibody combinations do not work effectively for all immunochemical techniques due to the nature of the tag and the specificity of the antibodies. A highly versatile, multipurpose epitope tag is the 9 amino acid C-terminal 1D4 peptide. This peptide tag together with the Rho1D4 monoclonal antibody can be used to detect proteins in complex mixtures by western blotting and ELISA assays, localize proteins in cells by immunofluorescence and immunoelectron microscopic labeling techniques, identify subunits and interacting proteins by co-immunoprecipitation, and purify functionally active proteins including membrane proteins by immunoaffinity chromatography. In this chapter we describe various immunochemical procedures which can be used for the detection, purification and localization of 1D4-tagged proteins for structure-function studies.

  11. N-ICE plasmids for generating N-terminal 3 × FLAG tagged genes that allow inducible, constitutive or endogenous expression in Saccharomyces cerevisiae.

    PubMed

    Zhang, Yueping; Serratore, Nina D; Briggs, Scott D

    2016-12-12

    PCR-mediated homologous recombination is a powerful approach to introduce epitope tags into the chromosomal loci at the N-terminus or the C-terminus of targeted genes. Although strategies of C-terminal epitope tagging of target genes at their loci are simple and widely used in yeast, C-terminal epitope tagging is not practical for all proteins. For example, a C-terminal tag may affect protein function or a protein may get cleaved or processed, resulting in the loss of the epitope tag. Therefore, N-terminal epitope tagging may be necessary to resolve these problems. In some cases, an epitope tagging strategy is used to introduce a heterologous promoter with the epitope tag at the N-terminus of a gene of interest. The potential issue with this strategy is that the tagged gene is not expressed at the endogenous level. Another strategy after integration is to excise the selection marker, using the Cre-LoxP system, leaving the epitope tagged gene expressed from the endogenous promoter. However, N-terminal epitope tagging of essential genes using this strategy requires a diploid strain followed by tetrad dissection. Here we present 14 new plasmids for N-terminal tagging, which combines two previous strategies for epitope tagging in a haploid strain. These 'N-ICE' plasmids were constructed so that non-essential and essential genes can be N-terminally 3 × FLAG tagged and expressed from an inducible promoter (GAL1), constitutive promoters (CYC1 or PYK1) or the endogenous promoter. We have validated the N-ICE plasmid system by N-terminal tagging two non-essential genes (SET1 and SET2) and two essential genes (ERG11 and PKC1). Copyright © 2016 John Wiley & Sons, Ltd.

  12. Recombinant expression of antimicrobial peptides using a novel self-cleaving aggregation tag in Escherichia coli.

    PubMed

    Luan, Chao; Xie, Yong Gang; Pu, Yu Tian; Zhang, Hai Wen; Han, Fei Fei; Feng, Jie; Wang, Yi Zhen

    2014-03-01

    Antimicrobial peptides (AMPs) are part of the innate immune system of complex multicellular organisms. Despite the fact that AMPs show great potential as a novel class of antibiotics, the lack of a cost-effective means for their mass production limits both basic research and clinical use. In this work, we describe a novel expression system for the production of antimicrobial peptides in Escherichia coli by combining ΔI-CM mini-intein with the self-assembling amphipathic peptide 18A to drive the formation of active aggregates. Two AMPs, human β-defensin 2 and LL-37, were fused to the self-cleaving tag and expressed as active protein aggregates. The active aggregates were recovered by centrifugation and the intact antimicrobial peptides were released into solution by an intein-mediated cleavage reaction in cleaving buffer (phosphate-buffered saline supplemented with 40 mmol/L Bis-Tris, 2 mmol/L EDTA, pH 6.2). The peptides were further purified by cation-exchange chromatography. Peptide yields of 0.82 ± 0.24 and 0.59 ± 0.11 mg/L were achieved for human β-defensin 2 and LL-37, respectively, with demonstrated antimicrobial activity. Using our expression system, intact antimicrobial peptides were recovered by simple centrifugation from active protein aggregates after the intein-mediated cleavage reaction. Thus, we provide an economical and efficient way to produce intact antimicrobial peptides in E. coli.

  13. The generation of knock-in mice expressing fluorescently tagged galanin receptors 1 and 2

    PubMed Central

    Kerr, Niall; Holmes, Fiona E.; Hobson, Sally-Ann; Vanderplank, Penny; Leard, Alan; Balthasar, Nina; Wynick, David

    2015-01-01

    The neuropeptide galanin has diverse roles in the central and peripheral nervous systems, by activating the G protein-coupled receptors Gal1, Gal2 and the less studied Gal3 (GalR1–3 gene products). There is a wealth of data on expression of Gal1–3 at the mRNA level, but not at the protein level due to the lack of specificity of currently available antibodies. Here we report the generation of knock-in mice expressing Gal1 or Gal2 receptor fluorescently tagged at the C-terminus with, respectively, mCherry or hrGFP (humanized Renilla green fluorescent protein). In dorsal root ganglia (DRG) neurons expressing the highest levels of Gal1-mCherry, localization to the somatic cell membrane was detected by live-cell fluorescence and immunohistochemistry, and that fluorescence decreased upon addition of galanin. In spinal cord, abundant Gal1-mCherry immunoreactive processes were detected in the superficial layers of the dorsal horn, and highly expressing intrinsic neurons of the lamina III/IV border showed both somatic cell membrane localization and outward transport of receptor from the cell body, detected as puncta within cell processes. In brain, high levels of Gal1-mCherry immunofluorescence were detected within thalamus, hypothalamus and amygdala, with a high density of nerve endings in the external zone of the median eminence, and regions with lesser immunoreactivity included the dorsal raphe nucleus. Gal2-hrGFP mRNA was detected in DRG, but live-cell fluorescence was at the limits of detection, drawing attention to both the much lower mRNA expression than that of Gal1 in mice and the previously unrecognized potential for translational control by upstream open reading frames (uORFs). PMID:26292267

  14. Parallel tagged amplicon sequencing reveals major lineages and phylogenetic structure in the North American tiger salamander (Ambystoma tigrinum) species complex.

    PubMed

    O'Neill, Eric M; Schwartz, Rachel; Bullock, C Thomas; Williams, Joshua S; Shaffer, H Bradley; Aguilar-Miguel, X; Parra-Olea, Gabriela; Weisrock, David W

    2013-01-01

    Modern analytical methods for population genetics and phylogenetics are expected to provide more accurate results when data from multiple genome-wide loci are analysed. We present the results of an initial application of parallel tagged sequencing (PTS) on a next-generation platform to sequence thousands of barcoded PCR amplicons generated from 95 nuclear loci and 93 individuals sampled across the range of the tiger salamander (Ambystoma tigrinum) species complex. To manage the bioinformatic processing of this large data set (344 330 reads), we developed a pipeline that sorts PTS data by barcode and locus, identifies high-quality variable nucleotides and yields phased haplotype sequences for each individual at each locus. Our sequencing and bioinformatic strategy resulted in a genome-wide data set with relatively low levels of missing data and a wide range of nucleotide variation. STRUCTURE analyses of these data in a genotypic format resulted in strongly supported assignments for the majority of individuals into nine geographically defined genetic clusters. Species tree analyses of the most variable loci using a multi-species coalescent model resulted in strong support for most branches in the species tree; however, analyses including more than 50 loci produced parameter sampling trends that indicated a lack of convergence on the posterior distribution. Overall, these results demonstrate the potential for amplicon-based PTS to rapidly generate large-scale data for population genetic and phylogenetic-based research.
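
    The sorting step of such a pipeline can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the barcode and primer sequences, the `demultiplex` function name, and the exact-prefix matching rule are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical lookup tables; real barcodes and locus primers would come
# from the study design, which the abstract does not list.
BARCODES = {"ACGT": "individual_01", "TGCA": "individual_02"}
LOCUS_PRIMERS = {"GGATCC": "locus_A", "CCTAGG": "locus_B"}


def demultiplex(reads):
    """Group reads by (individual, locus) using exact prefix matches.

    Each read is assumed to start with a barcode followed directly by a
    locus-specific primer. Reads that match neither table are skipped.
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, individual in BARCODES.items():
            if not read.startswith(barcode):
                continue
            rest = read[len(barcode):]
            for primer, locus in LOCUS_PRIMERS.items():
                if rest.startswith(primer):
                    # Keep only the insert, with barcode and primer trimmed off.
                    bins[(individual, locus)].append(rest[len(primer):])
                    break
            break  # barcodes are assumed prefix-free, so stop at the first match
    return dict(bins)


reads = ["ACGTGGATCCAAAT", "TGCACCTAGGTTTA", "ACGTGGATCCAAAC", "NNNNGGATCCAAAT"]
sorted_reads = demultiplex(reads)
```

    A production pipeline would usually also tolerate sequencing errors in the barcode and filter on base quality; both are left out here to keep the sorting logic visible.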

  15. Genome-wide discovery of cis-elements in promoter sequences using gene expression.

    PubMed

    Troukhan, Maxim; Tatarinova, Tatiana; Bouck, John; Flavell, Richard B; Alexandrov, Nickolai N

    2009-04-01

    The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allow for a more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. Strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences.

  16. Expression vectors for C-terminal fusions with fluorescent proteins and epitope tags in Candida glabrata.

    PubMed

    Yáñez-Carrillo, Patricia; Orta-Zavalza, Emmanuel; Gutiérrez-Escobedo, Guadalupe; Patrón-Soberano, Araceli; De Las Peñas, Alejandro; Castaño, Irene

    2015-07-01

    Candida glabrata is a haploid yeast considered the second most common of the Candida species found in nosocomial infections, accounting for approximately 18% of candidemias worldwide. Even though molecular biology methods are easily adapted to study this organism, there are not enough vectors that will allow probing the transcriptional and translational activity of any gene of interest in C. glabrata. In this work we have generated a set of expression vectors to systematically tag any gene of interest at the carboxy-terminus with three different fluorophores (CFP, YFP and mCherry) or three epitopes (HA, FLAG or cMyc) independently. This system offers the possibility to generate translational fusions in three versions: under the gene's own promoter integrated in its native locus in genome, on a replicative plasmid under its own promoter, or on a replicative plasmid under a strong promoter to overexpress the fusions. The expression of these translational fusions will allow determining the transcriptional and translational activity of the gene of interest as well as the intracellular localization of the protein. We have tested these expression vectors with two biosynthetic genes, HIS3 and TRP1. We detected fluorescence under the microscope and we were able to immunodetect the fusions using the three different versions of the system. These vectors permit coexpression of several different fusions simultaneously in the same cell, which will allow determining protein-protein and protein-DNA interactions. This set of vectors adds a new toolbox to study expression and protein interactions in the fungal pathogen C. glabrata.

  17. Expression of the affinity tags, glutathione-S-transferase and maltose-binding protein, in tobacco chloroplasts.

    PubMed

    Ahmad, Niaz; Michoux, Franck; McCarthy, James; Nixon, Peter J

    2012-04-01

    Chloroplast transformation offers an exciting platform for the safe, inexpensive and large-scale production of recombinant proteins in plants. An important advantage for the isolation of proteins produced in the chloroplast would be the use of affinity tags for rapid purification by affinity chromatography. To date, only His-tags have been used. In this study, we have tested the feasibility of expressing two additional affinity tags: glutathione-S-transferase (GST) and a His-tagged derivative of the maltose-binding protein (His₆-MBP). By using the chloroplast 16S rRNA promoter and 5' untranslated region of phage T7 gene 10, GST and His₆-MBP were expressed in homoplastomic tobacco plants at approximately 7% and 37% of total soluble protein, respectively. GST could be purified by one-step-affinity purification using a glutathione column. Much better recoveries were obtained for His₆-MBP by using a twin-affinity purification procedure involving first immobilised nickel followed by binding to amylose. Interestingly, expression of GST led to cytoplasmic male sterility. Overall, our work expands the tools available for purifying recombinant proteins from the chloroplast.

  18. Improved measurement of brain deformation during mild head acceleration using a novel tagged MRI sequence.

    PubMed

    Knutsen, Andrew K; Magrath, Elizabeth; McEntee, Julie E; Xing, Fangxu; Prince, Jerry L; Bayly, Philip V; Butman, John A; Pham, Dzung L

    2014-11-07

    In vivo measurements of human brain deformation during mild acceleration are needed to help validate computational models of traumatic brain injury and to understand the factors that govern the mechanical response of the brain. Tagged magnetic resonance imaging is a powerful, noninvasive technique to track tissue motion in vivo which has been used to quantify brain deformation in live human subjects. However, these prior studies required from 72 to 144 head rotations to generate deformation data for a single image slice, precluding its use to investigate the entire brain in a single subject. Here, a novel method is introduced that significantly reduces temporal variability in the acquisition and improves the accuracy of displacement estimates. Optimization of the acquisition parameters in a gelatin phantom and three human subjects leads to a reduction in the number of rotations from 72 to 144 to as few as 8 for a single image slice. The ability to estimate accurate, well-resolved, fields of displacement and strain in far fewer repetitions will enable comprehensive studies of acceleration-induced deformation throughout the human brain in vivo.

  19. Toward a physical map of Drosophila buzzatii. Use of randomly amplified polymorphic DNA polymorphisms and sequence-tagged site landmarks.

    PubMed Central

    Laayouni, H; Santos, M; Fontdevila, A

    2000-01-01

    We present a physical map based on RAPD polymorphic fragments and sequence-tagged sites (STSs) for the repleta group species Drosophila buzzatii. One hundred forty-four RAPD markers have been used as probes for in situ hybridization to the polytene chromosomes, and positive results allowing the precise localization of 108 RAPDs were obtained. Of these, 73 behave as effectively unique markers for physical map construction, and in 9 additional cases the probes gave two hybridization signals, each on a different chromosome. Most markers (68%) are located on chromosomes 2 and 4, which partially agrees with previous estimates on the distribution of genetic variation over chromosomes. One RAPD maps close to the proximal breakpoint of inversion 2z(3) but is not included within the inverted fragment. However, it was possible to conclude from this RAPD that the distal breakpoint of 2z(3) had previously been wrongly assigned. A total of 39 cytologically mapped RAPDs were converted to STSs and yielded an aggregate sequence of 28,431 bp. Thirty-six RAPDs (25%) did not produce any detectable hybridization signal, and we obtained the DNA sequence from three of them. Further prospects toward obtaining a more developed genetic map than the one currently available for D. buzzatii are discussed. PMID:11102375

  20. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrate effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting relative expression ratio of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also represents a unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
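
    The arithmetic behind the subtraction proposal can be sketched in a few lines of Python. This is an idealized illustration, assuming the subtraction is complete and the total number of reads is fixed; the function name and the example counts are hypothetical.

```python
def depth_gain(removed_fraction):
    """Fold increase in reads available to the remaining transcripts.

    If a fraction f of all transcripts is removed before sequencing and the
    read budget is unchanged, every remaining transcript receives 1 / (1 - f)
    times as many reads as before.
    """
    if not 0.0 <= removed_fraction < 1.0:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


# About 20% of the transcriptome comes from the ~300 most abundant genes.
gain = depth_gain(0.20)

# Relative ratios among the remaining genes are unchanged by the rescaling.
before = {"gene_x": 40.0, "gene_y": 10.0}  # hypothetical read counts
after = {name: count * gain for name, count in before.items()}
ratio_before = before["gene_x"] / before["gene_y"]
ratio_after = after["gene_x"] / after["gene_y"]
```

    Under these assumptions the gain is 1.25-fold, which shows why removing only a few hundred genes gives a modest but uniform increase in depth for everything else.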

  1. Transcriptome sequencing and profiling of expressed genes in cambial zone and differentiating xylem of Japanese cedar (Cryptomeria japonica)

    PubMed Central

    2014-01-01

    Background Forest trees have ecological and economic importance, and Japanese cedar has highly valued wood attributes. Thus, studies of molecular aspects of wood formation offer practical information that may be used for screening and forward genetics approaches to improving wood quality. Results After identifying expressed sequence tags in Japanese cedar tissue undergoing xylogenesis, we designed a custom cDNA microarray to compare expression of highly regulated genes throughout a growing season. This led to identification of candidate genes involved both in wood formation and later cessation of growth and dormancy. Based on homology to orthologous protein groups, the genes were assigned to functional classes. A high proportion of sequences fell into functional classes related to posttranscriptional modification and signal transduction, while transcription factors and genes involved in the metabolism of sugars, cell-wall synthesis and lignification, and cold hardiness were among other classes of genes identified as having a potential role in xylem formation and seasonal wood formation. Conclusions We obtained 55,051 unique sequences by next-generation sequencing of a cDNA library prepared from cambial meristem and derivative cells. Previous studies on conifers have identified unique sequences expressed in developing xylem, but this is the first comprehensive study utilizing a collection of expressed sequence tags for expression studies related to xylem formation in Japanese cedar, which belongs to a different lineage than the Pinaceae. Our characterization of these sequences should allow comparative studies of genome evolution and functional genetics of wood species. PMID:24649833

  2. Random Tagging Genotyping by Sequencing (rtGBS), an Unbiased Approach to Locate Restriction Enzyme Sites across the Target Genome

    PubMed Central

    Hilario, Elena; Barron, Lorna; Deng, Cecilia H.; Datson, Paul M.; Davy, Marcus W.; Storey, Roy D.

    2015-01-01

    Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce the genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples able to be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart, and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS. PMID:26633193

  3. Whole genome wide expression profiles of Vitis amurensis grape responding to downy mildew by using Solexa sequencing technology

    PubMed Central

    2010-01-01

    Background Downy mildew (DM), caused by the pathogen Plasmopara viticola (PV), is the single most damaging disease of grapes (Vitis L.) worldwide. However, the mechanisms of the disease development in grapes are poorly understood. A method for estimating gene expression levels using Solexa sequencing of Type I restriction-endonuclease-generated cDNA fragments was used for deep sequencing the transcriptomes resulting from PV-infected leaves of Vitis amurensis Rupr. cv. Zuoshan-1. Our goal is to identify genes that are involved in resistance to grape DM disease. Results Approximately 8.5 million (M) 21-nt cDNA tags were sequenced in the cDNA library derived from PV pathogen-infected leaves, and about 7.5 M were sequenced from the cDNA library constructed from the control leaves. When annotated, a total of 15,249 putative genes were identified from the Solexa sequencing tags for the infection (INF) library and 14,549 for the control (CON) library. Comparative analysis between these two cDNA libraries showed about 0.9% of the unique tags increased by at least five-fold, and about 0.6% of the unique tags decreased more than five-fold in infected leaves, while 98.5% of the unique tags showed less than five-fold difference between the two samples. The expression levels of 12 differentially expressed genes were confirmed by Real-time RT-PCR and the trends observed agreed well with the Solexa expression profiles, although the degree of change was lower in amplitude. After pathway enrichment analysis, a set of significantly enriched pathways were identified for the differentially expressed genes (DEGs), which were associated with ribosome structure, photosynthesis, amino acid and sugar metabolism. Conclusions This study presented a series of candidate genes and pathways that may contribute to DM resistance in grapes, and illustrated that the Solexa-based tag-sequencing approach was a powerful tool for gene expression comparison between control and treated samples. PMID:21029438
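
    The five-fold comparison between the two tag libraries can be illustrated with a short Python sketch. It is a simplified stand-in, not the study's method: the scaling to tags per million, the pseudocount, the function names, and the example counts are assumptions, and no statistical test is applied.

```python
def tags_per_million(count, library_size):
    """Scale a raw tag count to tags per million for its library."""
    return count * 1_000_000 / library_size


def classify_tag(inf_count, con_count, inf_size, con_size,
                 threshold=5.0, pseudocount=1.0):
    """Label a tag 'up', 'down' or 'unchanged' at a given fold threshold.

    The pseudocount (an assumption, added to both raw counts) avoids
    division by zero for tags seen in only one library.
    """
    inf_tpm = tags_per_million(inf_count + pseudocount, inf_size)
    con_tpm = tags_per_million(con_count + pseudocount, con_size)
    fold = inf_tpm / con_tpm
    if fold >= threshold:
        return "up"
    if fold <= 1.0 / threshold:
        return "down"
    return "unchanged"


# Library sizes follow the approximate totals quoted in the abstract.
INF_SIZE, CON_SIZE = 8_500_000, 7_500_000
labels = [
    classify_tag(850, 75, INF_SIZE, CON_SIZE),   # roughly 10-fold higher after scaling
    classify_tag(85, 750, INF_SIZE, CON_SIZE),   # roughly 10-fold lower after scaling
    classify_tag(170, 150, INF_SIZE, CON_SIZE),  # about equal after scaling
]
```

    Scaling by library size matters here because the infected library is about 13% larger than the control, so raw counts alone would overstate increases.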

  4. Gene expression and protein length influence codon usage and rates of sequence evolution in Populus tremula.

    PubMed

    Ingvarsson, Pär K

    2007-03-01

    Codon bias is generally thought to be determined by a balance between mutation, genetic drift, and natural selection on translational efficiency. However, natural selection on codon usage is considered to be a weak evolutionary force and selection on codon usage is expected to be strongest in species with large effective population sizes. In this paper, I study associations between codon usage, gene expression, and molecular evolution at synonymous and nonsynonymous sites in the long-lived, woody perennial plant Populus tremula (Salicaceae). Using expression data for 558 genes derived from expressed sequence tag (EST) libraries from 19 different tissues and developmental stages, I study how gene expression levels within single tissues as well as across tissues affect codon usage and rates of sequence evolution at synonymous and nonsynonymous sites. I show that gene expression has direct effects on both codon usage and the level of selective constraint of proteins in P. tremula, although in different ways. Codon usage of genes is primarily determined by how highly expressed a gene is, whereas rates of sequence evolution are primarily determined by how widely expressed genes are. In addition to the effects of gene expression, protein length appears to be an important factor influencing virtually all aspects of molecular evolution in P. tremula.

  5. Visualizing the replication cycle of bunyamwera orthobunyavirus expressing fluorescent protein-tagged Gc glycoprotein.

    PubMed

    Shi, Xiaohong; van Mierlo, Joël T; French, Andrew; Elliott, Richard M

    2010-09-01

    The virion glycoproteins Gn and Gc of Bunyamwera virus (BUNV), the prototype of the Bunyaviridae family and also of the Orthobunyavirus genus, are encoded by the medium (M) RNA genome segment and are involved in both viral attachment and entry. After their synthesis Gn and Gc form a heterodimer in the endoplasmic reticulum (ER) and transit to the Golgi compartment for virus assembly. The N-terminal half of the Gc ectodomain was previously shown to be dispensable for virus replication in cell culture (X. Shi, J. Goli, G. Clark, K. Brauburger, and R. M. Elliott, J. Gen. Virol. 90:2483-2492, 2009.). In this study, the coding sequence for a fluorescent protein, either enhanced green fluorescent protein (eGFP) or mCherry fluorescent protein, was fused to the N terminus of truncated Gc, and two recombinant BUNVs (rBUNGc-eGFP and rBUNGc-mCherry) were rescued by reverse genetics. The recombinant viruses showed bright autofluorescence under UV light and were competent for replication in various mammalian cell lines. rBUNGc-mCherry was completely stable over 10 passages, whereas internal, in-frame deletions occurred in the chimeric Gc-eGFP protein of rBUNGc-eGFP, resulting in loss of fluorescence between passages 5 and 7. Autofluorescence of the recombinant viruses allowed visualization of different stages of the infection cycle, including virus attachment to the cell surface, budding of virus particles in Golgi membranes, and virus-induced morphological changes to the Golgi compartment at later stages of infection. The fluorescent protein-tagged viruses will be valuable reagents for live-cell imaging studies to investigate virus entry, budding, and morphogenesis in real time.

  6. Sequence-tagged-site (STS) markers of arbitrary genes: development, characterization and analysis of linkage in black spruce.

    PubMed Central

    Perry, D J; Bousquet, J

    1998-01-01

    Sequence-tagged-site (STS) markers of arbitrary genes were investigated in black spruce [Picea mariana (Mill.) B.S.P.]. Thirty-nine pairs of PCR primers were used to screen diverse panels of haploid and diploid DNAs for variation that could be detected by standard agarose gel electrophoresis without further manipulation of amplification products. Codominant length polymorphisms were revealed at 15 loci. Three of these loci also had null amplification alleles as did 3 other loci that had no apparent product-length variation. Dominant length polymorphisms were observed at 2 other loci. Alleles of codominant markers differed in size by as little as 1 bp to as much as an estimated 175 bp with nearly all insertions/deletions found in noncoding regions. Polymorphisms at 3 loci involved large (33 bp to at least 114 bp) direct repeats and similar repeats were found in 7 of 51 cDNAs sequenced. Allelic segregation was in accordance with Mendelian inheritance and linkage was detected for 5 of 63 pairwise combinations of loci tested. Codominant STS markers of 12 loci revealed an average heterozygosity of 0.26 and an average of 2.8 alleles in a range-wide sample of 22 trees. PMID:9611216
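
    Summary statistics of this kind can be computed as in the following Python sketch. The abstract does not say whether the reported heterozygosity is observed or expected; the sketch assumes expected heterozygosity, 1 - sum(p_i^2), and the allele counts and names are hypothetical.

```python
def expected_heterozygosity(allele_counts):
    """Return 1 - sum(p_i^2) for one locus, given a list of allele counts."""
    total = sum(allele_counts)
    if total == 0:
        raise ValueError("no alleles observed at this locus")
    return 1.0 - sum((n / total) ** 2 for n in allele_counts)


# Hypothetical counts for 22 diploid trees (44 gene copies) at each locus.
loci = {
    "locus_1": [38, 6],
    "locus_2": [30, 10, 4],
    "locus_3": [44],  # monomorphic locus, heterozygosity 0
}

per_locus = {name: expected_heterozygosity(counts) for name, counts in loci.items()}
mean_h = sum(per_locus.values()) / len(per_locus)
mean_alleles = sum(len(counts) for counts in loci.values()) / len(loci)
```

    Averaging across loci, as in the last two lines, gives the per-study figures of the kind quoted above (mean heterozygosity and mean number of alleles).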

  7. Development of high-density linkage map and tagging leaf spot resistance in pearl millet using genotyping-by-sequencing markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pearl millet is an important forage and grain crop in many parts of the world. Genome mapping studies are a prerequisite for tagging agronomically important traits. Genotyping-by-Sequencing (GBS) markers can be used to build high density linkage maps even in species lacking a reference genome. A re...

  8. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length on gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307

  9. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform.

    PubMed

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-09-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length on gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus.

  10. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control groups, respectively, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  11. One-step expression and purification of single-chain variable antibody fragment using an improved hexahistidine tag phagemid vector.

    PubMed

    Zhao, Qi; Chan, Yin-Wah; Lee, Susanna Sau-Tuen; Cheung, Wing-Tai

    2009-12-01

    Millions of candidate clones are commonly obtained following rounds of phage-displayed antibody library panning, and expression of those selected single-chain variable fragment (scFv) is required for secondary functional screening to identify positive clones. Large scale functional screening is often hampered by the time-consuming and labor-intensive subcloning of those candidate scFv clones into a bacterial expression vector carrying an affinity tag for scFv purification and detection. To overcome the limitations and to develop a multiplex approach, an improved hexahistidine tag phagemid vector was constructed for one-step scFv expression and purification. By using hexahistidine as an affinity tag, soluble scFvs can be rapidly and cost-effectively captured from Escherichia coli periplasmic extracts. For proof-of-concept, feasibility of the improved phagemid vector was examined against two scFvs, L17E4d targeting a cell surface antigen and L18Hh5 recognizing a monoclonal antibody (mAb). Using 1 ml of Ni-NTA agarose, 0.2-0.5 mg of soluble scFv was obtained from 1 L of bacteria culture, and the purified scFvs bound specifically to their target antigens with high affinity. Moreover, using two randomly selected hapten-specific scFv phage clones, it was demonstrated that the display of scFvs on phage surface was not affected by the hexahistidine affinity tag. These results suggest the improved phagemid vector allows the shuttle of phage-displayed antibody library panning and functional scFv production. Importantly, the improved phagemid vector can be easily adapted for multiplex screening.

  12. Use of adenylate kinase as a solubility tag for high level expression of T4 DNA ligase in Escherichia coli.

    PubMed

    Liu, Xinxin; Huang, Anliang; Luo, Dan; Liu, Haipeng; Han, Huzi; Xu, Yang; Liang, Peng

    2015-05-01

    The discovery of T4 DNA ligase in the 1960s was pivotal in the spread of molecular biotechnology. The enzyme has become ubiquitous in the recombinant DNA work routinely practiced in biomedical research around the globe. Great efforts have been made to express and purify T4 DNA ligase to meet the world demand, yet over-expression of soluble T4 DNA ligase in E. coli has been difficult. Here we explore the use of adenylate kinase to enhance T4 DNA ligase expression and its downstream purification. E. coli adenylate kinase, which can be expressed in active form at high level, was fused to the N-terminus of T4 DNA ligase. The resulting His-tagged AK-T4 DNA ligase fusion protein was greatly over-expressed in E. coli, and readily purified to near homogeneity via two purification steps consisting of Blue Sepharose and Ni-NTA chromatography. The purified AK-T4 DNA ligase not only is fully active for DNA ligation, but also can use ADP in addition to ATP as energy source since adenylate kinase converts ADP to ATP and AMP. Thus adenylate kinase may be used as a solubility tag to facilitate recombinant protein expression as well as downstream purification.

  13. Sequence Tag Site and Host Range Assays Demonstrate that Radopholus similis and R. citrophilus are not Reproductively Isolated.

    PubMed

    Kaplan, D T; Vanderspool, M C; Opperman, C H

    1997-12-01

    Males of citrus-parasitic Radopholus citrophilus (FL1) were mated with non-citrus-parasitic R. similis (FL5) females. Progeny inherited a 2.4-kb sequence tag site (DK#1) and the ability to reproduce in citrus from the paternal parent (FL1); both traits were absent in the maternal line (FL5). The hybrid progeny produced offspring in roots of citrus seedlings over an 8-month period and therefore were considered reproductively viable. Genomic DNA hybridization studies indicated that one or more copies of DK#1 were present in R. citrophilus FL1. It is not likely that DK#1 represents a citrus parasitism gene because it was amplified from some burrowing nematode isolates that did not parasitize citrus and because DK#1 contains no open reading frames. Inability to reliably test individual nematodes for their ability to parasitize citrus was a constraint to obtaining F2 data required for definitive genetic characterization of citrus parasitism in burrowing nematodes, and alternate approaches will be required. Although the physical relationship of DK#1 and the citrus parasitism locus remains undefined, results of controlled mating studies using these parameters as genetic markers enabled us to identify hybrid F1 progeny. Therefore, R. similis and R. citrophilus are not sibling species since gene flow between the two does not appear to be restricted via geographic isolation (sympatric in Florida) or by genetics.

  14. Applying thiouracil tagging to mouse transcriptome analysis.

    PubMed

    Gay, Leslie; Karfilis, Kate V; Miller, Michael R; Doe, Chris Q; Stankunas, Kryn

    2014-02-01

    Transcriptional profiling is a powerful approach for studying mouse development, physiology and disease models. Here we describe a protocol for mouse thiouracil tagging (TU tagging), a transcriptome analysis technology that includes in vivo covalent labeling, purification and analysis of cell type-specific RNA. TU tagging enables the isolation of RNA from a given cell population of a complex tissue, avoiding transcriptional changes induced by cell isolation trauma, as well as the identification of actively transcribed RNAs and not preexisting transcripts. Therefore, in contrast to other cell-specific transcriptional profiling methods based on the purification of tagged ribosomes or nuclei, TU tagging provides a direct examination of transcriptional regulation. We describe how to (i) deliver 4-thiouracil to transgenic mice to thio-label cell lineage-specific transcripts, (ii) purify TU-tagged RNA and prepare libraries for Illumina sequencing and (iii) follow a straightforward bioinformatics workflow to identify cell type-enriched or differentially expressed genes. Tissue containing TU-tagged RNA can be obtained in 1 d, RNA-seq libraries can be generated within 2 d and, after sequencing, an initial bioinformatics analysis can be completed in 1 additional day.

  15. Expression of recombinant West Nile virus prM protein fused to an affinity tag for use as a diagnostic antigen.

    PubMed

    Setoh, Y X; Hobson-Peters, J; Prow, N A; Young, P R; Hall, R A

    2011-07-01

    Previous studies have concluded that the Flavivirus prM protein is a suitable viral antigen to distinguish serologically between infections with closely related Flaviviruses (Cardosa et al., 2002). To express the recombinant West Nile virus (WNV) prM antigen fused to a suitable affinity tag for purification, a series of prM-His-tag and prM-V5-tag fusion proteins were generated. Analysis of the prM-His-tag fusion proteins revealed that either prM epitopes were disrupted or the His-tag was not presented properly depending on the location of the His tag and the presence of the prM transmembrane domains in these constructs. This identified domains critical for proper folding of prM, and arrangements that allowed the correct presentation of the His-tag. However, the inclusion of the V5 epitope tag fused to the C terminus of prM allowed formation of the authentic antigenic structure of prM and the proper presentation of the V5 epitope. Capture of tagged recombinant WNV(NY99) prM antigen to the solid phase with anti-V5 antibody in ELISA enabled the detection of prM-specific antibodies in WNV(NY99)-immune horse serum, confirming its potential as a useful diagnostic reagent.

  16. EST sequencing and gene expression profiling of cultivated peanut (Arachis hypogaea L.).

    PubMed

    Bi, Yu-Ping; Liu, Wei; Xia, Han; Su, Lei; Zhao, Chuan-Zhi; Wan, Shu-Bo; Wang, Xing-Jun

    2010-10-01

    Peanut (Arachis hypogaea L.) is one of the most important oil crops in the world. However, biotechnological based improvement of peanut is far behind many other crops. It is critical and urgent to establish the biotechnological platform for peanut germplasm innovation. In this study, a peanut seed cDNA library was constructed to establish the biotechnological platform for peanut germplasm innovation. About 17,000 expressed sequence tags (ESTs) were sequenced and used for further investigation. Among which, 12.5% were annotated as metabolic related and 4.6% encoded transcription or post-transcription factors. ESTs encoding storage protein and enzymes related to protein degradation accounted for 28.8% and formed the largest group of the annotated ESTs. ESTs that encoded stress responsive proteins and pathogen-related proteins accounted for 5.6%. ESTs that encoded unknown proteins or showed no hit in the GenBank nr database accounted for 20.1% and 13.9%, respectively. A total number of 5066 EST sequences were selected to make a cDNA microarray. Expression analysis revealed that these sequences showed diverse expression patterns in peanut seeds, leaves, stems, roots, flowers, and gynophores. We also analyzed the gene expression pattern during seed development. Genes that were upregulated (≥twofold) at 15, 25, 35, and 45 days after pegging (DAP) were found and compared with 70 DAP. The potential value of these genes and their promoters in the peanut gene engineering study is discussed.

  17. DNA Sequence Heterogeneity of Campylobacter jejuni CJIE4 Prophages and Expression of Prophage Genes

    PubMed Central

    Clark, Clifford G.; Chong, Patrick M.; McCorrister, Stuart J.; Mabon, Philip; Walker, Matthew; Westmacott, Garrett R.

    2014-01-01

    Campylobacter jejuni carry temperate bacteriophages that can affect the biology or virulence of the host bacterium. Known effects include genomic rearrangements and resistance to DNA transformation. C. jejuni prophage CJIE1 shows sequence variability and variability in the content of morons. Homologs of the CJIE1 prophage enhance both adherence and invasion to cells in culture and increase the expression of a specific subset of bacterial genes. Other C. jejuni temperate phages have so far not been well characterized. In this study we describe investigations into the DNA sequence variability and protein expression in a second prophage, CJIE4. CJIE4 sequences were obtained de novo from DNA sequencing of five C. jejuni isolates, as well as from whole genome sequences submitted to GenBank by other research groups. These CJIE4 DNA sequences were heterogenous, with several different insertions/deletions (indels) in different parts of the prophage genome. Two variants of a 3–4 kb region inserted within CJIE4 had different gene content that distinguished two major conserved CJIE4 prophage families. Additional indels were detected throughout the prophage. Detection of proteins in the five isolates characterized in our laboratory in isobaric Tags for Relative and Absolute Quantitation (iTRAQ) experiments indicated that prophage proteins within each of the two large indel variants were expressed during growth of the bacteria on Mueller Hinton agar plates. These proteins included the extracellular DNase associated with resistance to DNA transformation and prophage repressor proteins. Other proteins associated with known or suspected roles in prophage biology were also expressed from CJIE4, including capsid protein, the phage integrase, and MazF, a type II toxin-antitoxin system protein. 
    Together with the results previously obtained for the CJIE1 prophage these results demonstrate that sequence variability and expression of moron genes are both general properties of temperate

  18. Expression and antigenicity of recombinant human respiratory syncytial virus glycoproteins having different affinity tags.

    PubMed

    Lee, Han Saem; Kim, A-Reum; Kim, Kisoon; Lee, Wan-Ji; Kim, Sung Soon; Kim, You-Jin

    2016-12-29

    Human respiratory syncytial virus (HRSV) is a main cause of lower respiratory tract infections in infants and the elderly. Glycoprotein (G) is a major antigen on the viral surface, and plays a key role in virus entry. Therefore, purification of the glycoprotein of HRSV is critical for the development of HRSV vaccine and serological diagnosis. In this study, we report the design and characterization of glycoprotein engineered rationally to enhance the protein solubility and to facilitate efficient purification. We permuted HRSV glycoproteins with two tags: (i) an immunoglobulin (Ig) M signal peptide and a protein A B domain tag to render HRSV glycoprotein secreted into the culture media and (ii) a foldon and 6 × histidine tag with or without transmembrane domain. Three recombinant baculoviruses were constructed: (i) transmembrane-truncated HRSV glycoprotein (amino acid positions 66-298) inserted with the N-terminal IgM signal peptide and protein A B domain (MG-GΔTM), (ii) truncated HRSV glycoprotein (amino acid positions 66-298) fused with a C-terminal foldon and 6 × histidine tag (GΔTM-FH), and (iii) full-length HRSV glycoprotein (amino acid positions 1-298) fused with a C-terminal foldon and 6 × histidine tag (G-FH). Highly soluble recombinant MG-GΔTM protein was clearly purified using one-step affinity chromatography with IgG-sepharose resin, whereas the recombinant G-FH protein and truncated GΔTM-FH were purified partially using nickel-resin. Although the antigenicity of GΔTM-FH was stronger than that of the highly mannose-rich MG-GΔTM protein, MG-GΔTM induced neutralizing antibodies efficiently in the mice to protect from infectious HRSV.

  19. Variability in the Immunodetection of His-tagged Recombinant Proteins

    PubMed Central

    Debeljak, Nataša; Feldman, Laurie; Davis, Kerry L.; Komel, Radovan; Sytkowski, Arthur J.

    2006-01-01

    Labeling of recombinant proteins with polypeptide fusion partners, or affinity tagging, is a useful method to facilitate subsequent protein purification and detection. Poly-histidine tags (His-tags) are among the most commonly used affinity tags. We report strikingly variable immunodetection of two His-tagged recombinant human erythropoietins (Epo), wild type Epo (Epowt) and Epo containing an R103A mutation (EpoR103A). Both were engineered to contain a C-terminal six residue His-tag. The cDNA constructs were stably transfected into CHO cells and COS-7 cells. Clones from the CHO cell transfections were selected for further characterization and larger-scale protein expression. Three chromatographic steps were utilized to achieve pharmacologically pure Epo. Conditioned media from the Epo-expressing cell lines and protein-containing samples from each step of purification were analyzed by SDS-PAGE and dot blot, using both monoclonal anti-human Epo antibody (AE7A5) and anti-His antibodies. While the successful incorporation of the His-tag into our constructs was confirmed by Epo binding to Ni2+-NTA resin and by μLC/MS/MS amino acid sequencing, the levels of immunodetection of His-tagged protein varied markedly depending on the particular anti His-tag antibody used. Such variability in His-tag immunorecognition can lead to critical adverse effects on several analytical methods. PMID:17081490

  20. High-yield secretion of recombinant proteins expressed in tobacco cell culture with a designer glycopeptide tag: Process development.

    PubMed

    Zhang, Ningning; Gonzalez, Maria; Savary, Brett; Xu, Jianfeng

    2016-03-01

    Low-yield protein production remains the most significant economic hurdle with plant cell culture technology. Fusions of recombinant proteins with hydroxyproline-O-glycosylated designer glycopeptide tags have consistently boosted secreted protein yields. This prompted us to study the process development of this technology aiming to achieve productivity levels necessary for commercial viability. We used a tobacco BY-2 cell culture expressing EGFP as a fusion with a glycopeptide tag composed of 32 repeats of the "Ser-Pro" dipeptide, or (SP)32, to study cell growth and protein secretion, culture scale-up, and establishment of perfusion cultures for continuous production. The BY-2 cells accumulated low levels of cell biomass (~7.5 g DW/L) in Schenk & Hildebrandt medium, but secreted high yields of (SP)32-tagged EGFP (125 mg/L). Protein productivity of the cell culture has been stable for 6.0 years. The BY-2 cells cultured in a 5-L bioreactor similarly produced high secreted protein yield at 131 mg/L. Successful operation of a cell perfusion culture for 30 days was achieved under perfusion rates of 0.25 and 0.5 day⁻¹, generating a protein volumetric productivity of 17.6 and 28.9 mg/day/L, respectively. This research demonstrates the great potential of the designer glycopeptide technology for use in commercial production of valuable proteins with plant cell cultures.
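
    The perfusion figures in the abstract above can be related with a short calculation. The sketch below assumes the common steady-state relation, volumetric productivity = perfusion rate × product concentration in the harvest; the abstract does not state this relation, and the function name and the derived concentrations are illustrative only.

```python
def implied_harvest_concentration(productivity_mg_per_l_day, perfusion_rate_per_day):
    """Back-calculate harvest concentration (mg/L).

    Assumption (not stated in the abstract): at steady state,
    volumetric productivity = perfusion rate * harvest concentration.
    """
    if perfusion_rate_per_day <= 0:
        raise ValueError("perfusion rate must be positive")
    return productivity_mg_per_l_day / perfusion_rate_per_day


# Reported pairs: (perfusion rate in 1/day, productivity in mg/day/L).
reported = [(0.25, 17.6), (0.5, 28.9)]

for rate, productivity in reported:
    conc = implied_harvest_concentration(productivity, rate)
    print(f"rate {rate}/day: about {conc:.1f} mg/L implied in the harvest")
```

    Under this assumption, doubling the perfusion rate raises productivity while the implied harvest concentration falls, which is the usual trade-off when medium is exchanged faster.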

  1. One-step affinity tag purification of full-length recombinant human AP-1 complexes from bacterial inclusion bodies using a polycistronic expression system.

    PubMed

    Wang, Wei-Ming; Lee, A-Young; Chiang, Cheng-Ming

    2008-05-01

    The AP-1 transcription factor is a dimeric protein complex formed primarily between Jun (c-Jun, JunB, JunD) and Fos (c-Fos, FosB, Fra-1, Fra-2) family members. These distinct AP-1 complexes are expressed in many cell types and modulate target gene expression implicated in cell proliferation, differentiation, and stress responses. Although the importance of AP-1 has long been recognized, the biochemical characterization of AP-1 remains limited in part due to the difficulty in purifying full-length, reconstituted dimers with active DNA-binding and transcriptional activity. Using a combination of bacterial coexpression and epitope-tagging methods, we successfully purified all 12 heterodimers (3 Jun × 4 Fos) of full-length human AP-1 complexes as well as c-Jun/c-Jun, JunD/JunD, and c-Jun/JunD dimers from bacterial inclusion bodies using one-step nickel-NTA affinity tag purification following denaturation and renaturation of coexpressed AP-1 subunits. Coexpression of two constitutive components in a dimeric AP-1 complex helps stabilize the proteins when compared with individual protein expression in bacteria. Purified dimeric AP-1 complexes are functional in sequence-specific DNA binding, as illustrated by electrophoretic mobility shift assays and DNase I footprinting, and are also active in transcription with in vitro-reconstituted human papillomavirus (HPV) chromatin containing AP-1-binding sites in the native configuration of HPV nucleosomes. The availability of these recombinant full-length human AP-1 complexes has greatly facilitated mechanistic studies of AP-1-regulated gene transcription in many biological systems.
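
    The count of 12 heterodimers in the abstract above follows from pairing each of the three Jun members with each of the four Fos members. A minimal sketch of that enumeration, using the family names given in the abstract (the function name and label format are illustrative):

```python
from itertools import product

# Family members as listed in the abstract.
JUN = ["c-Jun", "JunB", "JunD"]
FOS = ["c-Fos", "FosB", "Fra-1", "Fra-2"]


def heterodimers(jun=JUN, fos=FOS):
    """Return every Jun/Fos pairing as a 'Jun/Fos' label."""
    return [f"{j}/{f}" for j, f in product(jun, fos)]


pairs = heterodimers()
print(len(pairs))  # 3 Jun members x 4 Fos members
for label in pairs:
    print(label)
```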

  2. Transcriptional Profiling of Newly Generated Dentate Granule Cells Using TU Tagging Reveals Pattern Shifts in Gene Expression during Circuit Integration

    PubMed Central

    Chatzi, Christina; Shen, Rongkun; Goodman, Richard H.

    2016-01-01

    Despite representing only a small fraction of hippocampal granule cells, adult-generated newborn granule cells have been implicated in learning and memory (Aimone et al., 2011). Newborn granule cells undergo functional maturation and circuit integration over a period of weeks. However, it is difficult to assess the accompanying gene expression profiles in vivo with high spatial and temporal resolution using traditional methods. Here we used a novel method [“thiouracil (TU) tagging”] to map the profiles of nascent mRNAs in mouse immature newborn granule cells compared with mature granule cells. We targeted a nonmammalian uracil salvage enzyme, uracil phosphoribosyltransferase, to newborn neurons and mature granule cells using retroviral and lentiviral constructs, respectively. Subsequent injection of 4-TU tagged nascent RNAs for analysis by RNA sequencing. Several hundred genes were significantly enhanced in the retroviral dataset compared with the lentiviral dataset. We compared a selection of the enriched genes with steady-state levels of mRNAs using quantitative PCR. Ontology analysis revealed distinct patterns of nascent mRNA expression, with newly generated immature neurons showing enhanced expression for genes involved in synaptic function, and neural differentiation and development, as well as genes not previously associated with granule cell maturation. Surprisingly, the nascent mRNAs enriched in mature cells were related to energy homeostasis and metabolism, presumably indicative of the increased energy demands of synaptic transmission and their complex dendritic architecture. The high spatial and temporal resolution of our modified TU-tagging method provides a foundation for comparison with steady-state RNA analyses by traditional transcriptomic approaches in defining the functional roles of newborn neurons. PMID:27011954

  3. Efficient expression of codon-adapted human acetaldehyde dehydrogenase 2 cDNA with 6xHis tag in Pichia pastoris.

    PubMed

    Zhao, YuFeng; Lei, MingKe; Wu, YuanXin; Zhang, ZiSheng; Wang, CunWen

    2009-10-01

    Human mitochondrial acetaldehyde dehydrogenase 2 (ALDH2) catalyzes the oxidation of acetaldehyde to acetic acid. Therefore, ALDH2 has therapeutic potential in detoxification of acetaldehyde. Furthermore, ALDH2 catalyzes nitroglycerin to nitrate and 1, 2-glyceryldinitrate during therapy for angina pectoris, myocardial infarction, and heart failure. Large quantities of ALDH2 will be needed for potential clinical practice. In this study, Pichia pastoris was used as a platform for expression of human ALDH2. Based on the ALDH2*1 cDNA sequence, we designed ALDH2 cDNA by choosing the P. pastoris preferred codons and by decreasing the G + C content level. The sequence was synthesized using the overlap extension PCR method. The cDNA and 6xHis tags were subcloned into the plasmid pPIC9K. The recombinant protein was expressed in P. pastoris GS115 and purified using Ni(2+)-Sepharose affinity chromatography. The amount of secreted protein in the culture was 80 mg/L in shake-flask cultivation and 260 mg/L in high-density bioreactor fermentation. Secreted ALDH2 was easily purified from the culture supernatant by using Ni(2+)-Sepharose affinity chromatography. After purification of the fermentation supernatant, the enzyme had a specific activity of 1.2 U/mg protein. The yield was about 16 mg/L in a shake flask culture of P. pastoris GS115 which contained the original human ALDH2*1 cDNA.

  4. Analytic signal phase-based myocardial motion estimation in tagged MRI sequences by a bilinear model and motion compensation.

    PubMed

    Wang, Liang; Basarab, Adrian; Girard, Patrick R; Croisille, Pierre; Clarysse, Patrick; Delachartre, Philippe

    2015-08-01

    Different mathematical tools, such as multidimensional analytic signals, allow for the calculation of 2D spatial phases of real-valued images. The motion estimation method proposed in this paper is based on two spatial phases of the 2D analytic signal applied to cardiac sequences. By combining the information of these phases issued from analytic signals of two successive frames, we propose an analytical estimator for 2D local displacements. To improve the accuracy of the motion estimation, a local bilinear deformation model is used within an iterative estimation scheme. The main advantages of our method are: (1) The phase-based method allows the displacement to be estimated with subpixel accuracy and is robust to image intensity variation in time; (2) Preliminary filtering is not required due to the bilinear model. The proposed algorithm, integrating phase-based optical flow motion estimation and the combination of global motion compensation with local bilinear transform, allows spatio-temporal cardiac motion analysis, e.g. strain and dense trajectory estimation over the cardiac cycle. Results from 7 realistic simulated tagged magnetic resonance imaging (MRI) sequences show that our method is more accurate than a state-of-the-art method for cardiac motion analysis and another differential approach from the literature. The motion estimation errors (end point error) of the proposed method are reduced by about 33% compared with those of the two methods. In our work, the frame-to-frame displacements are further accumulated in time, to allow for the calculation of myocardial Lagrangian cardiac strains and point trajectories. Indeed, from the estimated trajectories in time on 11 in vivo data sets (9 patients and 2 healthy volunteers), the shape of myocardial point trajectories belonging to pathological regions are clearly reduced in magnitude compared with the ones from normal regions. Myocardial point trajectories, estimated from our phase-based analytic
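
    The abstract above reports accuracy as end point error. As a rough illustration, the sketch below uses the usual definition of that metric, the mean Euclidean distance between estimated and reference 2D displacement vectors; the abstract does not spell out its exact formula, and the function name and sample vectors are hypothetical.

```python
import math


def end_point_error(estimated, reference):
    """Mean Euclidean distance between paired 2D displacement vectors.

    Both arguments are lists of (u, v) tuples in pixels, one per point.
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for (u_est, v_est), (u_ref, v_ref) in zip(estimated, reference):
        total += math.hypot(u_est - u_ref, v_est - v_ref)
    return total / len(estimated)


# Hypothetical displacements for two tracked points.
estimated = [(1.0, 0.0), (0.0, 2.0)]
reference = [(1.0, 0.0), (3.0, 6.0)]
print(end_point_error(estimated, reference))  # (0 + 5) / 2 = 2.5
```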

  5. Mulberry (Morus L.) methionine sulfoxide reductase gene cloning, sequence analysis, and expression in plant development and stress response.

    PubMed

    Tong, Wei; Zhang, Yinghua; Wang, Heng; Li, Feng; Liu, Zhaoyue; Wang, Yuhua; Fang, Rongjun; Zhao, Weiguo; Li, Long

    2013-01-01

    Methionine sulfoxide reductase plays a regulatory role in plant growth and development, especially in scavenging reactive oxygen species by restoration of the oxidation of methionine in protein. A full-length cDNA sequence encoding methionine sulfoxide reductase (MSR) from mulberry, which we designated MMSR, was cloned based on mulberry expressed sequence tags (ESTs). Sequence analysis showed that the MMSR is 810 bp long, encoding 194 amino acids with a predicted molecular weight of 21.6 kDa and an isoelectric point of 6.78. The expression level of the MMSR gene under conditions of drought and salt stresses was quantified by qRT-PCR. The results show that the expression level changed significantly under the stress conditions compared to the normal growth environment. It helps us to get a better understanding of the molecular basis for signal transduction mechanisms underlying the stress response in mulberry.

  6. Sequence determinants of prokaryotic gene expression level under heat stress.

    PubMed

    Xiong, Heng; Yang, Yi; Hu, Xiao-Pan; He, Yi-Ming; Ma, Bin-Guang

    2014-11-01

    Prokaryotic gene expression is environment-dependent and temperature plays an important role in shaping the gene expression profile. Revealing the regulation mechanisms of gene expression pertaining to temperature has attracted tremendous efforts in recent years, particularly owing to the yielding of transcriptome and proteome data by high-throughput techniques. However, most of the previous works concentrated on the characterization of the gene expression profile of individual organisms and little effort has been made to disclose the commonality among organisms, especially for the gene sequence features. In this report, we collected the transcriptome and proteome data measured under heat stress condition from recently published literature and studied the sequence determinants for the expression level of heat-responsive genes on multiple layers. Our results showed that there indeed exist commonness and consistent patterns of the sequence features among organisms for the differentially expressed genes under heat stress condition. Some features are attributed to the requirement of thermostability while some are dominated by gene function. The revealed sequence determinants of bacterial gene expression level under heat stress complement the knowledge about the regulation factors of prokaryotic gene expression responding to the change of environmental conditions. Furthermore, comparisons to thermophilic adaption have been performed to reveal the similarity and dissimilarity of the sequence determinants for the response to heat stress and for the adaption to high habitat temperature, which elucidates the complex landscape of gene expression related to the same physical factor of temperature.

  7. Affinity Purification of a Recombinant Protein Expressed as a Fusion with the Maltose-Binding Protein (MBP) Tag.

    PubMed

    Duong-Ly, Krisna C; Gabelli, Sandra B

    2015-01-01

    Expression of fusion proteins such as MBP fusions can be used as a way to improve the solubility of the expressed protein in E. coli (Fox and Waugh, 2003; Nallamsetty et al., 2005; Nallamsetty and Waugh, 2006) and as a way to introduce an affinity purification tag. The protocol that follows was designed by the authors as a first step in the purification of a recombinant protein fused with MBP, using fast protein liquid chromatography (FPLC). Cells should have been thawed, resuspended in binding buffer, and lysed by sonication or microfluidization before mixing with the amylose resin or loading on the column. Slight modifications to this protocol may be made to accommodate both the protein of interest and the availability of equipment.

  8. Fusion tags and chaperone co-expression modulate both the solubility and the inclusion body features of the recombinant CLIPB14 serine protease.

    PubMed

    Schrödel, Andrea; Volz, Jennifer; de Marco, Ario

    2005-10-17

    Chaperone co-expression and the fusion to different tags were used to modify the aggregation pattern of the putative serine protease CLIPB14 precipitated in Escherichia coli inclusion bodies. A set of common tags used in expression vectors has been selected, as well as two bacterial strains over-expressing the chaperones GroELS and ibpA/B, respectively. The presence of the fused tags resulted in an improved solubility of CLIPB14 but also in a higher presence of contaminants in the inclusion bodies, while chaperone co-expression promoted the binding of all the chaperone machinery involved into the disaggregation to the CLIPB14. Furthermore, each tag influenced in a specific manner the re-aggregation of the denatured CLIPB14 constructs during urea dilution and the preliminary trials indicated that the CLIPB14 fusions with higher homogeneity and lower re-aggregation rate were the optimal candidates for refolding assays. In conclusion, it is possible to tune the quality of the inclusion bodies by choosing the suitable combination of tag and chaperone co-expression that minimize the non-productive side reactions during refolding.

  9. Birbeck granule-like "organized smooth endoplasmic reticulum" resulting from the expression of a cytoplasmic YFP-tagged langerin.

    PubMed

    Lenormand, Cédric; Spiegelhalter, Coralie; Cinquin, Bertrand; Bardin, Sabine; Bausinger, Huguette; Angénieux, Catherine; Eckly, Anita; Proamer, Fabienne; Wall, David; Lich, Ben; Tourne, Sylvie; Hanau, Daniel; Schwab, Yannick; Salamero, Jean; de la Salle, Henri

    2013-01-01

    Langerin is required for the biogenesis of Birbeck granules (BGs), the characteristic organelles of Langerhans cells. We previously used a Langerin-YFP fusion protein having a C-terminal luminal YFP tag to dynamically decipher the molecular and cellular processes which accompany the traffic of Langerin. In order to elucidate the interactions of Langerin with its trafficking effectors and their structural impact on the biogenesis of BGs, we generated a YFP-Langerin chimera with an N-terminal, cytosolic YFP tag. This latter fusion protein induced the formation of YFP-positive large puncta. Live cell imaging coupled to a fluorescence recovery after photobleaching approach showed that this coalescence of proteins in newly formed compartments was static. In contrast, the YFP-positive structures present in the pericentriolar region of cells expressing Langerin-YFP chimera, displayed fluorescent recovery characteristics compatible with active membrane exchanges. Using correlative light-electron microscopy we showed that the coalescent structures represented highly organized stacks of membranes with a pentalaminar architecture typical of BGs. Continuities between these organelles and the rough endoplasmic reticulum allowed us to identify the stacks of membranes as a form of "Organized Smooth Endoplasmic Reticulum" (OSER), with distinct molecular and physiological properties. The involvement of homotypic interactions between cytoplasmic YFP molecules was demonstrated using an A206K variant of YFP, which restored most of the Langerin traffic and BG characteristics observed in Langerhans cells. Mutation of the carbohydrate recognition domain also blocked the formation of OSER. Hence, a "double-lock" mechanism governs the behavior of YFP-Langerin, where asymmetric homodimerization of the YFP tag and homotypic interactions between the lectin domains of Langerin molecules participate in its retention and the subsequent formation of BG-like OSER. 
    These observations confirm that

  10. Chromatin Interaction Analysis with Paired-End Tag Sequencing (ChIA-PET) for Mapping Chromatin Interactions and Understanding Transcription Regulation

    PubMed Central

    Poh, Huay Mei; Peh, Su Qin; Ong, Chin Thing; Zhang, Jingyao; Ruan, Xiaoan; Ruan, Yijun

    2012-01-01

    Genomes are organized into three-dimensional structures, adopting higher-order conformations inside the micron-sized nuclear spaces [7, 2, 12]. Such architectures are not random and involve interactions between gene promoters and regulatory elements [13]. The binding of transcription factors to specific regulatory sequences brings about a network of transcription regulation and coordination [1, 14]. Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) was developed to identify these higher-order chromatin structures [5, 6]. Cells are fixed and interacting loci are captured by covalent DNA-protein cross-links. To minimize non-specific noise and reduce complexity, as well as to increase the specificity of the chromatin interaction analysis, chromatin immunoprecipitation (ChIP) is used against specific protein factors to enrich chromatin fragments of interest before proximity ligation. Ligation involving half-linkers subsequently forms covalent links between pairs of DNA fragments tethered together within individual chromatin complexes. The flanking MmeI restriction enzyme sites in the half-linkers allow extraction of paired end tag-linker-tag constructs (PETs) upon MmeI digestion. As the half-linkers are biotinylated, these PET constructs are purified using streptavidin-magnetic beads. The purified PETs are ligated with next-generation sequencing adaptors and a catalog of interacting fragments is generated via next-generation sequencers such as the Illumina Genome Analyzer. Mapping and bioinformatics analysis is then performed to identify ChIP-enriched binding sites and ChIP-enriched chromatin interactions [8]. We have produced a video to demonstrate critical aspects of the ChIA-PET protocol, especially the preparation of ChIP as the quality of ChIP plays a major role in the outcome of a ChIA-PET library. As the protocols are very long, only the critical steps are shown in the video. PMID:22564980

  11. Power of deep sequencing and Agilent microarray for gene expression profiling study.

    PubMed

    Feng, Lin; Liu, Hang; Liu, Yu; Lu, Zhike; Guo, Guangwu; Guo, Suping; Zheng, Hongwei; Gao, Yanning; Cheng, Shujun; Wang, Jian; Zhang, Kaitai; Zhang, Yong

    2010-06-01

    Next-generation sequencing-based Digital Gene Expression tag profiling (DGE) has been used to study the changes in gene expression profiling. To compare the quality of the data generated by microarray and DGE, we examined the gene expression profiles of an in vitro cell model with these platforms. In this study, 17,362 and 15,938 genes were detected by microarray and DGE, respectively, with 13,221 overlapping genes. The correlation coefficients between the technical replicates were >0.99 and the detection variance was <9% for both platforms. The dynamic range of microarray was fixed with four orders of magnitude, whereas that of DGE was extendable. The consistency of the two platforms was high, especially for those abundant genes. It was more difficult for the microarray to distinguish the expression variation of less abundant genes. Although microarrays might be eventually replaced by DGE or transcriptome sequencing (RNA-seq) in the near future, microarrays are still stable, practical, and feasible, which may be useful for most biological researchers.
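
    The detection counts in the abstract above (17,362 genes by microarray, 15,938 by DGE, 13,221 shared) imply platform-specific and combined totals that the abstract does not list. A small sketch of that set arithmetic; the function and key names are illustrative, and the derived values are computed here rather than taken from the paper.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize the overlap of two detected-gene sets from their sizes."""
    if n_shared > min(n_a, n_b):
        raise ValueError("shared count cannot exceed either set size")
    union = n_a + n_b - n_shared  # inclusion-exclusion
    return {
        "only_a": n_a - n_shared,
        "only_b": n_b - n_shared,
        "union": union,
        "shared_fraction_of_a": n_shared / n_a,
        "shared_fraction_of_b": n_shared / n_b,
        "jaccard": n_shared / union,
    }


# Counts reported in the abstract: microarray, DGE, overlap.
summary = overlap_summary(17362, 15938, 13221)
print(summary["only_a"], summary["only_b"], summary["union"])
print(f"{summary['shared_fraction_of_a']:.1%} of microarray genes were also seen by DGE")
print(f"{summary['shared_fraction_of_b']:.1%} of DGE genes were also seen by microarray")
```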

  12. Optimized soluble expression and purification of an aggregation-prone protein by fusion tag systems and on-column cleavage in Escherichia coli.

    PubMed

    Li, Wen; Gao, Mingming; Liu, Wenchao; Kong, Yuelin; Tian, Hong; Yao, Wenbing; Gao, Xiangdong

    2012-12-01

    Previously we constructed a fusion protein based on GLP-1 and globular adiponectin but unfortunately its yield was low because it was mainly expressed as inclusion bodies. Herein to optimize the soluble expression of this fusion protein we tried several fusion tag systems. Fusion tags, including GST-, Trx- and MBP-tag, greatly improved the soluble expression of the fusion protein. However, these tag-fusion proteins were aggregation-prone as judged by Native PAGE and gel filtration chromatography, and this aggregation reduced the specificity of enterokinase-mediated enzyme cleavage which was essential to remove the fusion tags. To improve the specificity of protein cleavage, we employed on-column cleavage for downstream purification. Finally using optimized expression followed by on-column cleavage, we obtained the product fusion protein with a yield of 1.2 mg per g wet bacterial cells which was 8-fold higher than before. This method improved the yield and simplified the process, and as a convenient method it can also be used for the preparation of other aggregation-prone proteins.

  13. Method to produce acetyldiacylglycerols (ac-TAGs) by expression of an acetyltransferase gene isolated from Euonymus alatus (burning bush)

    DOEpatents

    Durrett, Timothy; Ohlrogge, John; Pollard, Michael

    2016-05-03

    The present invention relates to novel diacylglycerol acyltransferase genes and proteins, and methods of their use. In particular, the invention describes genes encoding proteins having diacylglycerol acetyltransferase activity, specifically for transferring an acetyl group to a diacylglycerol substrate to form acetyl-triacylglycerols (ac-TAGs), for example, a 3-acetyl-1,2-diacyl-sn-glycerol. The present invention encompasses both native and recombinant wild-type forms of the transferase, as well as mutants and variant forms. The present invention also relates to methods of using novel diacylglycerol acyltransferase genes and proteins, including their expression in transgenic organisms at commercially viable levels, for increasing production of 3-acetyl-1,2-diacyl-sn-glycerols in plant oils and altering the composition of oils produced by microorganisms, such as yeast, by increasing ac-TAG production. Additionally, oils produced by methods of the present invention comprising genes and proteins are contemplated for use as biodiesel fuel, in polymer production and as naturally produced food oils with reduced calories.

  14. New shuttle vector-based expression system to generate polyhistidine-tagged fusion proteins in Staphylococcus aureus and Escherichia coli.

    PubMed

    Schwendener, Sybille; Perreten, Vincent

    2015-05-01

    Four Staphylococcus aureus-Escherichia coli shuttle vectors were constructed for gene expression and production of tagged fusion proteins. Vectors pBUS1-HC and pTSSCm have no promoter upstream of the multiple cloning site (MCS), and this allows study of genes under the control of their native promoters, and pBUS1-Pcap-HC and pTSSCm-Pcap contain the strong constitutive promoter of S. aureus type 1 capsule gene 1A (Pcap) upstream of a novel MCS harboring codons for the peptide tag Arg-Gly-Ser-hexa-His (rgs-his6). All plasmids contained the backbone derived from pBUS1, including the E. coli origin ColE1, five copies of terminator rrnB T1, and tetracycline resistance marker tet(L) for S. aureus and E. coli. The minimum pAMα1 replicon from pBUS1 was improved through either complementation with the single-strand origin oriL from pUB110 (pBUS1-HC and pBUS1-Pcap-HC) or substitution with a pT181-family replicon (pTSSCm and pTSSCm-Pcap). The new constructs displayed increased plasmid yield and segregational stability in S. aureus. Furthermore, pBUS1-Pcap-HC and pTSSCm-Pcap offer the potential to generate C-terminal RGS-His6 translational fusions of cloned genes using simple molecular manipulation. BcgI-induced DNA excision followed by religation converts the TGA stop codon of the MCS into a TGC codon and links the rgs-his6 codons to the 3' end of the target gene. The generation of the rgs-his6 codon-fusion, gene expression, and protein purification were demonstrated in both S. aureus and E. coli using the macrolide-lincosamide-streptogramin B resistance gene erm(44) inserted downstream of Pcap. The new His tag expression system represents a helpful tool for the direct analysis of target gene function in staphylococcal cells.

  15. Epitope-Tagged Autotransporters as Single-Cell Reporters for Gene Expression by a Salmonella Typhimurium wbaP Mutant

    PubMed Central

    Curkić, Ismeta; Schütz, Monika; Oberhettinger, Philipp; Diard, Médéric; Claassen, Manfred; Linke, Dirk; Hardt, Wolf-Dietrich

    2016-01-01

    Phenotypic diversity is an important trait of bacterial populations and can enhance fitness of the existing genotype in a given environment. To characterize different subpopulations, several studies have analyzed differential gene expression using fluorescent reporters. These studies visualized either single or multiple genes within single cells using different fluorescent proteins. However, variable maturation and folding kinetics of different fluorophores complicate the study of dynamics of gene expression. Here, we present a proof-of-principle study for an alternative gene expression system in a wbaP mutant of Salmonella Typhimurium (S. Tm) lacking the O-sidechain of the lipopolysaccharide. We employed the hemagglutinin (HA)-tagged inverse autotransporter invasin (invAHA) as a transcriptional reporter for the expression of the type three secretion system 1 (T1) in S. Tm. Using a two-reporter approach with GFP and the InvAHA in single cells, we verify that this reporter system can be used for T1 gene expression analysis, at least in strains lacking the O-antigen (wbaP), which are permissive for detection of the surface-exposed HA-epitope. When we placed the two reporters gfp and invAHA under the control of either one or two different promoters of the T1 regulon, we were able to show correlative expression of both reporters. We conclude that the invAHA reporter system is a suitable tool to analyze T1 gene expression in S. Tm and propose its applicability as a molecular tool for gene expression studies within single cells. PMID:27149272

  16. Expression profiling and comparative sequence derived insights into lipid metabolism

    SciTech Connect

    Callow, Matthew J.; Rubin, Edward M.

    2001-12-19

    Expression profiling and genomic DNA sequence comparisons are increasingly being applied to the identification and analysis of the genes involved in lipid metabolism. Not only has genome-wide expression profiling aided in the identification of novel genes involved in important processes in lipid metabolism such as sterol efflux, but the utilization of information from these studies has added to our understanding of the regulation of pathways participating in the process. Coupled with these gene expression studies, cross species comparison, searching for sequences conserved through evolution, has proven to be a powerful tool to identify important non-coding regulatory sequences as well as the discovery of novel genes relevant to lipid biology. An example of the value of this approach was the recent chance discovery of a new apolipoprotein gene (apo AV) that has dramatic effects upon triglyceride metabolism in mice and humans.

  17. Abiotic stress-related expressed sequence tags from the diploid strawberry Fragaria vesca f. semperflorens

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Plants of the Rosaceae family such as the large-statured tree crops apple, peach and cherry, the ornamental rose, the cultivated octoploid strawberry, F. ×ananassa, represent a considerable share of horticultural crops worldwide, many of which fail to produce maximally due to environment...

  18. A chromosome bin map of 2148 expressed sequence tag loci of wheat homoeologous group 7.

    PubMed

    Hossain, K G; Kalavacharla, V; Lazo, G R; Hegstad, J; Wentz, M J; Kianian, P M A; Simons, K; Gehlhar, S; Rust, J L; Syamala, R R; Obeori, K; Bhamidimarri, S; Karunadharma, P; Chao, S; Anderson, O D; Qi, L L; Echalier, B; Gill, B S; Linkiewicz, A M; Ratnasiri, A; Dubcovsky, J; Akhunov, E D; Dvorák, J; Miftahudin; Ross, K; Gustafson, J P; Radhawa, H S; Dilbirligi, M; Gill, K S; Peng, J H; Lapitan, N L V; Greene, R A; Bermudez-Kandianis, C E; Sorrells, M E; Feril, O; Pathan, M S; Nguyen, H T; Gonzalez-Hernandez, J L; Conley, E J; Anderson, J A; Choi, D W; Fenton, D; Close, T J; McGuire, P E; Qualset, C O; Kianian, S F

    2004-10-01

    The objectives of this study were to develop a high-density chromosome bin map of homoeologous group 7 in hexaploid wheat (Triticum aestivum L.), to identify gene distribution in these chromosomes, and to perform comparative studies of wheat with rice and barley. We mapped 2148 loci from 919 EST clones onto group 7 chromosomes of wheat. In the majority of cases the numbers of loci were significantly lower in the centromeric regions and tended to increase in the distal regions. The level of duplicated loci in this group was 24% with most of these loci being localized toward the distal regions. One hundred nineteen EST probes that hybridized to three fragments and mapped to the three group 7 chromosomes were designated landmark probes and were used to construct a consensus homoeologous group 7 map. An additional 49 probes that mapped to 7AS, 7DS, and the ancestral translocated segment involving 7BS also were designated landmarks. Landmark probe orders and comparative maps of wheat, rice, and barley were produced on the basis of corresponding rice BAC/PAC and genetic markers that mapped on chromosomes 6 and 8 of rice. Identification of landmark ESTs and development of consensus maps may provide a framework of conserved coding regions predating the evolution of wheat genomes.

  19. Generation, Analysis and Functional Annotation of Expressed Sequence Tags from the Sheepshead Minnow (Cyprinodon variegatus)

    DTIC Science & Technology

    2010-01-01

    Exposure temperatures were maintained at 26-28°C under a 16 h light:8 h dark photoperiod. Larvae were sampled at 1, 3, 5, and 7 days post hatch and stored...flow-through exposure system and a 16 h light:8 h dark photoperiod. Throughout the experiments, water temperature and salinity were maintained at 27 ± 1

  20. Expressed sequence tags from the black-winged sharpshooter: Application to biology and vector control

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified 14 putative full-length transcripts of proteins important for the survival of the black-winged sharpshooter, BWSS, Oncometopia nigricans. The BWSS is considered a highly competent vector of several strains of the xylem-inhabiting bacterium Xylella fastidiosa, the causal agent of a numb...

  1. Microsatellite markers derived from Quercus mongolica var. crispula (Fagaceae) inner bark expressed sequence tags.

    PubMed

    Ueno, Saneyoshi; Taguchi, Yuriko; Tsumura, Yoshihiko

    2008-04-01

    In reforestation programs the genetic composition and diversity of populations that could be used as sources of planting material need to be carefully considered to maximize the chances of successful establishment. For such purposes genetic analyses that include the identification of functional genes are required. In this study, we constructed a cDNA library from inner bark of Quercus mongolica (which is widely distributed in Japan) and collected 3385 ESTs. After constructing 2140 unigenes, 274 microsatellites were found within them. The most frequent microsatellite had an AG motif (48%) and the next most common had an AAG motif (12%). There were no CG repeats in the unigenes. In total, 20 EST-SSR markers were developed, polymorphisms of which were described by using eight individuals from eight populations over the species' distributional range. The number of alleles per locus (Na) and observed heterozygosity (H(o)) ranged from 2 to 12, and from 0.25 to 1.00, respectively. Cross-species amplification was successful for 19 loci in eight individuals of Q. serrata and for 20 loci in eight individuals of Q. dentata, with values of Na and H(o) comparable to those of Q. mongolica. The EST-SSR markers characterized in this study should facilitate the analysis of genetic diversity in future studies.
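
    As a rough illustration of how microsatellites (SSRs) such as the AG and AAG repeats mentioned above can be located in unigene sequences, the sketch below scans a DNA string for di- and trinucleotide tandem repeats with a regular expression. The minimum repeat counts, the function name, and the test sequence are assumptions made for this example; the abstract does not state which software or thresholds the authors used.

```python
import re

# Hypothetical thresholds: minimum number of tandem copies per motif length.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    Only di- and trinucleotide motifs are considered by default, and
    single-base runs such as AAAAAA are skipped.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_copies in min_repeats.items():
        # Group 1: the whole repeat tract; group 2: one copy of the motif.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_copies - 1)
        )
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            if len(set(motif)) == 1:
                continue  # homopolymer run, not a di-/trinucleotide SSR
            copies = len(match.group(1)) // motif_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


# Made-up sequence containing one AG repeat and one AAG repeat.
example = "TTT" + "AG" * 7 + "CCGT" + "AAG" * 5 + "T"
ssrs = find_ssrs(example)
print(ssrs)
```

    Tallying the `motif` field over all unigenes would give motif frequencies of the kind reported in the abstract, although equivalent motifs (for example AG, GA, CT, and TC) would first need to be merged into one class.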

  2. Identification of host immune regulation candidate genes of Toxascaris leonina by expression sequenced tags (ESTs) analysis.

    PubMed

    Cho, Min Kyoung; Lee, Keun Hee; Lee, Sun Joo; Kang, Se Won; Ock, Mee Sun; Hong, Yeon Chul; Lee, Yong Seok; Yu, Hak Sun

    2009-10-14

    Toxascaris leonina adult worms live in the gastrointestinal tract of dogs, cats, and foxes, releasing eggs which enter the environment by the fecal route. Previously, we reported that T. leonina adult worm derived protein was able to inhibit OVA-specific Th2 responses, and in particular, immunization with parasite proteins exerts a more profound protective effect than allergen treatment. In order to gain greater insight into the relevant immune evasion mechanisms as well as basic scientific information, we have generated ESTs of T. leonina adult female worm and investigated their functions using euKaryotic Orthologous Groups (KOG) database analysis. From the randomly selected plasmids containing DNA inserts, a total of 487 reads were collected from the T. leonina adult worm cDNA library. The annotated ESTs were classified into 25 KOG categories; the largest share of ESTs (7.90%) was annotated with energy production and conversion, and the second most frequent category was translation, ribosomal structure and biogenesis (7.69%). We also identified many host-parasite immune related genes including C-type lectin, galectin, SXP, and cathepsin L-like cysteine protease coding genes. More information on these genes is needed to understand the mechanisms of immune evasion of Toxascaris.

  3. Analysis of expressed sequence tags derived from a compatible Mycosphaerella fijiensis-banana interaction.

    PubMed

    Portal, Orelvis; Izquierdo, Yovanny; De Vleesschauwer, David; Sánchez-Rodríguez, Aminael; Mendoza-Rodríguez, Milady; Acosta-Suárez, Mayra; Ocaña, Bárbara; Jiménez, Elio; Höfte, Monica

    2011-05-01

    Mycosphaerella fijiensis, a hemibiotrophic fungus, is the causal agent of black leaf streak disease, the most serious foliar disease of bananas and plantains. To analyze the compatible interaction of M. fijiensis with Musa spp., a suppression subtractive hybridization (SSH) cDNA library was constructed to identify transcripts induced at late stages of infection in the host and the pathogen. In addition, a full-length cDNA library was created from the same mRNA starting material as the SSH library. The SSH procedure was effective in identifying specific genes predicted to be involved in plant-fungal interactions and new information was obtained mainly about genes and pathways activated in the plant. Several plant genes predicted to be involved in the synthesis of phenylpropanoids and detoxification compounds were identified, as well as pathogenesis-related proteins that could be involved in the plant response against M. fijiensis infection. At late stages of infection, jasmonic acid and ethylene signal transduction pathways appear to be active, which corresponds with the necrotrophic lifestyle of M. fijiensis. Quantitative PCR experiments revealed that antifungal genes encoding PR proteins and GDSL-like lipase are only transiently induced 30 days post inoculation (dpi), indicating that the fungus is probably actively repressing plant defense. The only fungal gene found was induced 37 dpi and encodes UDP-glucose pyrophosphorylase, an enzyme involved in the biosynthesis of trehalose. Trehalose biosynthesis was probably induced in response to prior activation of plant antifungal genes and may act as an osmoprotectant against membrane damage.

  4. cisExpress: motif detection in DNA sequences

    PubMed Central

    Triska, Martin; Grocutt, David; Southern, James; Murphy, Denis J.; Tatarinova, Tatiana

    2013-01-01

    Motivation: One of the major challenges for contemporary bioinformatics is the analysis and accurate annotation of genomic datasets to enable extraction of useful information about the functional role of DNA sequences. This article describes a novel genome-wide statistical approach to the detection of specific DNA sequence motifs based on similarities between the promoters of similarly expressed genes. This new tool, cisExpress, is especially designed for use with large datasets, such as those generated by publicly accessible whole genome and transcriptome projects. cisExpress uses a task farming algorithm to exploit all available computational cores within a shared memory node. We demonstrate the robust nature and validity of the proposed method. It is applicable for use with a wide range of genomic databases for any species of interest. Availability: cisExpress is available at www.cisexpress.org. Contact: tatiana.tatarinova@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23793750

  5. Expression Profile of Ectopic Olfactory Receptors Determined by Deep Sequencing

    PubMed Central

    Flegel, Caroline; Manteniotis, Stavros; Osthold, Sandra; Hatt, Hanns; Gisselmann, Günter

    2013-01-01

    Olfactory receptors (ORs) provide the molecular basis for the detection of volatile odorant molecules by olfactory sensory neurons. The OR supergene family encodes G-protein coupled proteins that belong to the seven-transmembrane-domain receptor family. It was initially postulated that ORs are exclusively expressed in the olfactory epithelium. However, recent studies have demonstrated ectopic expression of some ORs in a variety of other tissues. In the present study, we conducted a comprehensive expression analysis of ORs using an extended panel of human tissues. This analysis made use of recent dramatic technical developments of the so-called Next Generation Sequencing (NGS) technique, which encouraged us to use open access data for the first comprehensive RNA-Seq expression analysis of ectopically expressed ORs in multiple human tissues. We analyzed mRNA-Seq data obtained by Illumina sequencing of 16 human tissues available from Illumina Body Map project 2.0 and from an additional study of OR expression in testis. At least some ORs were expressed in all the tissues analyzed. In several tissues, we could detect broadly expressed ORs such as OR2W3 and OR51E1. We also identified ORs that showed exclusive expression in one investigated tissue, such as OR4N4 in testis. For some ORs, the coding exon was found to be part of a transcript of upstream genes. In total, 111 of 400 OR genes were expressed with an FPKM (fragments per kilobase of exon per million fragments mapped) higher than 0.1 in at least one tissue. For several ORs, mRNA expression was verified by RT-PCR. Our results support the idea that ORs are broadly expressed in a variety of tissues and provide the basis for further functional studies. PMID:23405139
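
    The FPKM measure defined in this abstract (fragments per kilobase of exon per million fragments mapped) and the 0.1 expression cutoff can be illustrated with a short calculation. This is a minimal sketch of the standard FPKM formula; the gene labels and fragment counts are invented for the example and are not data from the study.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # fragments / (exon length in kb) / (library size in millions)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Cutoff taken from the abstract; everything below it is hypothetical.
EXPRESSION_CUTOFF = 0.1

# label: (fragments mapped to the gene, exon length in bp)
hypothetical_counts = {
    "gene_a": (50, 1000),
    "gene_b": (1, 1000),
}
library_size = 20_000_000  # total mapped fragments in the sample (made up)

expressed = {}
for gene, (fragments, length_bp) in hypothetical_counts.items():
    value = fpkm(fragments, length_bp, library_size)
    expressed[gene] = value > EXPRESSION_CUTOFF
    print(gene, value, expressed[gene])
```

    Because FPKM normalizes by both transcript length and library size, a fixed cutoff such as 0.1 can be applied across tissues sequenced to different depths.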

  6. Expression and purification of short hydrophobic elastin-like polypeptides with maltose-binding protein as a solubility tag.

    PubMed

    Bataille, Laure; Dieryck, Wilfrid; Hocquellet, Agnès; Cabanne, Charlotte; Bathany, Katell; Lecommandoux, Sébastien; Garbay, Bertrand; Garanger, Elisabeth

    2015-06-01

    Elastin-like polypeptides (ELPs) are biodegradable polymers with interesting physico-chemical properties for biomedical and biotechnological applications. The recombinant expression of hydrophobic elastin-like polypeptides is often difficult because they possess low transition temperatures, and therefore form aggregates at sub-ambient temperatures. To circumvent this difficulty, we expressed in Escherichia coli three hydrophobic ELPs (VPGIG)n with variable lengths (n=20, 40, and 60) in fusion with the maltose-binding protein (MBP). Fusion proteins were soluble and yields of purified MBP-ELP ranged between 66 and 127 mg/L culture. After digestion of the fusion proteins by enterokinase, the ELP moiety was purified by using inverse transition cycling. The purified fraction containing ELP40 was slightly contaminated by traces of undigested fusion protein. Purification of ELP60 was impaired because of co-purification of the MBP tag during inverse transition cycling. ELP20 was successfully purified to homogeneity, as assessed by gel electrophoresis and mass spectrometry analyses. The transition temperature of ELP20 was measured at 15.4°C in low salt buffer. In conclusion, this method can be used to produce hydrophobic ELP of low molecular mass.

  7. Cloning, expression, purification and characterization of his-tagged human glucose-6-phosphate dehydrogenase: a simplified method for protein yield.

    PubMed

    Gómez-Manzo, Saúl; Terrón-Hernández, Jessica; de la Mora-de la Mora, Ignacio; García-Torres, Itzhel; López-Velázquez, Gabriel; Reyes-Vivas, Horacio; Oria-Hernández, Jesús

    2013-10-01

    Glucose-6-phosphate dehydrogenase (G6PD) catalyzes the first step of the pentose phosphate pathway. In erythrocytes, the functionality of the pathway is crucial to protect these cells against oxidative damage. G6PD deficiency is the most frequent enzymopathy in humans with a global prevalence of 4.9%. The clinical picture is characterized by chronic or acute hemolysis in response to oxidative stress, which is related to the low cellular activity of G6PD in red blood cells. The disease is heterogeneous at the genetic level with around 160 mutations described, mostly point mutations causing single amino acid substitutions. Biochemical studies that describe the detrimental effects of mutations on the functional and structural properties of human G6PD are indispensable for understanding the molecular physiopathology of this disease. Therefore, reliable systems for efficient expression and purification of the protein are highly desirable. In this work, human G6PD was heterologously expressed in Escherichia coli and purified by immobilized metal affinity chromatography in a single chromatographic step. The structural and functional characterization indicates that His-tagged G6PD resembles previous preparations of recombinant G6PD. In contrast with previous protein yield systems, our method is based on commonly available resources and fully accessible laboratory equipment; therefore, it can be readily implemented.

  8. Purification and Refolding to Amyloid Fibrils of (His)6-tagged Recombinant Shadoo Protein Expressed as Inclusion Bodies in E. coli.

    PubMed

    Li, Qiaojing; Richard, Charles-Adrien; Moudjou, Mohammed; Vidic, Jasmina

    2015-12-19

    The Escherichia coli expression system is a powerful tool for the production of recombinant eukaryotic proteins. We use it to produce Shadoo, a protein belonging to the prion family. A chromatographic method for the purification of (His)6-tagged recombinant Shadoo expressed as inclusion bodies is described. The inclusion bodies are solubilized in 8 M urea and bound to a Ni(2+)-charged column to perform ion affinity chromatography. Bound proteins are eluted by a gradient of imidazole. Fractions containing Shadoo protein are subjected to size exclusion chromatography to obtain a highly purified protein. In the final step purified Shadoo is desalted to remove salts, urea and imidazole. Recombinant Shadoo protein is an important reagent for biophysical and biochemical studies of protein conformation disorders occurring in prion diseases. Many reports demonstrated that prion neurodegenerative diseases originate from the deposition of stable, ordered amyloid fibrils. Sample protocols describing how to fibrillate Shadoo into amyloid fibrils at acidic and neutral/basic pHs are presented. The methods on how to produce and fibrillate Shadoo can facilitate research in laboratories working on prion diseases, since it allows for production of large amounts of protein in a rapid and low cost manner.

  9. P2A-Fluorophore Tagging of BRAF Tightly Links Expression to Fluorescence In Vivo

    PubMed Central

    McMahon, Martin

    2016-01-01

    The Braf proto-oncogene is a key component of the mitogen-activated protein kinase signaling cascade and is a critical regulator of both normal development and tumorigenesis in a variety of tissues. In order to elucidate BRAF’s differing roles in varying cell types, it is important to understand both the pattern and timing of BRAF expression. Here we report the production of a mouse model that links the expression of Braf with the bright red fluorescent protein, tdTomato. We have utilized a P2A knock-in strategy, ensuring that BRAF and the fluorophore are expressed from the same endogenous promoter and from the same bicistronic mRNA transcript. This mouse model (BrafTOM) shows bright red fluorescence in organs and cell types known to be sensitive to BRAF perturbation. We further show that on a cell-by-cell basis, fluorescence correlates with BRAF protein levels. Finally, we extend the utility of this mouse by demonstrating that the remnant P2A fragment attached to BRAF acts as a suitable epitope for immunoprecipitation and biochemical characterization of BRAF in vivo. PMID:27348307

  10. A mutant sumo facilitates quick plasmid construction for expressing proteins with native N-termini after fusion tag removal

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sumo is one of the fusion tags commonly used to enhance the solubility and yield of recombinant proteins. One advantage of using sumo is that the removal of the sumo tag is highly specific because its recognition by the ULP sumo protease is determined by its structural characteristics, instead of th...

  11. Application of the High Resolution Melting analysis for genetic mapping of Sequence Tagged Site markers in narrow-leafed lupin (Lupinus angustifolius L.).

    PubMed

    Kamel, Katarzyna A; Kroc, Magdalena; Święcicki, Wojciech

    2015-01-01

    Sequence tagged site (STS) markers are valuable tools for genetic and physical mapping that can be successfully used in comparative analyses among related species. Current challenges for molecular markers genotyping in plants include the lack of fast, sensitive and inexpensive methods suitable for sequence variant detection. In contrast, high resolution melting (HRM) is a simple and high-throughput assay, which has been widely applied in sequence polymorphism identification as well as in the studies of genetic variability and genotyping. The present study is the first attempt to use the HRM analysis to genotype STS markers in narrow-leafed lupin (Lupinus angustifolius L.). The sensitivity and utility of this method was confirmed by the sequence polymorphism detection based on melting curve profiles in the parental genotypes and progeny of the narrow-leafed lupin mapping population. Application of different approaches, including amplicon size and a simulated heterozygote analysis, has allowed for successful genetic mapping of 16 new STS markers in the narrow-leafed lupin genome.

  12. Applying thiouracil (TU)-tagging for mouse transcriptome analysis

    PubMed Central

    Gay, Leslie; Karfilis, Kate V.; Miller, Michael R.; Doe, Chris Q.; Stankunas, Kryn

    2014-01-01

    Transcriptional profiling is a powerful approach to study mouse development, physiology, and disease models. Here, we describe a protocol for mouse thiouracil-tagging (TU-tagging), a transcriptome analysis technology that includes in vivo covalent labeling, purification, and analysis of cell type-specific RNA. TU-tagging enables 1) the isolation of RNA from a given cell population of a complex tissue, avoiding transcriptional changes induced by cell isolation trauma, and 2) the identification of actively transcribed RNAs and not pre-existing transcripts. Therefore, in contrast to other cell-specific transcriptional profiling methods based on purification of tagged ribosomes or nuclei, TU-tagging provides a direct examination of transcriptional regulation. We describe how to: 1) deliver 4-thiouracil to transgenic mice to thio-label cell lineage-specific transcripts, 2) purify TU-tagged RNA and prepare libraries for Illumina sequencing, and 3) follow a straightforward bioinformatics workflow to identify cell type-enriched or differentially expressed genes. Tissue containing TU-tagged RNA can be obtained in one day, RNA-Seq libraries generated within two days, and, following sequencing, an initial bioinformatics analysis completed in one additional day. PMID:24457332

  13. Molecular cloning, sequence characterization, and gene expression profiling of a novel water buffalo (Bubalus bubalis) gene, AGPAT6.

    PubMed

    Song, S; Huo, J L; Li, D L; Yuan, Y Y; Yuan, F; Miao, Y W

    2013-10-01

    Several 1-acylglycerol-3-phosphate-O-acyltransferases (AGPATs) can acylate lysophosphatidic acid to produce phosphatidic acid. Of the eight AGPAT isoforms, AGPAT6 is a crucial enzyme for glycerolipid and triacylglycerol biosynthesis in some mammalian tissues. We amplified and identified the complete coding sequence (CDS) of the water buffalo AGPAT6 gene by using the reverse transcription-polymerase chain reaction, based on the conserved sequence information of the cattle or expressed sequence tags of other Bovidae species. This novel gene was deposited in the NCBI database (accession No. JX518941). Sequence analysis revealed that the CDS of this AGPAT6 encodes a 456-amino acid enzyme (molecular mass = 52 kDa; pI = 9.34). Water buffalo AGPAT6 contains three hydrophobic transmembrane regions and a 37-amino acid signal peptide, localized in the cytoplasm. The deduced amino acid sequences share 99, 98, 98, 97, 98, 98, 97 and 95% identity with their homologous sequences from cattle, horse, human, mouse, orangutan, pig, rat, and chicken, respectively. The phylogenetic tree analysis based on the AGPAT6 CDS showed that water buffalo has a closer genetic relationship with cattle than with other species. Tissue expression profile analysis shows that this gene is highly expressed in the mammary gland; moderately expressed in the heart, muscle, liver, and brain; weakly expressed in the pituitary gland, spleen, and lung; and almost silent in the small intestine, skin, kidney, and adipose tissues. Four predicted microRNA target sites are found in the water buffalo AGPAT6 CDS. These results will establish a foundation for further insights into this novel water buffalo gene.

  14. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  15. Expression and Subcellular Distribution of GFP-Tagged Human Tetraspanin Proteins in Saccharomyces cerevisiae

    PubMed Central

    Skaar, Karin; Korza, Henryk J.; Tarry, Michael; Sekyrova, Petra; Högbom, Martin

    2015-01-01

    Tetraspanins are integral membrane proteins that function as organizers of multimolecular complexes and modulate function of associated proteins. Mammalian genomes encode approximately 30 different members of this family and remotely related eukaryotic species also contain conserved tetraspanin homologs. Tetraspanins are involved in a number of fundamental processes such as regulation of cell migration, fusion, immunity and signaling. Moreover, they are implicated in numerous pathological states including mental disorders, infectious diseases and cancer. Despite the great interest in tetraspanins, the structural and biochemical basis of their activity is still largely unknown. A major bottleneck lies in the difficulty of obtaining stable and homogeneous protein samples in large quantities. Here we report expression screening of 15 members of the human tetraspanin superfamily and successful protocols for the production in S. cerevisiae of a subset of tetraspanins involved in human cancer development. We have demonstrated the subcellular localization of overexpressed tetraspanin-green fluorescent protein fusion proteins in S. cerevisiae and found that despite being mislocalized, the fusion proteins are not degraded. The recombinantly produced tetraspanins are dispersed within the endoplasmic reticulum membranes or localized in granule-like structures in yeast cells. The recombinantly produced tetraspanins can be extracted from the membrane fraction and purified with detergents or the poly (styrene-co-maleic acid) polymer technique for use in further biochemical or biophysical studies. PMID:26218426

  16. High-throughput sequencing-based genome-wide identification of microRNAs expressed in developing cotton seeds.

    PubMed

    Wang, YanMei; Ding, Yan; Yu, DingWei; Xue, Wei; Liu, JinYuan

    2015-08-01

    MicroRNAs (miRNAs) have been shown to play critical regulatory roles in gene expression in cotton. Although a large number of miRNAs have been identified in cotton fibers, the functions of miRNAs in seed development remain unexplored. In this study, a small RNA library was constructed from cotton seeds sampled at 15 days post-anthesis (DPA) and was subjected to high-throughput sequencing. A total of 95 known miRNAs were detected to be expressed in cotton seeds. The expression pattern of these identified miRNAs was profiled and 48 known miRNAs were differentially expressed between cotton seeds and fibers at 15 DPA. In addition, 23 novel miRNA candidates were identified in 15-DPA seeds. Putative targets for 21 novel and 87 known miRNAs were successfully predicted and 900 expressed sequence tag (EST) sequences were proposed to be candidate target genes, which are involved in various metabolic and biological processes, suggesting a complex regulatory network in developing cotton seeds. Furthermore, miRNA-mediated cleavage of three important transcripts in vivo was validated by RLM-5' RACE. This study is the first to show the regulatory network of miRNAs that are involved in developing cotton seeds and provides a foundation for future studies on the specific functions of these miRNAs in seed development.

  17. Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish.

    PubMed

    Horstick, Eric J; Jordan, Diana C; Bergeron, Sadie A; Tabor, Kathryn M; Serpe, Mihaela; Feldman, Benjamin; Burgess, Harold A

    2015-04-20

    Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression, however it is difficult to assess the relative influence of these features. Zebrafish embryos are rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed for enrichment of sequence features correlated with levels of protein expression. We then tested enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, releasing factor recognition sequence and specific introns and 3' untranslated regions each increased protein expression between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models.

  18. English Declarative Tags, Intonation Tags, and Tag Questions. Volume 10.

    ERIC Educational Resources Information Center

    Armagost, James L.

    This paper seeks to discover the rules active in the formation of tags (intonation tags, declarative tags, and tag questions) in English. The author discusses former analyses of these constructions and presents his own thoughts with many examples, concluding that English has at least two tag formation rules: one that accounts (perhaps…

  19. Dynamic changes in the composition of photosynthetic picoeukaryotes in the northwestern Pacific Ocean revealed by high-throughput tag sequencing of plastid 16S rRNA genes.

    PubMed

    Choi, Dong H; An, Sung M; Chun, Sungjun; Yang, Eun C; Selph, Karen E; Lee, Charity M; Noh, Jae H

    2016-02-01

    Photosynthetic picoeukaryotes (PPEs) are major oceanic primary producers. However, the diversity of such communities remains poorly understood, especially in the northwestern (NW) Pacific. We investigated the abundance and diversity of PPEs, and recorded environmental variables, along a transect from the coast to the open Pacific Ocean. High-throughput tag sequencing (using the MiSeq system) revealed the diversity of plastid 16S rRNA genes. The dominant PPEs changed at the class level along the transect. Prymnesiophyceae were the only dominant PPEs in the warm pool of the NW Pacific, but Mamiellophyceae dominated in coastal waters of the East China Sea. Phylogenetically, most Prymnesiophyceae sequences could not be resolved at lower taxonomic levels because no close relatives have been cultured. Within the Mamiellophyceae, the genera Micromonas and Ostreococcus dominated in marginal coastal areas affected by open water, whereas Bathycoccus dominated in the lower euphotic depths of oligotrophic open waters. Cryptophyceae and Phaeocystis (of the Prymnesiophyceae) dominated in areas affected principally by coastal water. We also defined the biogeographical distributions of Chrysophyceae, prasinophytes, Bacillariophyceaea and Pelagophyceae. These distributions were influenced by temperature, salinity and chlorophyll a and nutrient concentrations.

  20. Comparison of direct boiling method with commercial kits for extracting fecal microbiome DNA by Illumina sequencing of 16S rRNA tags.

    PubMed

    Peng, Xin; Yu, Ke-Qiang; Deng, Guan-Hua; Jiang, Yun-Xia; Wang, Yu; Zhang, Guo-Xia; Zhou, Hong-Wei

    2013-12-01

    Low cost and high throughput capacity are major advantages of using next generation sequencing (NGS) techniques to determine metagenomic 16S rRNA tag sequences. These methods have significantly changed our view of microorganisms in the fields of human health and environmental science. However, DNA extraction using commercial kits has shortcomings of high cost and time constraint. In the present study, we evaluated the determination of fecal microbiomes using a direct boiling method compared with 5 different commercial extraction methods, e.g., Qiagen and MO BIO kits. Principal coordinate analysis (PCoA) using UniFrac distances and clustering showed that direct boiling of a wide range of feces concentrations gave a similar pattern of bacterial communities as those obtained from most of the commercial kits, with the exception of the MO BIO method. Fecal concentration by boiling method affected the estimation of α-diversity indices, otherwise results were generally comparable between boiling and commercial methods. The operational taxonomic units (OTUs) determined through direct boiling showed highly consistent frequencies with those determined through most of the commercial methods. Even those for the MO BIO kit were also obtained by the direct boiling method with high confidence. The present study suggested that direct boiling could be used to determine the fecal microbiome and using this method would significantly reduce the cost and improve the efficiency of the sample preparation for studying gut microbiome diversity.

  1. Deciphering Poxvirus Gene Expression by RNA Sequencing and Ribosome Profiling

    PubMed Central

    Cao, Shuai; Martens, Craig A.; Porcella, Stephen F.; Xie, Zhi; Ma, Ming; Shen, Ben

    2015-01-01

    ABSTRACT The more than 200 closely spaced annotated open reading frames, extensive transcriptional read-through, and numerous unpredicted RNA start sites have made the analysis of vaccinia virus gene expression challenging. Genome-wide ribosome profiling provided an unprecedented assessment of poxvirus gene expression. By 4 h after infection, approximately 80% of the ribosome-associated mRNA was viral. Ribosome-associated mRNAs were detected for most annotated early genes at 2 h and for most intermediate and late genes at 4 and 8 h. Cluster analysis identified a subset of early mRNAs that continued to be translated at the later times. At 2 h, there was excellent correlation between the abundance of individual mRNAs and the numbers of associated ribosomes, indicating that expression was primarily transcriptionally regulated. However, extensive transcriptional read-through invalidated similar correlations at later times. The mRNAs with the highest density of ribosomes had host response, DNA replication, and transcription roles at early times and were virion components at late times. Translation inhibitors were used to map initiation sites at single-nucleotide resolution at the start of most annotated open reading frames although in some cases a downstream methionine was used instead. Additional putative translational initiation sites with AUG or alternative codons occurred mostly within open reading frames, and fewer occurred in untranslated leader sequences, antisense strands, and intergenic regions. However, most open reading frames associated with these additional translation initiation sites were short, raising questions regarding their biological roles. The data were used to construct a high-resolution genome-wide map of the vaccinia virus translatome. IMPORTANCE This report contains the first genome-wide, high-resolution analysis of poxvirus gene expression at both transcriptional and translational levels. The study was made possible by recent methodological

  2. Optimisation of a multivalent Strep tag for protein detection.

    PubMed

    Busby, Michael; Stadler, Lukas Kurt Josef; Ko Ferrigno, Paul; Davis, Jason J

    2010-11-01

    The Strep tag is a peptide sequence that is able to mimic biotin's ability to bind to streptavidin. Sequences of Strep tags from 0 to 5 have been appended to the N-terminus of a model protein, the Stefin A Quadruple Mutant (SQM) peptide aptamer scaffold, and the recombinant fusion proteins expressed. The affinities of the proteins for streptavidin have been assessed as a function of the number of tags inserted using a variety of labelled and label-free bioanalytical and surface based methods (Western blots, microarray assays and surface plasmon resonance spectroscopy). The binding affinity increases with the number of tags across all assays, reaching nanomolar levels with 5 inserts, an observation assigned to a progressive increase in the probability of a binding interaction occurring. In addition a novel interfacial FRET based assay has been developed for generic Strep tag interactions, which utilises a conventional microarray scanner and bypasses the requirement for expensive lifetime imaging equipment. By labelling both the tagged StrepX-SQM(2) and streptavidin targets, the conjugate is primed for label-free FRET based displacement assays.

  3. Expression platforms for producing eukaryotic proteins: a comparison of E. coli cell-based and wheat germ cell-free synthesis, affinity and solubility tags, and cloning strategies.

    PubMed

    Aceti, David J; Bingman, Craig A; Wrobel, Russell L; Frederick, Ronnie O; Makino, Shin-Ichi; Nichols, Karl W; Sahu, Sarata C; Bergeman, Lai F; Blommel, Paul G; Cornilescu, Claudia C; Gromek, Katarzyna A; Seder, Kory D; Hwang, Soyoon; Primm, John G; Sabat, Grzegorz; Vojtik, Frank C; Volkman, Brian F; Zolnai, Zsolt; Phillips, George N; Markley, John L; Fox, Brian G

    2015-06-01

    Vectors designed for protein production in Escherichia coli and by wheat germ cell-free translation were tested using 21 well-characterized eukaryotic proteins chosen to serve as controls within the context of a structural genomics pipeline. The controls were carried through cloning, small-scale expression trials, large-scale growth or synthesis, and purification. Successfully purified proteins were also subjected to either crystallization trials or 1H-15N HSQC NMR analyses. Experiments evaluated: (1) the relative efficacy of restriction/ligation and recombinational cloning systems; (2) the value of maltose-binding protein (MBP) as a solubility enhancement tag; (3) the consequences of in vivo proteolysis of the MBP fusion as an alternative to post-purification proteolysis; (4) the effect of the level of LacI repressor on the yields of protein obtained from E. coli using autoinduction; (5) the consequences of removing the His tag from proteins produced by the cell-free system; and (6) the comparative performance of E. coli cells or wheat germ cell-free translation. Optimal promoter/repressor and fusion tag configurations for each expression system are discussed.

  4. Multifunctional phenylboronic acid-tagged fluorescent silica nanoparticles via thiol-ene click reaction for imaging sialic acid expressed on living cells.

    PubMed

    Cheng, Liwei; Zhang, Xianxia; Zhang, Zhengyong; Chen, Hui; Zhang, Song; Kong, Jilie

    2013-10-15

    Multifunctional fluorescent silica nanoparticles with phenylboronic acid tags were developed for labeling sialic acid on the surface of living cancer cells. In this paper, fluorescent silica nanoparticles (FSNPs) with strong and stable emission at 515 nm were first prepared through a reverse microemulsion process, and then modified with highly selective phenylboronic acid (PBA) tags on their surface via an aqueous 'thiol-ene' click reaction. These nanoparticles had a hydrodynamic diameter of 92.6 ± 9.1 nm, and a bright fluorescence signal, which is 366 times higher than that of a single dye molecule. Meanwhile, these PBA-tagged FSNPs were found to be very stable in aqueous solution as well as in cell culture medium, as verified by transmission electron microscopy, X-ray photoelectron spectroscopy and zeta potential analysis. The over-expressed sialic acid (SA) on the membrane of living HeLa cells was visualized in situ by confocal laser scanning microscopy, ascribed to the specific interaction between PBA and SA. Thus, the PBA-FSNPs showed great potential in probing SA expressed on living cells with high selectivity and sensitivity.

  5. Expression Platforms for Producing Eukaryotic Proteins: A Comparison of E. coli Cell-Based and Wheat Germ Cell-Free Synthesis, Affinity and Solubility Tags, and Cloning Strategies

    PubMed Central

    Aceti, David J.; Bingman, Craig A.; Wrobel, Russell L.; Frederick, Ronnie O.; Makino, Shin-ichi; Nichols, Karl W.; Sahu, Sarata C.; Bergeman, Lai F.; Blommel, Paul G.; Cornilescu, Claudia C.; Gromek, Katarzyna A.; Seder, Kory D.; Hwang, Soyoon; Primm, John G.; Sabat, Grzegorz; Vojtik, Frank C.; Volkman, Brian F.; Zolnai, Zsolt; Phillips, George N.; Markley, John L.; Fox, Brian G.

    2015-01-01

    Vectors designed for protein production in Escherichia coli and by wheat germ cell-free translation were tested using 21 well-characterized eukaryotic proteins chosen to serve as controls within the context of a structural genomics pipeline. The controls were carried through cloning, small-scale expression trials, large-scale growth or synthesis, and purification. Successfully purified proteins were also subjected to either crystallization trials or 1H-15N HSQC NMR analyses. Experiments evaluated: (1) the relative efficacy of restriction/ligation and recombinational cloning systems; (2) the value of maltose-binding protein (MBP) as a solubility enhancement tag; (3) the consequences of in vivo proteolysis of the MBP fusion as an alternative to post-purification proteolysis; (4) the effect of the level of LacI repressor on the yields of protein obtained from E. coli using autoinduction; (5) the consequences of removing the His tag from proteins produced by the cell-free system; and (6) the comparative performance of E. coli cells or wheat germ cell-free translation. Optimal promoter/repressor and fusion tag configurations for each expression system are discussed. PMID:25854603

  6. The dynamics of the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin assessed by parallel tag sequencing.

    PubMed

    Rodriguez-Mora, Maria J; Scranton, Mary I; Taylor, Gordon T; Chistoserdov, Andrei Y

    2015-09-01

    Massively parallel tag sequencing was applied to describe the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin. In total, 14 samples from the Cariaco Basin were collected over a period of eight years from two stations. A total of 244 357 unique bacterial V6 amplicons were sequenced. The total number of operational taxonomic units (OTUs) found in this study was 4692, with a range of 511-1491 OTUs per sample. Approximately 95% of the OTUs found in the redox transition zone and anoxic layers of Cariaco are represented by less than 50 amplicons suggesting that only about 5% of the bacterial OTUs are responsible for the bulk of the microbial processes in the basin redox transition and anoxic zones. The same dominant OTUs were observed across all eight years of sampling although periodic fluctuations in their proportion were apparent. No distinctive differences were observed between the bacterial communities from the redox transition and anoxic layers of the Cariaco Basin water column. The largest proportion of amplicons belongs to Gammaproteobacteria represented mostly by sulfide oxidizers, followed by Marine Group A (originally described as SAR406; Gordon and Giovannoni 1996), a group of uncultured bacteria hypothesized to be involved in metal reduction, and sulfate-reducing Deltaproteobacteria. Gammaproteobacteria, Deltaproteobacteria and Marine Group A make up 67-90% of all V6 amplicons sequenced in this study. This strongly suggests that the basin's microbial communities are actively involved in the sulfur-related metabolism and coupling of the sulfur and carbon cycles. According to detrended canonical correspondence analysis, ecological factors such as chemoautotrophy, nitrate and oxidized and reduced sulfur compounds influence the structuring and distribution of the Cariaco microbial communities.
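
    The rare-versus-dominant split reported above (about 95% of OTUs represented by fewer than 50 amplicons) can be illustrated with a minimal sketch. The function name, the dictionary input format, and the toy counts are hypothetical and are not taken from the study; only the threshold of 50 amplicons comes from the abstract.

```python
def rare_otu_summary(otu_counts, threshold=50):
    """Summarize how many OTUs fall below an amplicon-count threshold.

    otu_counts: dict mapping a (hypothetical) OTU label to its amplicon count.
    threshold:  OTUs with strictly fewer amplicons than this count as rare.
    """
    if not otu_counts:
        raise ValueError("otu_counts must not be empty")
    total_amplicons = sum(otu_counts.values())
    if total_amplicons == 0:
        raise ValueError("otu_counts must contain at least one amplicon")

    rare = [count for count in otu_counts.values() if count < threshold]
    return {
        # Share of OTUs that are rare.
        "rare_otu_fraction": len(rare) / len(otu_counts),
        # Share of all amplicons that belong to those rare OTUs.
        "rare_amplicon_fraction": sum(rare) / total_amplicons,
    }


# Toy data for illustration only.
example = {"otu1": 900, "otu2": 10, "otu3": 40, "otu4": 50}
summary = rare_otu_summary(example)
```

    In the toy data, half of the OTUs are rare yet they carry only a small share of the amplicons, which is the kind of skew the abstract describes.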

  7. Active populations of rare microbes in oceanic environments as revealed by bromodeoxyuridine incorporation and 454 tag sequencing.

    PubMed

    Hamasaki, Koji; Taniguchi, Akito; Tada, Yuya; Kaneko, Ryo; Miki, Takeshi

    2016-02-01

    The "rare biosphere" consisting of thousands of low-abundance microbial taxa is important as a seed bank or a gene pool to maintain microbial functional redundancy and robustness of the ecosystem. Here we investigated contemporaneous growth of diverse microbial taxa including rare taxa and determined their variability in environmentally distinctive locations along a north-south transect in the Pacific Ocean in order to assess which taxa were actively growing and how environmental factors influenced bacterial community structures. A bromodeoxyuridine-labeling technique in combination with PCR amplicon pyrosequencing of 16S rRNA genes gave 215-793 OTUs from 1200 to 3500 unique sequences in the total communities and 175-299 OTUs from 860 to 1800 sequences in the active communities. Unexpectedly, many of the active OTUs were not detected in the total fractions. Among these active but rare OTUs, some taxa (2-4% of rare OTUs) showed much higher abundance (>0.10% of total reads) in the active fraction than in the total fraction, suggesting that their contribution to bacterial community productivity or growth was much larger than that expected from their standing stocks at each location. An ordination plot from principal component analysis showed that bacterial community compositions among the 4 sampling locations and between total and active fractions were distinct from one another. A redundancy analysis revealed that the variability of community compositions significantly correlated with seawater temperature and dissolved oxygen concentration. Also, a variation partitioning analysis showed that the environmental factors explained 49% of the variability of community compositions and the distance only explained 4.0% of its variability. These results implied very dynamic change of community structures due to environmental filtering. The active bacterial populations are more diverse and spread further in the rare biosphere than we have ever seen. This study implied that rare

  8. How to analyze gene expression using RNA-sequencing data.

    PubMed

    Ramsköld, Daniel; Kavak, Ersen; Sandberg, Rickard

    2012-01-01

    RNA-Seq is emerging as a powerful method for transcriptome analyses that will eventually make microarrays obsolete for gene expression analyses. Improvements in high-throughput sequencing and efficient sample barcoding are now enabling tens of samples to be run in a cost-effective manner, competing with microarrays in price and excelling in performance. Still, most studies use microarrays, partly due to the ease of data analyses using programs and modules that quickly turn raw microarray data into spreadsheets of gene expression values and significant differentially expressed genes. In contrast, RNA-Seq data analysis is still in its infancy, and researchers face new challenges and have to combine different tools to carry out an analysis. In this chapter, we provide a tutorial on RNA-Seq data analysis to enable researchers to quantify gene expression, identify splice junctions, and find novel transcripts using publicly available software. We focus on the analyses performed in organisms where a reference genome is available and discuss issues with current methodology that have to be solved before RNA-Seq data can reach its full potential.
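
    As a rough illustration of what quantifying gene expression from read counts can involve, the sketch below computes RPKM (reads per kilobase of transcript per million mapped reads), one widely used normalization. The abstract does not say which measure or software the tutorial uses, so RPKM, the function name, and the toy counts and gene lengths are all assumptions made for illustration.

```python
def rpkm(read_counts, gene_lengths_bp):
    """Compute RPKM per gene.

    read_counts:     dict mapping gene name to number of mapped reads.
    gene_lengths_bp: dict mapping gene name to transcript length in base pairs.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("no mapped reads")

    values = {}
    for gene, count in read_counts.items():
        length = gene_lengths_bp[gene]
        if length <= 0:
            raise ValueError(f"non-positive length for {gene}")
        # count / (length / 1e3) / (total_reads / 1e6)
        values[gene] = count * 1e9 / (length * total_reads)
    return values


# Toy data for illustration only: geneB has three times the reads of geneA
# but is also three times as long, so both get the same RPKM.
counts = {"geneA": 500, "geneB": 1500}
lengths = {"geneA": 1000, "geneB": 3000}
expression = rpkm(counts, lengths)
```

    Dividing by transcript length and by sequencing depth is what makes values comparable across genes and across samples; raw counts alone would favor long genes and deeply sequenced libraries.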

  9. Improved expression and purification of the Helicobacter pylori adhesin BabA through the incorporation of a hexa-lysine tag.

    PubMed

    Hage, Naim; Renshaw, Jonathan G; Winkler, G Sebastiaan; Gellert, Paul; Stolnik, Snow; Falcone, Franco H

    2015-02-01

    Helicobacter pylori is a pathogenic bacterium that has the remarkable ability to withstand the harsh conditions of the stomach for decades. This is achieved through unique evolutionary adaptations, which include binding Lewis(b) antigens found on the gastric epithelium using the outer membrane protein BabA. We show here the yield of a recombinant form of BabA, comprising its putative extracellular binding domain, can be significantly increased through the addition of a hexa-lysine tag to the C-terminus of the protein. BabA was expressed in the periplasmic space of Escherichia coli and purified using immobilised metal ion affinity and size exclusion chromatography - yielding approximately 1.8 mg of protein per litre of culture. The hexa-lysine tag does not inhibit the binding activity of BabA as the recombinant protein was found to possess affinity towards HSA-Lewis(b) glycoconjugates.

  10. Expression dynamics and ultrastructural localization of epitope-tagged Abutilon mosaic virus nuclear shuttle and movement proteins in Nicotiana benthamiana cells

    SciTech Connect

    Kleinow, Tatjana; Tanwir, Fariha; Kocher, Cornelia; Krenz, Bjoern; Wege, Christina; Jeske, Holger

    2009-09-01

    The geminivirus Abutilon mosaic virus (AbMV) encodes two proteins which are essential for viral spread within plants. The nuclear shuttle protein (NSP) transfers viral DNA between the nucleus and cytoplasm, whereas the movement protein (MP) facilitates transport between cells through plasmodesmata and long-distance via phloem. An inducible overexpression system for epitope-tagged NSP and MP in plants yielded unprecedented amounts of both proteins. Western blots revealed extensive posttranslational modification and truncation for MP, but not for NSP. Ultrastructural examination of Nicotiana benthamiana tissues showed characteristic nucleopathic alterations, including fibrillar rings, when epitope-tagged NSP and MP were simultaneously expressed in leaves locally infected with an AbMV DNA A in which the coat protein gene was replaced by a green fluorescent protein encoding gene. Immunogold labelling localized NSP in the nucleoplasm and in the fibrillar rings. MP appeared at the cell periphery, probably the plasma membrane, and plasmodesmata.

  11. A simple and effective strategy for solving the problem of inclusion bodies in recombinant protein technology: His-tag deletions enhance soluble expression.

    PubMed

    Zhu, Shaozhou; Gong, Cuiyu; Ren, Lu; Li, Xingzhou; Song, Dawei; Zheng, Guojun

    2013-01-01

    The formation of inclusion bodies (IBs) in recombinant protein biotechnology has become one of the most frequent undesirable occurrences in both research and industrial applications. So far, the pET System is the most powerful system developed for the production of recombinant proteins when Escherichia coli is used as the microbial cell factory. Also, using fusion tags to facilitate detection and purification of the target protein is a commonly used tactic. However, there is still a large fraction of proteins that cannot be produced in E. coli in a soluble (and hence functional) form. Intensive research efforts have tried to address this issue, and numerous parameters have been modulated to avoid the formation of inclusion bodies. However, hardly anyone has noticed that adding fusion tags to the recombinant protein to facilitate purification is a key factor that affects the formation of inclusion bodies. To test this idea, the industrial biocatalysts uridine phosphorylase from Aeropyrum pernix K1 and (+)-γ-lactamase and (-)-γ-lactamase from Bradyrhizobium japonicum USDA 6 were expressed in E. coli by using the pET System and then examined. We found that using a histidine tag as a fusion partner for protein expression did affect the formation of inclusion bodies in these examples, suggesting that removing the fusion tag can promote the solubility of heterologous proteins. The production of soluble and highly active uridine phosphorylase, (+)-γ-lactamase, and (-)-γ-lactamase in our results shows that the traditional process needs to be reconsidered. Accordingly, a simple and efficient structure-based strategy for the production of valuable soluble recombinant proteins in E. coli is proposed.

  12. Illumina sequencing of 16S rRNA tag revealed spatial variations of bacterial communities in a mangrove wetland.

    PubMed

    Jiang, Xiao-Tao; Peng, Xin; Deng, Guan-Hua; Sheng, Hua-Fang; Wang, Yu; Zhou, Hong-Wei; Tam, Nora Fung-Yee

    2013-07-01

    The microbial community plays an essential role in the high productivity of mangrove wetlands. A proper understanding of the spatial variations of microbial communities will provide clues about the underlying mechanisms that structure microbial groups and aid the isolation of bacterial strains of interest. In the present study, the diversity and composition of the bacterial community in sediments collected from four locations, namely mudflat, edge, bulk, and rhizosphere, within the Mai Po Ramsar Wetland in Hong Kong, SAR, China were compared using the barcoded Illumina paired-end sequencing technique. Rarefaction results showed that the bulk sediment inside the mature mangrove forest had the highest bacterial α-diversity, while the mudflat sediment without vegetation had the lowest. The comparison of β-diversity using principal component analysis and principal coordinate analysis with UniFrac metrics both showed that the spatial effects on bacterial communities were significant. All sediment samples could be clustered into two major groups, inner (bulk and rhizosphere sediments collected inside the mangrove forest) and outer mangrove sediments (the sediments collected at the mudflat and the edge of the mangrove forest). With linear discriminant analysis scores larger than 3, four phyla, namely Actinobacteria, Acidobacteria, Nitrospirae, and Verrucomicrobia, were enriched in the nutrient-rich inner mangrove sediments, while abundances of Proteobacteria and Deferribacteres were higher in outer mangrove sediments. The rhizosphere effect of mangrove plants was also significant: rhizosphere sediment had a lower α-diversity, a higher amount of Nitrospirae, and a lower abundance of Proteobacteria than the bulk sediment nearby.
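
    The α-diversity comparison above can be made concrete with a small sketch. The abstract does not state which index was used, so the Shannon index is assumed here purely as a common example; the function name and the toy counts are hypothetical.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with nonzero counts.

    otu_counts: iterable of per-OTU read counts for one sample.
    """
    counts = [c for c in otu_counts if c > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * math.log(c / total) for c in counts)


# Toy samples for illustration only: an even community scores higher
# than one dominated by a single OTU.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]
h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
```

    For a perfectly even sample of four OTUs the index equals ln 4, its maximum for that number of OTUs; dominance by one OTU pulls it toward zero.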

  13. A high-density genetic recombination map of sequence-tagged sites for sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses.

    PubMed Central

    Bowers, John E; Abbey, Colette; Anderson, Sharon; Chang, Charlene; Draye, Xavier; Hoppe, Alison H; Jessup, Russell; Lemke, Cornelia; Lennington, Jennifer; Li, Zhikang; Lin, Yann-Rong; Liu, Sin-Chieh; Luo, Lijun; Marler, Barry S; Ming, Reiguang; Mitchell, Sharon E; Qiang, Dou; Reischmann, Kim; Schulze, Stefan R; Skinner, D Neil; Wang, Yue-Wen; Kresovich, Stephen; Schertz, Keith F; Paterson, Andrew H

    2003-01-01

    We report a genetic recombination map for Sorghum of 2512 loci spaced at average 0.4 cM (approximately 300 kb) intervals based on 2050 RFLP probes, including 865 heterologous probes that foster comparative genomics of Saccharum (sugarcane), Zea (maize), Oryza (rice), Pennisetum (millet, buffelgrass), the Triticeae (wheat, barley, oat, rye), and Arabidopsis. Mapped loci identify 61.5% of the recombination events in this progeny set and reveal strong positive crossover interference acting across intervals of sequence-tagged sites will foster many structural, functional and evolutionary genomic studies in major food, feed, and biomass crops. PMID:14504243

  14. Sequence and expression of ferredoxin mRNA in barley

    SciTech Connect

    Zielinski, R.; Funder, P.M.; Ling, V.

    1990-05-01

    We have isolated and structurally characterized a full-length cDNA clone encoding ferredoxin from a λgt10 cDNA library prepared from barley leaf mRNA. The ferredoxin clone (pBFD-1) was fused head-to-head with a partial-length cDNA clone encoding calmodulin, and was fortuitously isolated by screening the library with a calmodulin-specific oligonucleotide probe. The mRNA sequence from which pBFD-1 was derived is expressed exclusively in the leaf tissues of 7-d old barley seedlings. Barley pre-ferredoxin has a predicted size of 15.3 kDal, of which 4.6 kDal are accounted for by the transit peptide. The polypeptide encoded by pBFD-1 is identical to wheat ferredoxin, and shares slightly more amino acid sequence similarity with spinach ferredoxin I than with ferredoxin II. Ferredoxin mRNA levels are rapidly increased 10-fold by white light in etiolated barley leaves.

  15. Murine Brca2: Sequence, map position, and expression pattern

    SciTech Connect

    Sharan, S.K.; Bradley, A.

    1997-03-01

    Mutations in the human BRCA2 gene are responsible for about 45% of hereditary early onset breast cancer. Recently, the human BRCA2 gene was cloned, and several germline mutations were identified. Here we describe the cloning of the mouse homologue of BRCA2. The mouse cDNA sequence predicts a 3328-amino-acid Brca2 protein, 90 amino acids shorter than the human protein. The overall identity between the mouse and the human proteins is 59%, while the similarity is 72%. At the nucleotide level the homology is 74%. By comparing the amino acid sequences of the two homologues we have identified five highly conserved novel domains that may be functionally significant. Brca2 has been mapped to the distal end of mouse chromosome 5, a region of the mouse genome that contains other genes that also map to human chromosome 13q12-q13, confirming the conservation of this linkage group between the two species. Expression of Brca2 was detected in midgestation embryos and adult testis, thymus, and ovary. 21 refs., 5 figs.

  16. Expression analysis of rice pathogenesis-related proteins involved in stress response and endophytic colonization properties of gfp-tagged Bacillus subtilis CB-R05.

    PubMed

    Ji, Sang Hye; Gururani, Mayank Anand; Chun, Se-Chul

    2014-09-01

    Bacillus subtilis CB-R05, possessing antagonistic effects against several fungal pathogens, is a diazotrophic plant growth-promoting bacterium marked with the green fluorescent protein (gfp) gene. To confirm the expression level of the pathogenesis-related (PR) proteins in rice inoculated with CB-R05, the expression of four PR proteins (PR2, PR6, PR15, and PR16) was examined in the rice leaves treated with wounding stress over a time period. The PR proteins were generally more strongly expressed in the rice leaves inoculated with CB-R05 compared with the untreated control. The marked gfp-tagged B. subtilis CB-R05 strain was inoculated onto the rice seedlings under axenic conditions. Under the confocal laser scanning microscope (CLSM), the gfp-tagged CB-R05 bacterial cells were observed to penetrate the rhizoplane, especially in the elongation and differentiation zones of the rice roots, and colonize the root intracellularly. The bacteria, 24 h after the gfp-tagged CB-R05 inoculation, were seen to penetrate into the cell wall, cortex, xylem, and concentrate mainly in the vascular bundle. Numerous bacteria were observed within the intercellular spaces, root cortical cells, and xylem vessels. Over time, these bacteria dispersed to the lateral root junctions and propagated slowly from the roots to the stems and leaves. The B. subtilis CB-R05 population in the rice root rhizosphere was also monitored. These results show a very widespread colonization of the B. subtilis CB-R05 in the rice rhizosphere. Further attempts are under way to investigate the competition between the CB-R05 bacteria and the fungal pathogen in vivo.

  17. Shift in prokaryotic diversity in Arctic sediment along a continuum Glacier - River - Fjord using massive 16S rRNA gene tag sequencing

    NASA Astrophysics Data System (ADS)

    Laghdass, M.; Deloffre, J.; Lafite, R.; Hänni, C.; Gillet, B.; Cecillon, S.; Simonet, P.; Petit, F.

    2012-04-01

    In the Arctic environment, one indirect consequence of global climate warming is a significant increase in the amount of inland water during the spring thaw, resulting from melting of the snow cover and permafrost. These freshwater transfers to the coast cause sedimentary transfers. The Arctic fjords, which represent deep glacial valleys of the sea, are particularly vulnerable systems. Although previous studies using pyrosequencing have highlighted the potentially high bacterial diversity in the Arctic environment, this new-generation, high-throughput sequencing method does not escape the biases that affect classical molecular biology techniques at different stages of the analysis. In this context, our objective was to characterize the prokaryotic diversity associated with sediment transfer along a gradient from the head of the glacier to mud patch sediment in the Goule river flowing into Kongsfjorden (Svalbard) during an active thaw. The prokaryotic diversity in sediment was characterized by combining massive 16S rRNA gene tag sequencing with a specific and original approach designed to overcome the biases associated with sampling and extraction. The sediment was extracted by three different methods, one of which was done in duplicate. Negative controls performed at the extraction and PCR stages were also sequenced. The phylogenetic analysis of the environmental samples below phylum level revealed significant changes in the diversity and function of the prokaryotic community along the gradient. The subglacial Goule river sediment is characterized by bacteria with specific functions, methylotrophic bacteria and aerobic chemoautolithotrophic bacteria (Alphaproteobacteria with Methylobacteriaceae), whereas the mouth of the river Goule and the freshwater part of the Goule River were dominated by sulphate-reducing bacteria, anaerobic chemoorganotrophs (Deltaproteobacteria with the Desulfobulbaceae and Desulfuromonadaceae) and by

  18. Transcriptome profiling and digital gene expression by deep sequencing in early somatic embryogenesis of endangered medicinal Eleutherococcus senticosus Maxim.

    PubMed

    Tao, Lei; Zhao, Yue; Wu, Ying; Wang, Qiuyu; Yuan, Hongmei; Zhao, Lijuan; Guo, Wendong; You, Xiangling

    2016-03-01

    Somatic embryogenesis (SE) has been studied as a model system to understand molecular events in physiology, biochemistry, and cytology during plant embryo development. In particular, it is exceedingly difficult to access the morphological and early regulatory events in zygotic embryos. To understand the molecular mechanisms regulating early SE in Eleutherococcus senticosus Maxim., we used high-throughput RNA-Seq technology to investigate its transcriptome. We obtained 58,327,688 reads, which were assembled into 75,803 unique unigenes. To better understand their functions, the unigenes were annotated using the Clusters of Orthologous Groups, Gene Ontology, and Kyoto Encyclopedia of Genes and Genomes databases. Digital gene expression libraries revealed differences in gene expression profiles at different developmental stages (embryogenic callus, yellow embryogenic callus, global embryo). We obtained a sequencing depth of >5.6 million tags per sample and identified many differentially expressed genes at various stages of SE. The initiation of SE affected gene expression in many KEGG pathways, but predominantly that in metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction. This information on the changes in the multiple pathways related to SE induction in E. senticosus Maxim. embryogenic tissue will contribute to a more comprehensive understanding of the mechanisms involved in early SE. Additionally, the differentially expressed genes may act as molecular markers and could play very important roles in the early stage of SE. The results are a comprehensive molecular biology resource for investigating SE of E. senticosus Maxim.

  19. The Xenopus laevis Atg4B Protease: Insights into Substrate Recognition and Application for Tag Removal from Proteins Expressed in Pro- and Eukaryotic Hosts.

    PubMed

    Frey, Steffen; Görlich, Dirk

    2015-01-01

    During autophagy, members of the ubiquitin-like Atg8 protein family get conjugated to phosphatidylethanolamine and act as protein-recruiting scaffolds on the autophagosomal membrane. The Atg4 protease produces mature Atg8 from C-terminally extended precursors and deconjugates lipid-bound Atg8. We now found that Xenopus laevis Atg4B (xAtg4B) is ideally suited for proteolytic removal of N-terminal tags from recombinant proteins. To implement this strategy, an Atg8 cleavage module is inserted in between tag and target protein. An optimized xAtg4B protease fragment includes the so far uncharacterized C-terminus, which crucially contributes to recognition of the Xenopus Atg8 homologs xLC3B and xGATE16. xAtg4B-mediated tag cleavage is very robust in solution or on-column, efficient at 4°C and orthogonal to TEV protease and the recently introduced proteases bdSENP1, bdNEDP1 and xUsp2. Importantly, xLC3B fusions are stable in wheat germ extract or when expressed in Saccharomyces cerevisiae, but cleavable by xAtg4B during or following purification. We also found that fusions to the bdNEDP1 substrate bdNEDD8 are stable in S. cerevisiae. In combination, our findings now provide a system in which proteins and complexes fused to xLC3B or bdNEDD8 can be expressed in a eukaryotic host and purified by successive affinity capture and proteolytic release steps.

  20. DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L.

    PubMed

    González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B

    2006-03-01

    Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (pi(sil) = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r2 of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
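
    The within-gene LD statistic quoted above, pairwise r2, can be illustrated with a minimal sketch. It assumes phased haplotypes coded 0/1 at biallelic sites; the haplotypes below are invented for illustration and are not data from the study, and the function name r_squared is a hypothetical helper.

```python
from itertools import combinations

def r_squared(site_a, site_b):
    """Return r^2 between two biallelic sites coded 0/1 across the same haplotypes."""
    n = len(site_a)
    p_a = sum(site_a) / n                      # frequency of allele 1 at site A
    p_b = sum(site_b) / n                      # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom

# Hypothetical haplotypes (rows) at three SNPs (columns); not data from the study.
haplotypes = [
    (0, 0, 1),
    (0, 0, 0),
    (1, 1, 1),
    (1, 1, 0),
    (0, 1, 1),
    (1, 0, 0),
]
sites = list(zip(*haplotypes))                 # transpose to one tuple per site
pairwise = {
    (i, j): r_squared(sites[i], sites[j])
    for i, j in combinations(range(len(sites)), 2)
}
mean_r2 = sum(pairwise.values()) / len(pairwise)
```

    Averaging the values in pairwise gives a gene-level summary of the same kind as the average r2 reported in the abstract; plotting each value against the distance between the two sites would show the decay with distance.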

  1. Substantial prevalence of microdeletions of the Y-chromosome in infertile men with idiopathic azoospermia and oligozoospermia detected using a sequence-tagged site-based mapping strategy

    SciTech Connect

    Najmabadi, H.; Huang, V.; Bhasin, D.

    1996-04-01

    Genes on the long arm of Y (Yq), particularly within interval 6, are believed to play a critical role in human spermatogenesis. Cytogenetically detectable deletions of this region are associated with azoospermia in men, but are relatively uncommon. The objective of this study was to validate a sequence-tagged site (STS)-mapping strategy for the detection of Yq microdeletions and to use this method to determine the proportion of men with idiopathic azoospermia or severe oligozoospermia who carry microdeletions in Yq. STS mapping of a sufficiently large sample of infertile men should also help further localize the putative gene(s) involved in the pathogenesis of male infertility. Genomic DNA was extracted from peripheral leukocytes of 16 normal fertile men, 7 normal fertile women, 60 infertile men, and 15 patients with the X-linked disorder, ichthyosis. PCR primers were synthesized for 26 STSs that span Yq interval 6. None of the 16 normal men of known fertility had microdeletions. Seven normal fertile women failed to amplify any of the 26 STSs, providing evidence of their Y specificity. No microdeletions were detected in any of the 15 patients with ichthyosis. Of the 60 infertile men typed with 26 STSs, 11 (18%; 10 azoospermic and 1 oligozoospermic) failed to amplify 1 or more STS. Interestingly, 4 of the 11 patients had microdeletions in a region that is outside the Yq region from which the DAZ (deleted in azoospermia gene region) gene was cloned. In an additional 3 patients, microdeletions were present both inside and outside the DAZ region. The physical locations of these microdeletions provide further support for the concept that a gene(s) on Yq deletion interval 6 plays an important role in spermatogenesis. The presence of deletions that do not overlap with the DAZ region suggests that genes other than the DAZ gene may also be implicated in the pathogenesis of some subsets of male infertility. 48 refs., 2 figs., 2 tabs.
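
    The screening logic described above (a sample is flagged when one or more STSs fail to amplify) can be sketched as follows. The marker names, sample names, and the helper missing_sts are placeholders introduced for illustration; only the counts 11 and 60 come from the abstract. In practice a failed amplification would need confirmation before being called a deletion.

```python
def missing_sts(amplified, panel):
    """Map each sample to the sorted panel STSs that failed to amplify.

    Samples that amplified every marker are left out of the result.
    """
    calls = {}
    for sample, hits in amplified.items():
        failed = sorted(set(panel) - set(hits))
        if failed:
            calls[sample] = failed
    return calls

# 26 placeholder marker names standing in for the STSs spanning Yq interval 6.
panel = [f"STS{i:02d}" for i in range(1, 27)]

# Hypothetical PCR results: the markers that amplified in each sample.
amplified = {
    "man_A": panel,                                        # all markers present
    "man_B": [m for m in panel if m not in ("STS07", "STS08")],
}
calls = missing_sts(amplified, panel)

# Proportion reported in the abstract: 11 of 60 infertile men.
prevalence = 11 / 60
```

    With these inputs only man_B is flagged, and prevalence evaluates to about 0.18, matching the 18% quoted in the abstract.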

  2. Construction of a Genetic Linkage Map Based on Amplified Fragment Length Polymorphism Markers and Development of Sequence-Tagged Site Markers for Marker-Assisted Selection of the Sporeless Trait in the Oyster Mushroom (Pleurotus eryngii)

    PubMed Central

    Ueda, Jun; Obatake, Yasushi; Murakami, Shigeyuki; Fukumasa, Yukitaka; Matsumoto, Teruyuki

    2012-01-01

    A large number of spores from fruiting bodies can lead to allergic reactions and other problems during the cultivation of edible mushrooms, including Pleurotus eryngii (DC.) Quél. A cultivar harboring a sporulation-deficient (sporeless) mutation would be useful for preventing these problems, but traditional breeding requires extensive time and labor. In this study, using a sporeless P. eryngii strain, we constructed a genetic linkage map to introduce a molecular breeding program like marker-assisted selection. Based on the segregation of 294 amplified fragment length polymorphism markers, two mating type factors, and the sporeless trait, the linkage map consisted of 11 linkage groups with a total length of 837.2 centimorgans (cM). The gene region responsible for the sporeless trait was located in linkage group IX with 32 amplified fragment length polymorphism markers and the B mating type factor. We also identified eight markers closely linked (within 1.2 cM) to the sporeless locus using bulked-segregant analysis-based amplified fragment length polymorphism. One such amplified fragment length polymorphism marker was converted into two sequence-tagged site markers, SD488-I and SD488-II. Using 14 wild isolates, sequence-tagged site analysis indicated the potential usefulness of the combination of two sequence-tagged site markers in cross-breeding of the sporeless strain. It also suggested that a map constructed for P. eryngii has adequate accuracy for marker-assisted selection. PMID:22210222

  3. Sequence-Modified Antibiotic Resistance Genes Provide Sustained Plasmid-Mediated Transgene Expression in Mammals.

    PubMed

    Lu, Jiamiao; Zhang, Feijie; Fire, Andrew Z; Kay, Mark A

    2017-03-30

    Conventional plasmid vectors are incapable of achieving sustained levels of transgene expression in vivo even in quiescent mammalian tissues because the transgene expression cassette is silenced. Transcriptional silencing results from the presence of the bacterial plasmid backbone or virtually any DNA sequence of >1 kb in length placed outside of the expression cassette. Here, we show that transcriptional silencing can be substantially forestalled by increasing the An/Tn sequence composition in the plasmid bacterial backbone. Increasing numbers of An/Tn sequences increased sustained transcription of both backbone sequences and adjacent expression cassettes. In order to recapitulate these expression profiles in compact and portable plasmid DNA backbones, we engineered the standard kanamycin or ampicillin antibiotic resistance genes, optimizing the number of An/Tn sequence without altering the encoded amino acids. The resulting vector backbones yield sustained transgene expression from mouse liver, providing generic DNA vectors capable of sustained transgene expression without additional genes or mammalian regulatory elements.

  4. Loss of monocyte chemoattractant protein-1 expression delays mammary tumorigenesis and reduces localized inflammation in the C3(1)/SV40Tag triple negative breast cancer model.

    PubMed

    Cranford, Taryn L; Velázquez, Kandy T; Enos, Reilly T; Bader, Jackie E; Carson, Meredith S; Chatzistamou, Ioulia; Nagarkatti, Mitzi; Murphy, E Angela

    2017-02-01

    Monocyte chemoattractant protein 1 (MCP-1) has been implicated as a major modulator in the progression of mammary tumorigenesis, largely due to its ability to recruit macrophages to the tumor microenvironment. Macrophages are key mediators in the connection between inflammation and cancer progression and have been shown to play an important role in tumorigenesis. Thus, MCP-1 may be a potential therapeutic target in inflammatory and difficult-to-treat cancers such as triple negative breast cancer (TNBC). We examined the effect of MCP-1 depletion on mammary tumorigenesis in a model of TNBC. Tumor measurements were conducted weekly (until 22 weeks of age) and at sacrifice (23 weeks of age) in female C3(1)/SV40Tag and C3(1)/SV40Tag MCP-1 deficient mice to determine tumor numbers and tumor volumes. Histopathological scoring was performed at 12 weeks of age and 23 weeks of age. Gene expression of macrophage markers and inflammatory mediators was measured in the mammary gland and tumor microenvironment at sacrifice. As expected, MCP-1 depletion resulted in decreased tumorigenesis, indicated by reduced primary tumor volume and multiplicity, and a delay in tumor progression represented by histopathological scoring (12 weeks of age). Deficiency in MCP-1 significantly downregulated expression of macrophage markers in the mammary gland (Mertk and CD64) and the tumor microenvironment (CD64), and also reduced expression of inflammatory cytokines in the mammary gland (TNFα and IL-1β) and the tumor microenvironment (IL-6). These data support the hypothesis that MCP-1 expression contributes to increased tumorigenesis in a model of TNBC via recruitment of macrophages and subsequent increase in inflammatory mediators.

  5. Modules for C-terminal epitope tagging of Tetrahymena genes

    PubMed Central

    Kataoka, Kensuke; Schoeberl, Ursula E.; Mochizuki, Kazufumi

    2010-01-01

    Although epitope tagging has been widely used for analyzing protein function in many organisms, there are few genetic tools for epitope tagging in Tetrahymena. In this study, we describe several C-terminal epitope tagging modules that can be used to express tagged proteins in Tetrahymena cells by both plasmid- and PCR-based strategies. PMID:20624430

  6. Shark Tagging Activities.

    ERIC Educational Resources Information Center

    Current: The Journal of Marine Education, 1998

    1998-01-01

    In this group activity, children learn about the purpose of tagging and how scientists tag a shark. Using a cut-out of a shark, students identify, measure, record data, read coordinates, and tag a shark. Includes introductory information about the purpose of tagging and the procedure, a data sheet showing original tagging data from Tampa Bay, and…

  7. Tags, micro-tags and tag editing: improving internet search

    NASA Astrophysics Data System (ADS)

    Rogowitz, Bernice E.; Topkara, Mercan

    2009-02-01

    Social tagging is an emerging methodology that allows individual users to assign semantic keywords to content on the web. Popular web services allow the community of users to search for content based on these user-defined tags. Tags are typically attached to a whole entity such as a web page (e.g., del.icio.us), a video (e.g., YouTube), a product description (e.g., Amazon) or a photograph (e.g., Flickr). However, finding specific information within a whole entity can be a difficult, time-intensive process. This is especially true for content such as video, where the information sought may be a small segment within a very long presentation. Moreover, the tags provided by a community of users may be incorrect, conflicting, or incomplete when used as search terms. In this paper we introduce a system that allows users to create "micro-tags," that is, semantic markers that are attached to subsets of information. These micro-tags give the user the ability to direct attention to specific subsets within a larger and more complex entity, and the set of micro-tags provides a more nuanced description of the full content. Also, when these micro-tags are used as search terms, there is no need to do a serial search of the content, since micro-tags draw attention to the semantic content of interest. This system also provides a mechanism that allows users in the community to edit and delete each others' tags, using the community to refine and improve tag quality. We will also report on empirical studies that demonstrate the value of micro-tagging and tag editing and explore the role micro-tags and tag editing will play in future applications.

  8. A Review of Recommendations for Sequencing Receptive and Expressive Language Instruction

    ERIC Educational Resources Information Center

    Petursdottir, Anna Ingeborg; Carr, James E.

    2011-01-01

    We review recommendations for sequencing instruction in receptive and expressive language objectives in early and intensive behavioral intervention (EIBI) programs. Several books recommend completing receptive protocols before introducing corresponding expressive protocols. However, this recommendation has little empirical support, and some…

  9. Assessment of the Fusion Tags on Increasing Soluble Production of the Active TEV Protease Variant and Other Target Proteins in E. coli.

    PubMed

    Yu, Xuelian; Sun, Jiaqi; Wang, Weiyu; Jiang, Li; Cheng, Beijiu; Fan, Jun

    2016-12-17

    In this study, five fusion tags affecting soluble production and cleavage activity of the tobacco etch virus (TEV) protease (TEVp) variant in Escherichia coli strains BL21 (DE3) and Rosetta™ (DE3) are investigated. Combination of the augmenting rare transfer RNAs (tRNAs) and the fused expressivity tag (N-terminal seven amino acid residues of E. coli translation initiation factor II) promotes the soluble TEVp partner expressed at relatively high level. Attachment of the maltose-binding protein (MBP) tag increases soluble expression of the protease released from the fusion protein in E. coli cells, but the incorporated TEVp recognition sequence slightly decreases expressivity of the fusion construct. Except for the green fluorescent protein, the attached expressivity tag shows less efficiency than the MBP tag in enhancing expression levels of the selected five target proteins in the Rosetta™ (DE3) cells under different induction conditions. Our results identified that high-level production of the functional target protein as the fusion partner in E. coli is combined with the intrinsic property of fusion tag, fusion protein stability, inherent folding of target protein, rare tRNA abundance, and the incorporated linker. Purified TEVp fusion constructs with the N-terminal expressivity tag, as well as the MBP partner, are the ideal alternatives for removing fusion tag.

  10. Sequences and expression of pyruvate dehydrogenase genes from Pseudomonas aeruginosa.

    PubMed Central

    Rae, J L; Cutfield, J F; Lamont, I L

    1997-01-01

    A mutant of Pseudomonas aeruginosa, OT2100, which appeared to be defective in the production of the fluorescent yellow-green siderophore pyoverdine had been isolated previously following transposon mutagenesis (T. R. Merriman and I. L. Lamont, Gene 126:17-23, 1993). DNA from either side of the transposon insertion site was cloned, and the sequence was determined. The mutated gene had strong identity with the dihydrolipoamide acetyltransferase (E2) components of pyruvate dehydrogenase (PDH) from other bacterial species. Enzyme assays revealed that the mutant was defective in the E2 subunit of PDH, preventing assembly of a functional complex. PDH activity in OT2100 cell extracts was restored when extract from an E1 mutant was added. On the basis of this evidence, OT2100 was identified as an aceB or E2 mutant. A second gene, aceA, which is likely to encode the E1 component of PDH, was identified upstream from aceB. Transcriptional analysis revealed that aceA and aceB are expressed as a 5-kb polycistronic transcript from a promoter upstream of aceA. An intergenic region of 146 bp was located between aceA and aceB, and a 2-kb aceB transcript that originated from a promoter in the intergenic region was identified. DNA fragments upstream of aceA and aceB were shown to have promoter activities in P. aeruginosa, although only the aceA promoter was active in Escherichia coli. It is likely that the apparent pyoverdine-deficient phenotype of mutant OT2100 is a consequence of acidification of the growth medium due to accumulation of pyruvic acid in the absence of functional PDH. PMID:9171401

  11. Fractional factorial approach combining 4 Escherichia coli strains, 3 culture media, 3 expression temperatures and 5 N-terminal fusion tags for screening the soluble expression of recombinant proteins.

    PubMed

    Noguère, Christophe; Larsson, Anna M; Guyot, Jean-Christophe; Bignon, Christophe

    2012-08-01

    Producing recombinant proteins in Escherichia coli (E. coli) is generally performed using a trial and error approach with the different expression variables being tested independently from each other. As a consequence, variable interactions are lost which makes the trial and error approach quite time-consuming. In this paper, we report how switching from a trial and error to a fractional factorial approach allows testing in less than 2 weeks four expression variables (E. coli strains, culture media, expression temperatures and N-terminal fusion tags) in a single experiment. The method, called "Fusion-InFFact", was validated using four test proteins. In all cases, Fusion-InFFact allowed finding conditions for expressing high yields of soluble proteins. The method was originally set-up for high throughput structural genomics programs, but can be used in any recombinant protein expression project.
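
    To give a sense of the design space involved, the sketch below enumerates the full factorial set of conditions for the four variables named above. The level names are placeholders, and the abstract does not state which fraction of these combinations Fusion-InFFact actually runs, so the code only shows the size of the space a fractional design samples from.

```python
from itertools import product

# Placeholder level names; the abstract gives only the number of levels.
strains = [f"strain_{i}" for i in range(1, 5)]        # 4 E. coli strains
media = [f"medium_{i}" for i in range(1, 4)]          # 3 culture media
temperatures = [f"temp_{i}" for i in range(1, 4)]     # 3 expression temperatures
tags = [f"tag_{i}" for i in range(1, 6)]              # 5 N-terminal fusion tags

# Full factorial design: every combination of one level per variable.
full_factorial = list(product(strains, media, temperatures, tags))
n_conditions = len(full_factorial)                    # 4 * 3 * 3 * 5
```

    Here n_conditions is 180 per test protein, which is the number of cultures a full factorial screen would require; a fractional factorial design tests a balanced subset of these combinations instead.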

  12. Expressed Sequence Reference Standards for Evaluating Stage-specific Gene Expression in Southern Green Lacewings, Chrysoperla rufilabris

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Five developmental stages of Chrysoperla rufilabris were tested using nine primer pairs. Three sequences were highly expressed at all life stages and six were differentially expressed. These primer pairs may be used as standards to quantitate functional gene expression associated with physiological ...

  13. Heparin-binding peptide as a novel affinity tag for purification of recombinant proteins.

    PubMed

    Morris, Jacqueline; Jayanthi, Srinivas; Langston, Rebekah; Daily, Anna; Kight, Alicia; McNabb, David S; Henry, Ralph; Kumar, Thallapuranam Krishnaswamy Suresh

    2016-10-01

    Purification of recombinant proteins constitutes a significant part of the downstream processing in biopharmaceutical industries. Major costs involved in the production of bio-therapeutics mainly depend on the number of purification steps used during the downstream process. Affinity chromatography is a widely used method for the purification of recombinant proteins expressed in different expression host platforms. Recombinant protein purification is achieved by fusing appropriate affinity tags to either N- or C- terminus of the target recombinant proteins. Currently available protein/peptide affinity tags have proved quite useful in the purification of recombinant proteins. However, these affinity tags suffer from specific limitations in their use under different conditions of purification. In this study, we have designed a novel 34-amino acid heparin-binding affinity tag (HB-tag) for the purification of recombinant proteins expressed in Escherichia coli (E. coli) cells. HB-tag fused recombinant proteins were overexpressed in E. coli in high yields. A one-step heparin-Sepharose-based affinity chromatography protocol was developed to purify HB-fused recombinant proteins to homogeneity using a simple sodium chloride step gradient elution. The HB-tag has also been shown to facilitate the purification of target recombinant proteins from their 8 M urea denatured state(s). The HB-tag has been demonstrated to be successfully released from the fusion protein by an appropriate protease treatment to obtain the recombinant target protein(s) in high yields. Results of the two-dimensional NMR spectroscopy experiments indicate that the purified recombinant target protein(s) exist in the native conformation. Polyclonal antibodies raised against the HB-peptide sequence, exhibited high binding specificity and sensitivity to the HB-fused recombinant proteins (∼10 ng) in different crude cell extracts obtained from diverse expression hosts. In our opinion, the HB-tag provides a

  14. Reverse serial analysis of gene expression (SAGE) characterization of orphan SAGE tags from human embryonic stem cells identifies the presence of novel transcripts and antisense transcription of key pluripotency genes.

    PubMed

    Richards, Mark; Tan, Siew-Peng; Chan, Woon-Khiong; Bongso, Ariff

    2006-05-01

    Serial analysis of gene expression (SAGE) is a powerful technique for the analysis of gene expression. A significant portion of SAGE tags, designated as orphan tags, however, cannot be reliably assigned to known transcripts. We used an improved reverse SAGE (rSAGE) strategy to convert human embryonic stem cell (hESC)-specific orphan SAGE tags into longer 3' cDNAs. We show that the systematic analysis of these 3' cDNAs permitted the discovery of hESC-specific novel transcripts and cis-natural antisense transcripts (cis-NATs) and improved the assignment of SAGE tags that resulted from splice variants, insertion/deletion, and single-nucleotide polymorphisms. More importantly, this is the first description of cis-NATs for several key pluripotency markers in hESCs and mouse embryonic stem cells, suggesting that the formation of short interfering RNA could be an important regulatory mechanism. A systematic large-scale analysis of the remaining orphan SAGE tags in the hESC SAGE libraries by rSAGE or other 3' cDNA extension strategies should unravel additional novel transcripts and cis-NATs that are specifically expressed in hESCs. Besides contributing to the complete catalog of human transcripts, many of them should prove to be a valuable resource for the elucidation of the molecular pathways involved in the self-renewal and lineage commitment of hESCs.

  15. Proteomic analysis of nipple aspirate fluid from women with early-stage breast cancer using isotope-coded affinity tags and tandem mass spectrometry reveals differential expression of vitamin D binding protein

    PubMed Central

    Pawlik, Timothy M; Hawke, David H; Liu, Yanna; Krishnamurthy, Savitri; Fritsche, Herbert; Hunt, Kelly K; Kuerer, Henry M

    2006-01-01

    Background Isotope-coded affinity tag (ICAT) tandem mass spectrometry (MS) allows for qualitative and quantitative analysis of paired protein samples. We sought to determine whether ICAT technology could quantify and identify differential expression of tumor-specific proteins in nipple aspirate fluid (NAF) from the tumor-bearing and contralateral disease-free breasts of patients with unilateral early-stage breast cancer. Methods Paired NAF samples from 18 women with stage I or II unilateral invasive breast carcinoma and 4 healthy volunteers were analyzed using ICAT labeling, sodium dodecyl sulfate-polyacrylamide gel (SDS-PAGE), liquid chromatography, and MS. Proteins were identified by sequence database analysis. Western blot analysis of NAF from an independent sample set from 12 women (8 with early-stage breast cancer and 4 healthy volunteers) was also performed. Results 353 peptides were identified from tandem mass spectra and matched to peptide sequences in the National Center for Biotechnology Information database. Equal numbers of peptides were up- versus down-regulated. Alpha2HS-glycoprotein [Heavy:Light (H:L) ratio 0.63] was underexpressed in NAF from tumor-bearing breasts, while lipophilin B (H:L ratio 1.42), beta-globin (H:L ratio 1.98), hemopexin (H:L ratio 1.73), and vitamin D-binding protein precursor (H:L ratio 1.82) were overexpressed. Western blot analysis of pooled samples of NAF from healthy volunteers versus NAF from women with breast cancer confirmed the overexpression of vitamin D-binding protein in tumor-bearing breasts. Conclusion ICAT tandem MS was able to identify and quantify differences in specific protein expression between NAF samples from tumor-bearing and disease-free breasts. Proteomic screening techniques using ICAT and NAF may be used to find markers for diagnosis of breast cancer. PMID:16542425
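
    The Heavy:Light ratios quoted above can be turned into direction calls and log2 fold changes with a short sketch. The ratios are taken from the abstract; the threshold of 1.0 and the variable names are assumptions made for illustration, and no statistical test is implied.

```python
import math

# H:L ratios as reported in the abstract.
hl_ratios = {
    "alpha2HS-glycoprotein": 0.63,
    "lipophilin B": 1.42,
    "beta-globin": 1.98,
    "hemopexin": 1.73,
    "vitamin D-binding protein precursor": 1.82,
}

# Assumed rule: ratio above 1 means overexpressed in the tumor-bearing breast,
# below 1 means underexpressed, following the wording of the abstract.
direction = {
    name: "over" if ratio > 1.0 else "under"
    for name, ratio in hl_ratios.items()
}

# log2 fold change puts up- and down-regulation on a symmetric scale.
log2_fold = {name: math.log2(ratio) for name, ratio in hl_ratios.items()}

overexpressed = sorted(name for name, call in direction.items() if call == "over")
```

    On the log2 scale, a ratio of 0.63 becomes a negative value and the four ratios above 1 become positive values, so the sign alone gives the direction of change.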

  16. Analysis of the Changes in Expression Levels of Sialic Acid on Influenza-Virus-Infected Cells Using Lectin-Tagged Polymeric Nanoparticles

    PubMed Central

    Cho, Jaebum; Miyake, Yukari; Honda, Ayae; Kushiro, Keiichiro; Takai, Madoka

    2016-01-01

    Viral infections affect millions around the world, sometimes leading to severe consequences or even epidemics. Understanding the molecular dynamics during viral infections would provide crucial information for preventing or stopping the progress of infections. However, the current methods often involve the disruption of the infected cells or expensive and time-consuming procedures. In this study, fluorescent polymeric nanoparticles were fabricated and used as bioimaging nanoprobes that can monitor the progression of influenza viral infection through the changes in the expression levels of sialic acids expressed on the cell membrane. The nanoparticles were composed of a biocompatible monomer to prevent non-specific interactions, a hydrophobic monomer to form the core, a fluorescent monomer, and a protein-binding monomer to conjugate lectin, which binds sialic acids. It was shown that these lectin-tagged nanoparticles that specifically target sialic acids could track the changes in the expression levels of sialic acids caused by influenza viral infections in human lung epithelial cells. There was a sudden drop in the levels of sialic acid at the initial onset of virus infection (t = 0~1 h) and at approximately 4~5 h post-infection. The latter drop correlated with the production of viral proteins that was confirmed using traditional techniques. Thus, the accuracy, the rapidity and the efficacy of the nanoprobes were demonstrated. Such molecular bioimaging tools, which allow easy-handling and in situ monitoring, would be useful to directly observe and decipher the viral infection mechanisms. PMID:27493646

  17. Overexpressing tagged proteins in plants using a modified gateway cloning strategy.

    PubMed

    Dubin, Manu J; Bowler, Chris; Benvenuto, Giovanna

    2010-03-01

    In recent years, sequence-specific recombination cloning methods such as the Gateway system have become increasingly popular for (over)expressing tagged proteins in high-throughput investigations in many different organisms, including plants. Because of their versatility and ease of use, these methods have gained favor in low- and medium-throughput investigations as well. However, due to the recombination step, the resulting fusion proteins contain long and often highly charged polylinker sequences that can interfere with their physiological function. Furthermore, in some cases the gene of interest must be cloned twice (once with and once without a stop codon) for N- and C-terminal tagging. Here, we present a hybrid combinatorial cloning strategy that overcomes many of these limitations. In the first step, the gene of interest is cloned into an entry vector containing standardized cloning sites with the desired N- or C-terminal tag and an optimized polylinker sequence. A Gateway recombination reaction is used to transfer the protein-tag fusion from the entry clone to a Gateway destination vector with the desired promoter and selectable marker for the organism of interest. As experimental requirements evolve, constructs for expressing the protein of interest with the desired tag, promoter, and selectable marker or other features can rapidly and easily be created.

  18. Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags

    PubMed Central

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V.; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-01-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary processes in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and a plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666
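
    The abstract above refers to relative synonymous codon usage (RSCU). As a rough illustration, RSCU for a codon is commonly computed as its observed count divided by the mean count of all codons encoding the same amino acid. The sketch below applies that definition to a toy sequence for the alanine and glycine families only (the families containing GCT and GGT); the sequence and names are made up for illustration and are not data from the study.

```python
from collections import Counter

# Synonymous codon families for two amino acids (standard genetic code).
# A full analysis would cover every degenerate amino acid.
SYNONYMOUS = {
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU per codon: observed count / mean count within its synonymous family."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values: dict[str, float] = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent; RSCU undefined
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Hypothetical in-frame toy sequence: GCT GCT GCA GCC GGT GGT GGT GGA
toy_cds = "GCTGCTGCAGCCGGTGGTGGTGGA"
toy_rscu = rscu(toy_cds)
print(toy_rscu)
```

    Values above 1 mark codons used more often than expected under equal usage of synonyms, which is the sense in which a codon is called preferred.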

  19. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags.

    PubMed

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-04-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary processes in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and a plausible role of the stress with GC3 and coevolution pattern of amino acid.

  20. Comparison of quantitative trait loci for adaptive traits between oak and chestnut based on an expressed sequence tag consensus map.

    PubMed

    Casasoli, Manuela; Derory, Jeremy; Morera-Dutrey, Caroline; Brendel, Oliver; Porth, Ilga; Guehl, Jean-Marc; Villani, Fiorella; Kremer, Antoine

    2006-01-01

    A comparative genetic and QTL mapping was performed between Quercus robur L. and Castanea sativa Mill., two major forest tree species belonging to the Fagaceae family. Oak EST-derived markers (STSs) were used to align the 12 linkage groups of the two species. Fifty-one and 45 STSs were mapped in oak and chestnut, respectively. These STSs, added to SSR markers previously mapped in both species, provided a total number of 55 orthologous molecular markers for comparative mapping within the Fagaceae family. Homeologous genomic regions identified between oak and chestnut allowed us to compare QTL positions for three important adaptive traits. Colocation of the QTL controlling the timing of bud burst was significant between the two species. However, conservation of QTL for height growth was not supported by statistical tests. No QTL for carbon isotope discrimination was conserved between the two species. Putative candidate genes for bud burst can be identified on the basis of colocations between EST-derived markers and QTL.

  1. Comparison of Quantitative Trait Loci for Adaptive Traits Between Oak and Chestnut Based on an Expressed Sequence Tag Consensus Map

    PubMed Central

    Casasoli, Manuela; Derory, Jeremy; Morera-Dutrey, Caroline; Brendel, Oliver; Porth, Ilga; Guehl, Jean-Marc; Villani, Fiorella; Kremer, Antoine

    2006-01-01

    A comparative genetic and QTL mapping was performed between Quercus robur L. and Castanea sativa Mill., two major forest tree species belonging to the Fagaceae family. Oak EST-derived markers (STSs) were used to align the 12 linkage groups of the two species. Fifty-one and 45 STSs were mapped in oak and chestnut, respectively. These STSs, added to SSR markers previously mapped in both species, provided a total number of 55 orthologous molecular markers for comparative mapping within the Fagaceae family. Homeologous genomic regions identified between oak and chestnut allowed us to compare QTL positions for three important adaptive traits. Colocation of the QTL controlling the timing of bud burst was significant between the two species. However, conservation of QTL for height growth was not supported by statistical tests. No QTL for carbon isotope discrimination was conserved between the two species. Putative candidate genes for bud burst can be identified on the basis of colocations between EST-derived markers and QTL. PMID:16204213

  2. Identification, sequencing, and expression of Mycobacterium leprae superoxide dismutase, a major antigen.

    PubMed Central

    Thangaraj, H S; Lamb, F I; Davis, E O; Jenner, P J; Jeyakumar, L H; Colston, M J

    1990-01-01

    The gene encoding a major 28-kilodalton antigen of Mycobacterium leprae has now been sequenced and identified as the enzyme superoxide dismutase (SOD) on the basis of the high degree of homology with known SOD sequences. The deduced amino acid sequence shows 67% homology with a human manganese-utilizing SOD and 55% homology with the Escherichia coli manganese-utilizing enzyme. The gene is not expressed from its own promoter in E. coli but is expressed from its own promoter in Mycobacterium smegmatis. The amino acid sequences of epitopes recognized by monoclonal antibodies against the 28-kilodalton antigen have been determined. PMID:1692812

  3. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951

  4. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-12-22

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive and hence functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  5. An inducible expression system of histidine-tagged proteins in Streptomyces lividans for one-step purification by Ni2+ affinity chromatography.

    PubMed

    Enguita, F J; de la Fuente, J L; Martín, J F; Liras, P

    1996-04-01

    An expression and purification cassette containing the aminoglycoside phosphotransferase gene (aph) as selective marker has been constructed in the Escherichia coli vector pULHis2. DNA fragments inserted in the cassette can be easily subcloned in pIJ699 to give vectors for overexpression of genes in Streptomyces and purification of proteins by a one-step procedure. The expression system uses the thiostrepton-inducible promoter tipA for expression and a six histidine coding nucleotide sequence that is fused in frame to the foreign gene inserted in the polylinker. The pULHis2-derived expression vector has been used satisfactorily to express and to purify the P7 and P8 proteins of Nocardia lactamdurans which carry out the methoxylation of cephalosporin C to 7-methoxycephalosporin C.

  6. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments

    PubMed Central

    2011-01-01

    Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations. PMID:21356093

  7. Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation

    PubMed Central

    Armstead, Ian; Huang, Lin; King, Julie; Ougham, Helen; Thomas, Howard; King, Ian

    2007-01-01

    Background Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding sequences. Results A comparison of the frequency of sequence alignments, determined by MegaBLAST, between rice coding sequences in TIGR pseudomolecules and annotations vs 4.0 and comprehensive transcript-assembly and methylation-filtered databases from Lolium perenne (ryegrass), Zea mays (maize), Hordeum vulgare (barley), Glycine max (soybean) and Arabidopsis thaliana (thale cress) was undertaken. Each rice pseudomolecule was divided into 10 segments, each containing 10% of the functionally annotated, expressed genes. This indicated a correlation between relative segment position in the rice genome and numbers of alignments with all the queried monocot and dicot plant databases. Colour-coded moving windows of 100 functionally annotated, expressed genes along each pseudomolecule were used to generate 'heat-maps'. These revealed consistent intra- and inter-pseudomolecule variation in the relative concentrations of significant alignments with the tested plant databases. Analysis of the annotations and derived putative expression patterns of rice genes from 'hot-spots' and 'cold-spots' within the heat maps indicated possible functional differences. A similar comparison relating to ancestral duplications of the rice genome indicated that duplications were often associated with 'hot-spots'. Conclusion Physical positions of expressed genes in the rice genome are correlated with the degree of conservation of similar sequences in the transcriptomes of other plant species. This relative conservation is associated with the distribution of different sized gene families and segmentally duplicated loci and may have functional and evolutionary implications. PMID:17708759
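
    The moving-window 'heat-map' idea described above can be sketched as follows: walk along the genes of one pseudomolecule in physical order and, for every window of consecutive genes, record the fraction that have a significant alignment. The abstract states a window of 100 genes; the step size of one gene, the boolean encoding of alignments, and all names below are assumptions made for illustration only.

```python
def moving_window_fraction(hits: list[bool], window: int = 100) -> list[float]:
    """Fraction of genes with a significant alignment in each window of consecutive genes.

    `hits` holds one flag per gene, in physical order along the pseudomolecule.
    The window advances one gene at a time (an assumption; the step is not stated).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(hits) < window:
        return []
    count = sum(hits[:window])
    fractions = [count / window]
    for i in range(window, len(hits)):
        # Slide by one gene: add the entering gene, drop the leaving gene.
        count += hits[i] - hits[i - window]
        fractions.append(count / window)
    return fractions


# Toy data with a window of 4 genes instead of 100, to keep the example readable.
toy_hits = [True, True, False, False, True, True, True, False]
toy_fractions = moving_window_fraction(toy_hits, window=4)
print(toy_fractions)
```

    Windows with high fractions would correspond to the 'hot-spots' and low fractions to the 'cold-spots' mentioned in the abstract; any threshold separating the two would have to come from the original study.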

  8. Human DNA polymerase alpha gene: sequences controlling expression in cycling and serum-stimulated cells.

    PubMed Central

    Pearson, B E; Nasheuer, H P; Wang, T S

    1991-01-01

    We have investigated the DNA polymerase alpha promoter sequence requirements for the expression of a heterologous gene in actively cycling cells and following serum addition to serum-deprived cells. An 11.4-kb genomic clone that spans the 5' end of this gene and includes 1.62 kb of sequence upstream from the translation start site was isolated. The transcription start site was mapped at 46 +/- 1 nucleotides upstream from the translation start site. The upstream sequence is GC rich and lacks a TATA sequence but has a CCAAT sequence on the opposite strand. Analysis of a set of deletion constructs in transient transfection assays demonstrated that efficient expression of the reporter in cycling cells requires 248 bp of sequence upstream from the cap site. Clustered within these 248 nucleotides are sequences similar to consensus sequences for Sp1-, Ap1-, Ap2-, and E2F-binding sites. The CCAAT sequence and the potential E2F- and Ap1-binding sites are shown to be protected from DNase I digestion by partially purified nuclear proteins. The DNA polymerase alpha promoter can confer upon the reporter an appropriate, late response to serum addition. No single sequence element could be shown to confer serum inducibility. Rather, multiple sequence elements appear to mediate the full serum response. PMID:2005899

  9. Expression Atlas update—a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments

    PubMed Central

    Petryszak, Robert; Burdett, Tony; Fiorelli, Benedetto; Fonseca, Nuno A.; Gonzalez-Porta, Mar; Hastings, Emma; Huber, Wolfgang; Jupp, Simon; Keays, Maria; Kryvych, Nataliya; McMurry, Julie; Marioni, John C.; Malone, James; Megy, Karine; Rustici, Gabriella; Tang, Amy Y.; Taubert, Jan; Williams, Eleanor; Mannion, Oliver; Parkinson, Helen E.; Brazma, Alvis

    2014-01-01

    Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of ‘baseline’ expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful ‘contrasts’, i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets, as well as genes and a more powerful search interface designed to maximize the biological value provided to the user. PMID:24304889

  10. Single-step affinity and cost-effective purification of recombinant proteins using the Sepharose-binding lectin-tag from the mushroom Laetiporus sulphureus as fusion partner.

    PubMed

    Li, Xiao-Jing; Liu, Jin-Ling; Gao, Dong-Sheng; Wan, Wen-Yan; Yang, Xia; Li, Yong-Tao; Chang, Hong-Tao; Chen, Lu; Wang, Chuan-Qing; Zhao, Jun

    2016-03-01

    Previous research showed that a lectin from the mushroom Laetiporus sulphureus, designated LSL, bound to Sepharose and could be eluted by lactose. In this study, by taking advantage of the strong affinity of LSL-tag for Sepharose, we developed a single-step purification method for LSL-tagged fusion proteins. We utilized unmodified Sepharose-4B as a specific adsorbent and 0.2 M lactose solution as an elution buffer. A fusion protein of LSL-tag and porcine circovirus capsid protein, designated LSL-Cap, was recovered with purity of 90 ± 4% and yield of 87 ± 3% from crude extract of recombinant Escherichia coli. To enable removal of the LSL-tag, a tobacco etch virus (TEV) protease recognition sequence was placed downstream of the LSL-tag in the expression vector, and LSL-tagged TEV protease, designated LSL-TEV, was also expressed in E. coli and was recovered with purity of 82 ± 5% and yield of 85 ± 2% from crude extract of recombinant E. coli. After digestion of LSL-tagged recombinant proteins with LSL-TEV, the LSL tag and LSL-TEV can be easily removed by passing the digested products through the Sepharose column. It is worth noting that the Sepharose can be reused after washing with PBS. The LSL affinity purification method enables rapid and inexpensive purification of LSL-tagged fusion proteins and scale-up production of native proteins.

  11. Subcellular localization and functional expression of the glycerol uptake protein 1 (GUP1) of Saccharomyces cerevisiae tagged with green fluorescent protein.

    PubMed

    Bleve, Gianluca; Zacheo, Giuseppe; Cappello, Maria Stella; Dellaglio, Franco; Grieco, Francesco

    2005-08-15

    GFP (green fluorescent protein) from Aequorea victoria was used as an in vivo reporter protein when fused to the N- and C-termini of the glycerol uptake protein 1 (Gup1p) of Saccharomyces cerevisiae. The subcellular localization and functional expression of biologically active Gup1-GFP chimaeras was monitored by confocal laser scanning and electron microscopy, thus supplying the first study of GUP1 dynamics in live yeast cells. The Gup1p tagged with GFP is a functional glycerol transporter localized at the plasma membrane and endoplasmic reticulum levels of induced cells. The factors involved in proper localization and turnover of Gup1p were revealed by expression of the Gup1p-GFP fusion protein in a set of strains bearing mutations in specific steps of the secretory and endocytic pathways. The chimaerical protein was targeted to the plasma membrane through a Sec6-dependent process; on treatment with glucose, it was endocytosed through END3 and targeted for degradation in the vacuole. Gup1p belongs to the list of yeast proteins rapidly down-regulated by changing the carbon source in the culture medium, in agreement with the concept that post-translational modifications triggered by glucose affect proteins of peripheral functions. The immunoelectron microscopy assays of cells expressing either Gup1-GFP or GFP-Gup1 fusions suggested the Gup1p membrane topology: the N-terminus lies in the periplasmic space, whereas its C-terminal tail has an intracellular location. An extra cytosolic location of the N-terminal tail is not generally predicted or determined in yeast membrane transporters.

  12. A REVIEW OF RECOMMENDATIONS FOR SEQUENCING RECEPTIVE AND EXPRESSIVE LANGUAGE INSTRUCTION

    PubMed Central

    Petursdottir, Anna Ingeborg; Carr, James E

    2011-01-01

    We review recommendations for sequencing instruction in receptive and expressive language objectives in early and intensive behavioral intervention (EIBI) programs. Several books recommend completing receptive protocols before introducing corresponding expressive protocols. However, this recommendation has little empirical support, and some evidence exists that the reverse sequence may be more efficient. Alternative recommendations include teaching receptive and expressive skills simultaneously (M. L. Sundberg & Partington, 1998) and building learning histories that lead to acquisition of receptive and expressive skills without direct instruction (Greer & Ross, 2008). Empirical support for these recommendations also is limited. Future research should assess the relative efficiency of receptive-before-expressive, expressive-before-receptive, and simultaneous training with children who have diagnoses of autism spectrum disorders. In addition, further evaluation is needed of the potential benefits of multiple-exemplar training and other variables that may influence the efficiency of receptive and expressive instruction. PMID:22219535

  13. A review of recommendations for sequencing receptive and expressive language instruction.

    PubMed

    Petursdottir, Anna Ingeborg; Carr, James E

    2011-01-01

    We review recommendations for sequencing instruction in receptive and expressive language objectives in early and intensive behavioral intervention (EIBI) programs. Several books recommend completing receptive protocols before introducing corresponding expressive protocols. However, this recommendation has little empirical support, and some evidence exists that the reverse sequence may be more efficient. Alternative recommendations include teaching receptive and expressive skills simultaneously (M. L. Sundberg & Partington, 1998) and building learning histories that lead to acquisition of receptive and expressive skills without direct instruction (Greer & Ross, 2008). Empirical support for these recommendations also is limited. Future research should assess the relative efficiency of receptive-before-expressive, expressive-before-receptive, and simultaneous training with children who have diagnoses of autism spectrum disorders. In addition, further evaluation is needed of the potential benefits of multiple-exemplar training and other variables that may influence the efficiency of receptive and expressive instruction.

  14. Promoting Tag Removal of a MBP-Fused Integral Membrane Protein by TEV Protease.

    PubMed

    Chen, Yanke; Li, Qichang; Yang, Jun; Xie, Hao

    2017-03-01

    Tag removal is a prerequisite for structural and functional analysis of affinity-purified membrane proteins. The present study took an MBP-fused membrane protein, MrpF, as a model to investigate tag removal by TEV protease. Influences of the linking sequence between the TEV cleavage site and MrpF on protein expression and predicted secondary structure were investigated. The steric accessibility of TEV protease to the cleavage site of MBP-fused MrpF was explored. It was found that reducing the size of the hydrophilic group of detergents and/or extending the linking sequence between the cleavage site and the target protein can significantly improve the accessibility of the cleavage site and promote tag removal by TEV protease.

  15. Gene gun bombardment-mediated expression and translocation of EGFP-tagged GLUT4 in skeletal muscle fibres in vivo.

    PubMed

    Lauritzen, Hans P M M; Reynet, Christine; Schjerling, Peter; Ralston, Evelyn; Thomas, Stephen; Galbo, Henrik; Ploug, Thorkil

    2002-09-01

    Cellular protein trafficking has been studied to date only in vitro or with techniques that are invasive and have a low time resolution. To establish a gentle method for analysis of glucose transporter-4 (GLUT4) trafficking in vivo in fully differentiated rat skeletal muscle fibres we combined the enhanced green fluorescent protein (EGFP) labelling technique with physical transfection methods in vivo: intramuscular plasmid injection or gene gun bombardment. During optimisation experiments with plasmid coding for the EGFP reporter alone EGFP-positive muscle fibres were counted after collagenase treatment of in vivo transfected flexor digitorum brevis (FDB) muscles. In contrast to gene gun bombardment, intramuscular injection produced EGFP expression in only a few fibres. Regardless of the transfection technique, EGFP expression was higher in muscles from 2-week-old rats than in those from 6-week-old rats and peaked around 1 week after transfection. The gene gun was used subsequently with a plasmid coding for EGFP linked to the C-terminus of GLUT4 (GLUT4-EGFP). Rats were anaesthetised 5 days after transfection and insulin given i.v. with or without accompanying electrical hindleg muscle stimulation. After stimulation, the hindlegs were fixed by perfusion. GLUT4-EGFP-positive FDB fibres were isolated and analysed by confocal microscopy. The intracellular distribution of GLUT4-EGFP under basal conditions as well as after translocation to the plasma membrane in response to insulin, contractions, or both, was in accordance with previous studies of endogenous GLUT4. Finally, GLUT4-EGFP trafficking in quadriceps muscle in vivo was studied using time-lapse microscopy analysis in anaesthetised mice and the first detailed time-lapse recordings of GLUT4-EGFP translocation in fully differentiated skeletal muscle in vivo were obtained.

  16. Myocardial Tagging With SSFP

    PubMed Central

    Herzka, Daniel A.; Guttman, Michael A.; McVeigh, Elliot R.

    2007-01-01

    This work presents the first implementation of myocardial tagging with refocused steady-state free precession (SSFP) and magnetization preparation. The combination of myocardial tagging (a noninvasive method for quantitative measurement of regional and global cardiac function) with the high tissue signal-to-noise ratio (SNR) obtained with SSFP is shown to yield improvements in terms of the myocardium–tag contrast-to-noise ratio (CNR) and tag persistence when compared to the current standard fast gradient-echo (FGRE) tagging protocol. Myocardium–tag CNR and tag persistence were studied using numerical simulations as well as phantom and human experiments. Both quantities were found to decrease with increasing imaging flip angle (α) due to an increased tag decay rate and a decrease in myocardial steady-state signal. However, higher α yielded better blood–myocardium contrast, indicating that optimal α is dependent on the application: higher α for better blood–myocardium boundary visualization, and lower α for better tag persistence. SSFP tagging provided the same myocardium–tag CNR as FGRE tagging when acquired at four times the bandwidth and better tag– and blood–myocardium CNRs than FGRE tagging when acquired at equal or twice the receiver bandwidth (RBW). The increased acquisition efficiency of SSFP allowed decreases in breath-hold duration, or increases in temporal resolution, as compared to FGRE. PMID:12541254

  17. Aggregating tags for column-free protein purification.

    PubMed

    Lin, Zhanglin; Zhao, Qing; Xing, Lei; Zhou, Bihong; Wang, Xu

    2015-12-01

    Protein purification remains a central need for biotechnology. In recent years, a class of aggregating tags has emerged, which offers a quick, cost-effective and column-free alternative for producing recombinant proteins (and also peptides) with yield and purity comparable to that of the popular His-tag. These column-free tags induce the formation of aggregates (during or after expression) when fused to a target protein or peptide, and upon separation from soluble impurities, the target protein or peptide is subsequently released via a cleavage site. In this review, we categorize these tags as follows: (i) tags that induce inactive protein aggregates in vivo; (ii) tags that induce active protein aggregates in vivo; and (iii) tags that induce soluble expression in vivo, but aggregates in vitro. The respective advantages and disadvantages of these tags are discussed and compared to the three conventional tags (His-tag, maltose-binding protein [MBP] tag, and intein-mediated purification with a chitin-binding tag [IMPACT-CN]). While this new class of aggregating tags is promising, more systematic tests are required to further their use. It is conceivable, however, that the combination of these tags and the more traditional columns may significantly reduce the costs for resins and columns, particularly at the industrial scale.

  18. ESTs from the fibre-bearing stem tissues of flax (Linum usitatissimum L.): expression analyses of sequences related to cell wall development.

    PubMed

    Day, A; Addi, M; Kim, W; David, H; Bert, F; Mesnage, P; Rolando, C; Chabbert, B; Neutelings, G; Hawkins, S

    2005-01-01

    In order to learn more about the diversity of genes expressed during flax fibre cell wall formation, expressed sequence tags (ESTs) were obtained from a cDNA library derived from the outer fibre-bearing tissues of flax (Linum usitatissimum) stems (cv Hermes) harvested at the mid-flowering stage. After elimination of vector and unreadable sequences, 927 ESTs were grouped into 67 clusters and 754 singletons. The flax ESTs have been submitted to the dbEST and GenBank databases with the accession numbers 25939634 - 25940560 (dbEST) and CV478070 - CV478996 (GenBank). Functional analysis allowed the grouping of ESTs into 13 functional categories and revealed that 62 % of ESTs were similar to known sequences, while 12.4 % of ESTs presented no similarity to any known sequences and 25.6 % of ESTs corresponded to proteins of unknown function. The most highly expressed transcripts belonged to four functional categories: protein maturation and metabolism (31 ESTs), signalling (22 ESTs), the cell wall (21 ESTs) and photosynthesis (19 ESTs). 4.4 % (41) of the total ESTs were potentially related to cell wall formation and maturation. The most highly expressed cell wall EST (15 ESTs) corresponded to a beta-xylosidase gene, potentially involved in cell wall remodelling during growth and development. Other cell wall-related ESTs corresponded to cellulose synthase, xyloglucan endotransglucosylase/hydrolase (XTH), beta-galactosidases, and peroxidases. The expression patterns of different cell wall-related ESTs were determined at different developmental stages in flax plants grown under different field conditions. The potential roles of gene products associated with cell wall-related ESTs in fibre cell wall development are discussed.
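
    The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the variable names (for example "unigenes" for clusters plus singletons) are labels chosen here and do not come from the abstract.

```python
# Counts and accession ranges quoted in the abstract above.
total_ests = 927
clusters, singletons = 67, 754
cell_wall_ests = 41
dbest_range = (25939634, 25940560)
genbank_range = (478070, 478996)  # numeric part of CV478070 - CV478996


def span(first, last):
    """Number of accessions in an inclusive range."""
    return last - first + 1


# Distinct sequences after clustering (label "unigenes" is an assumption).
unigenes = clusters + singletons
# ESTs that ended up inside the 67 clusters (each singleton is one EST).
clustered_ests = total_ests - singletons
# Share of all ESTs flagged as cell wall related, in percent.
cell_wall_pct = round(100 * cell_wall_ests / total_ests, 1)

print(unigenes, clustered_ests, cell_wall_pct)
print(span(*dbest_range), span(*genbank_range))
```

    Both accession ranges span 927 entries, matching the stated EST total, and 41 of 927 rounds to the reported 4.4 %.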

  19. The use of sequence-based SSR mining for the development of a vast collection of microsatellites in Aquilegia formosa

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Numerous microsatellite markers were developed for Aquilegia formosa from sequences deposited within the Expressed Sequence Tag (EST), Genomic Survey Sequence (GSS), and Nucleotide databases in NCBI. Microsatellites (SSRs) were identified and primers designed for 9 SSR containing sequences in the Nu...
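
    As a rough illustration of sequence-based SSR mining, the sketch below scans a DNA string for perfect tandem repeats with a regular expression. It is not the pipeline used in the record above, whose details are truncated here; the function name find_ssrs and the thresholds (motifs of 2 to 6 bases, at least 5 repeats) are assumptions chosen for the example, and primer design is not covered.

```python
import re


def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    Hypothetical helper: thresholds are illustrative defaults only.
    """
    seq = seq.upper()
    # Group 2 is the motif (shortest first); it must recur back to back.
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(2)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((m.start(), motif, len(m.group(1)) // len(motif)))
    return hits


# Toy sequence: (AG)6 and (CAT)5 embedded in short flanking bases.
example = "GGCT" + "AG" * 6 + "TTC" + "CAT" * 5 + "GA"
ssrs = find_ssrs(example)
print(ssrs)
```

    On the toy sequence this reports an AG repeat at offset 4 and a CAT repeat at offset 19. Real EST or GSS records would be read from FASTA files first, and imperfect or compound repeats would need a dedicated tool.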

  20. Expression and characterization of two secreted His6-tagged endo-beta-1,4-glucanases from the mollusc Ampullaria crossean in Pichia pastoris.

    PubMed

    Guo, Rui; Ding, Ming; Zhang, Siliang; Xu, Genjun; Zhao, Fukun

    2008-05-01

    Two endo-beta-1,4-glucanase cDNAs, eg27I and eg27II, from the mollusc Ampullaria crossean were expressed in Pichia pastoris cells. The secreted His6-tagged proteins were purified in a single chromatography step. The purified recombinant EG27I and EG27II showed enzymatic activity on carboxymethyl cellulose sodium salt at 15.31 U/mg and 12.40 U/mg, respectively. The optimum pH levels of the recombinant EG27I and EG27II were 5.5 and 5.5-6.0, respectively, and the optimum temperatures were 50 °C and 50-55 °C, respectively. The pH stability study revealed that both EG27I and EG27II showed their highest stability at pH 8.0. Analysis of their thermostability indicated that both EG27I and EG27II were relatively stable up to 40 °C. Site-directed mutagenesis of Asp43 and Asp153 of both EG27I and EG27II showed that the two Asp residues are critical for the enzymatic activity.

  1. Donor Tag Game

    MedlinePlus

    ... Games > Donor Tag Game Printable Version Donor Tag Game This feature requires version 6 or later of ... LGBTQ+ Donors Blood Donor Community Real Stories SleevesUp Games Facebook Avatars and Badges Banners eCards Enter your ...

  2. Proximity of AUG sequences to initiation codon in genomic 5' UTR regulates mammalian protein expression.

    PubMed

    Al-Ali, Ruslan; González-Sarmiento, Rogelio

    2016-12-15

    Protein expression can be controlled via AUG sequences located upstream of the initiation codon in the 5' untranslated region (5' UTR). Our study focused on the effect of the distance between the initiation codon and the first upstream AUG. An inhibitory effect on protein expression was established when an AUG exists in the 5' UTR, and this effect is increased when multiple AUG sequences occur there. The study was performed with ATG16L2, a non-lethal gene with no introns or upstream AUG sequence, to avoid any interference. New mutations were generated at different locations within the promoter region of the ATG16L2 gene and added to a plasmid construct containing a luciferase reporter gene. The results show a clear relationship between the distance of the novel AUGs from the initiation codon and protein expression. The inhibitory effect was even stronger when multiple AUG sequences were present in the 5' UTR.
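
    The quantity varied in this study, the distance between an upstream AUG and the initiation codon, can be computed from a sequence as sketched below. This is only an illustration: the function name upstream_atg_distances, the toy sequence, and the convention of measuring from the A of the upstream ATG to the A of the main ATG in a DNA-sense string are all assumptions, not details taken from the abstract.

```python
def upstream_atg_distances(seq, start_index):
    """Distances (in nucleotides) from each upstream ATG to the main ATG.

    seq is a DNA-sense string; start_index is the 0-based position of the
    A of the annotated initiation codon. Hypothetical helper for illustration.
    """
    seq = seq.upper()
    if seq[start_index:start_index + 3] != "ATG":
        raise ValueError("start_index does not point at an ATG")
    utr = seq[:start_index]  # 5' UTR portion only
    distances = []
    pos = utr.find("ATG")
    while pos != -1:
        distances.append(start_index - pos)
        pos = utr.find("ATG", pos + 1)  # allow overlapping matches
    return distances


# Toy construct: an 18-nt 5' UTR with two upstream ATGs, then a short ORF.
toy_utr = "GGATGCCCTTATGAAACC"
toy_seq = toy_utr + "ATGGCTTAA"
distances = upstream_atg_distances(toy_seq, len(toy_utr))
print(distances)
```

    For the toy construct the two upstream ATGs lie 16 and 8 nucleotides before the initiation codon. Reading frame and sequence context, which may also matter biologically, are ignored here.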

  3. Identification of sequences regulating the transcription of a Dictyostelium gene selectively expressed in prespore cells.

    PubMed Central

    Early, A E; Williams, J G

    1989-01-01

    There has been considerable debate about the relative contributions of transcriptional and post-transcriptional mechanisms to the regulation of prespore gene expression in Dictyostelium. We have determined the DNA sequence upstream of D19, the Dictyostelium gene encoding PsA, a prespore-specific, cell surface protein of unknown function. Our analysis of gene fusions, in which D19 upstream sequences are placed adjacent to a heterologous reporter gene, indicates that transcriptional signals alone are sufficient for the correct temporal and cell-type specific expression of this gene. We also show that the 5' and 3' boundaries of the minimal sequences necessary for correct developmental regulation lie within the region 338 to 122 nucleotides upstream of the start site of transcription but that flanking sequences seem to be necessary for optimal expression. PMID:2550894

  4. Cutaneous skin tag

    MedlinePlus

    Skin tag; Acrochordon; Fibroepithelial polyp ... have diabetes. They are thought to occur from skin rubbing against skin. ... The tag sticks out of the skin and may have a short, narrow stalk connecting it to the surface of the skin. Some skin tags are as long as ...

  5. Analysis of C. elegans muscle transcriptome using trans-splicing-based RNA tagging (SRT)